ID:1352911
 
Resolved
The random number generator used by games has been upgraded to a Mersenne Twister. Old .dmb files will still use the old PRNG, in case they depended on rand_seed() to produce predictable values.
Applies to:Dream Daemon
Status: Resolved (500.1205)

This issue has been resolved.
The current RNG is EXTREMELY dodgy, very biased towards lower numbers, groups them and its generally [not that good]

Changing it to something like a marsenne twister algorithm should at least improve it slightly, it cant get much worse
Personally, I'd like to see the ability to initialize more than one RNG at a time. Having multiple players' events operating off of one seed can violate the reliability of even the most statistically accurate of pseudo-random number generators.
Could you elaborate? I've not looked into it very closely, but I've never seen any issues with the RNG.



Doesn't show any grouping that I can see, for example. Code used to generate that:

turf
icon = 'test.dmi'
icon_state = "blank"

proc/DrawPixAt(x, y)
var/tx = x / 32 + 1
var/ty = y / 32 + 1

var/ix = x % 32 + 1
var/iy = y % 32 + 1

var/turf/t = locate(tx, ty, 1)
var/icon/i = icon(t.icon)
i.DrawBox(rgb(0, 0, 0), ix, iy)
t.icon = i

mob/verb/test()
for (var/i = 0 to 1000) DrawPixAt(rand(0, 319), rand(0, 319))


where 'test.dmi' has an icon state "blank" that's just white.
Heres just the output of prob(50) (a for true, b for false)
b
a
b
b
b
a
a
b
b
a
a
a
b
b
a
b
b
a
b
a
b
b
a
a
a
a
a
a
a
a
a
a
b
a


10 in a row has 0.19% chance of happening, by the way, but some patterns stand out pretty well
Flipping a coin 10 times and getting 10 of the same is 1/2^9, yes, but flipping a coin 34 times and getting 10-in-a-row is much more probable.

Consider flipping a coin 11 times and getting 10 in a row.

You can flip [a*10]a, [a*10]b, [b*10]a, [b*10]b, a[b*10], b[a*10] to get a hit. That's 6 possible matches. There are 2^11 different ways of flipping 11 coins, so 6/2^11 = 0.29%, 1.5 times as likely.

Consider 12 times and getting 10 in a row.

[a*10]*4 (aa, ab, ba, bb). [b*10]*4. a[b*10]*2. b[a*10]*2. aa[b*10]. ab[a*10]. ba[b*10]. bb[a*10]. 16 possibilities. 16 / 2^12 = 1 / 2^8 = 0.390625%, 2 times as likely.

I don't think your data there is a smoking bullet disproving randomness. Just because a sequence doesn't look 'random' to eye doesn't mean it's nonrandom. Do the statistics, please.

EDIT: Some code to check the distribution of prob(50):
mob/verb/test()
var/tot = 0
var/samples = 10000
for(var/i = 1 to samples) if (prob(50)) ++tot

var/p = tot / samples
var/z = 3 //99.7%
var/e = z / (2 * sqrt(samples))
world << "[p - e] | [p + e] [(p - e > 0.5 || p + e < 0.5)?"BAD":"OKAY"]"


Z-parameter is number of standard deviations. Obviously doesn't check for patterned-ness, but it does tell you whether or not it's 50/50. It is.

EDITEDIT: And some similar code to check distributions of 'ab', 'ba', 'aa', 'bb' had 0.25 inside the confidence interval at 99.7% too.
Notice how almost everything is grouped like 'bb' or 'aa', singles occur rarely, I ran that a bit further and kept getting the same pattern

Also you'll notice some strange things, like in SS13, wizard rounds almost always come in pairs or triplets, its the first use of the RNG of the round, and it should be seeded with time, it only seems to come outside pairs if people happen to press the randomize character button at the lobby, I'm gonna try to figure out what the hell is up with that
I'm not a statistican, I cant write up magical code that proves it wrong, but pretty much anyone should tell you its dodgy, considering the current is the libc rand function anyways (I think), the reason its there is because simple statistics say its good, but looking at the output you will always see horrible amounts of patterns that shouldent be happening
Seriously, ask any SS13 dev, the game relies a lot on the RNG, and they have to put up with that thing a lot, I think there was even some code for a while to make the traitor selection fair because someone figured out how to get picked nearly every single time by just ending up in the right place in the list they were pick()'ed from
Is the issue on windows or linux or both? They use different rngs. I would think they'd both have long enough periods for games but maybe we have a bug. It'd be easy to switch although I suspect people would still complain.
Possibly, it might be a linux only thing though, which would explain why I keep getting pretty good results on windows
I can name one of the pseudo-random number generators that is relatively known for missing numbers. The original 'rand' function from the 'C' standard libraries was known for missing several numbers due to the algorithm it used. (Not sure if it applies to all implementations of it though.)
While the generator itself might not be bad, I've always felt that seeding the generator should have more of a random effect. Maybe I'm misunderstanding how it works though.
This:
mob
verb
CreateIcon()
var/icon/I = new('test.dmi')
for(var/i in 1 to 320)
rand_seed(i)
var/height = 100*rand()
I.DrawBox(rgb(255,0,0), i, 1, i, height)
usr << ftp(I)
src.icon = I


Creates this:


Even though the seeds are iterated like that, I still thought (wish) they would generate a more random pattern.
(test.dmi is just a 320x100 icon with a blank icon state)
Seeding itself is going to be at the mercy of the overall algorithm when it comes to results.

Changing the seed and only using the first value is going to give you very regular results because you are using the same algorithm to acheive those results.
I wrote my own (limited) LCRNG a long time ago (for a project long since abandoned...) which didn't show nearly that level of regularity. It was a pain in the ass to find primes that would work even reasonable well, and the period was pretty limited. I'm pretty sure I was using it for random terrain generation, seeding in some combination of x and y turf coordinates. Looked a heck of a lot nicer than that though ;)
I'm a little confused as to why the seed is reinitialized each time here. What happens if you remove the rand_seed() line? Do we get a pretty random result?

That said, I am surprised the seed has such a visible effect on the first result.
I was trying to dynamically generate terrain, and be able to recreate it as necessary. Was years ago, and I'm not sure exactly what I was doing, but I was plugging some combination of an x,y coordinate into rand_seed, popping out a random number, and using that to set a turfs terrain type.

I've been using a mixture of Perlin/Simplex noise now. Though come to think of it, I think I wrote an LCG for that as well before going with a random vector matrix.

And yeah, it looks pretty random if you take that out.
Ok, well it should be simple enough to switch to something better, with the advantage that we can use the same algorithm for all platforms. Right now windows uses rand() (which is known to have some periodic issues) and unix uses rand48() (I think better). This was all from years ago and there are better implementations now. I'm not sure why the first number off the seed is so predictable.. that seems odd to me and maybe an issue with our implementation.
Again, I'd recommend a marsenne twister, its pretty easy to implement, but can have bad results if improperly initialized
http://en.wikipedia.org/wiki/Mersenne_twister#Pseudocode
I'm good with anything, but I'd really, really like to see the ability to seed multiple generators and store them in an object for later use.
If you do reimplement the RNG instead of using CRT-supplied RNG, I second the Mersenne Twister as a suggestion. A mechanism for keeping separate instances of the RNG with different seeds would be good too.

My impression of rand48 vs rand is that rand48() provides a /specific/ linear congruential generator, whereas rand() is implementation-defined - basically if you call rand48() you know what you're getting.

Notice how almost everything is grouped like 'bb' or 'aa', singles occur rarely, I ran that a bit further and kept getting the same pattern

I did the statistical test for pairs, and 0.25 was inside the range with confidence 99.7%. That doesn't tell you whether or not they appear in a patterned way (aaabbabbaaabbabb to infinity would pass that test, for example), but it does mean that any particular pair doesn't appear to be overrepresented. This is on Windows 7, fwiw.

The rand_seed behaviour doesn't surprise me too much. You'll get similar output from rand48(), or any other LCG. The values of a*x + c mod m from x = 0 to whatever are going to be patterned. See http://en.wikipedia.org/wiki/Linear_congruential_generator
How about giving C++11's RNG a try? Though I haven't tried it myself, it provides the Mersenne Twister algorithm (though to gain standard support for it, you may need to find which IDEs/Compilers support the algorithm)..
Page: 1 2