Lair Of The Multimedia Guru

2008-11-22

Pseudo random number generators

I guess most developers have once or more than once run into the question of, “which PRNG should one use”. The awnser, or more precissely my awnser to this, would be it really depends on for what one plans to use it.
There is no perfect generator, there are fast ones, and there are good ones, but the best arent the fastest. One has to choose depending on ones needs

  • If one wants to generate white noise for a human listener then the best generator simply is the fastest that produces perfectly sounding white noise, it really doesnt matter if it has some statistical defects or not.
  • If one wants to use the random numbers for cryptographic purposes, a perfect generator that has been extensively tested and no flaws found is needed.
  • If one wants to test a scientific theory by simulation, one needs a generator that doesnt cause a wrong result from the simulation, but as the correct result isnt known, one cant easily pick based on this criteria. Thus one has little alternative to picking a generator that has few statistical defects and is fast enough for the amount of numbers needed. Or better even run the simulation 2-3 times with very different generators

So which generators are there and what defects do they have? This actually is rather easy to awnser, or then maybe not ;) well, there is George Marsaglias Diehard and Pierre L’Ecuyers
TestU01 both contain code to test PRNGs, later also contains a paper describing the results of these tests for most recent and popular PRNGs.

In an ideal world one would just have to look at the TestU01 paper And pick a generator that has the amount of defects and speed one wants. At least thats what i thought before testing a few generators myself, more precissely i took a few of the generators that passed all tests in TestU01 and run 2 (to me obvious) tests against them

  1. Use a gaussian like algorithm to find out if any bits in the output are just a linear (mod 2) combination of previous output bits
  2. Use a gaussian like algorithm to find out if any scalars in the output are just a linear (mod maxoutput+1) combination of previous output scalars
  3. Use a gaussian like algorithm to find out if any bits (only considering the least significant of each scalar) in the output are just a linear (mod 2) combination of previous such output bits

The first 2 of these tests in the way i implemented them only consider outputs surrounding a scalar with 20 zero LSBs, though as far as i tested it this 20bit trick only made a difference for the additative LFG with only the 32MSB used, but i did not test all without this 20bit check

Failing either of these tests makes the PRNG at least unsuitable for linear algebra under the same modulo, so its certainly a statistical defect. All linear congruential generators have to fail at least the test that works in their own modulo, after all they are linear, but actually all i tested failed both tests. Lagged fibonacci generators based on addition or xor similarly must fail and do. All multiplicative Lagged fibonacci generators fail as well due to linear dependancies in their lower bits. All linear feedback shift register based generators like the mersene twister very obviously have to fail too, i did not test any of them though as none passed TestU01.

The actual generators that passed TestU01 and failed mine where:
superduper64, Marsa-lfib4, DX-47-3, MRGk5_93, Lfib(2^64,55,24,*), brent-xor4096s

The actual generators that passed TestU01 and passed mine where:
brent-xor4096s, MRG31k3p, CombMRG96, ran2, CLCG4, KISS99

Note1, there are more generators that passed TestU01 but i did not test them

Note2, The first 2 tests where only searching for linear dependancies within 512 consecutive scalars, that is 32kbit for a 64bit PRNG, the 3rd test was considering 32kbit of LSBs

In addition to that, limiting the output to just 32MSB of 64 made the multiplicative LFG 2^64,55,24,* pass mine while the same trick with a additative one was not helping, KISS99 also passed when SHR3 or CONG was droped but not when both or MWC alone was droped

No source code this time though because i randomly copied PRNGs from various places, but if there is interrest i could throw all but my own code out and post that

To be continued when iam less tired … ;)

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 01:56

10 Comments »

  1. This is certainly the case of “good is the enemy of good enough”. TAOCP vol 2 pays a lot of attention to the criteria of randomness but it boils down to “good enough for your task” too.

    And just a known fact – from a part of LFSR generator sequence you can restore its parameters (IIRC, Berlekamp provided an algorithms back in 60-70s), for LGC it’s even more obvious. Probably the best thing for you is too look at cryptographic generators. Even simple (and slow) Blum-Blum-Shub should be fine for you ;).

    Comment by Kostya — 2008-11-22 @ 07:25

  2. > And just a known fact – from a part of LFSR generator sequence you can restore its parameters
    > (IIRC, Berlekamp provided an algorithms back in 60-70s),

    yes, and berlekamp is faster than the gaussian elimination iam using but later is more flexible, as it also works for cases where the “predicted from” bits arent exactly the next “predicted” bits.

    > for LGC it’s even more obvious. Probably the best thing for you is too look at cryptographic
    > generators. Even simple (and slow) Blum-Blum-Shub should be fine for you ;).

    if i was searching for a PRNG for crypto i surely would not pick one based on TestU01 and some of my own tests ;)
    Anyway, another of them (brent-xor4096) failed another linear prediction test, more precissely a 32768×65536 matrix of bits (that being only 1 lsb from each scalar) just has a rank of 5110, while for random bits rank=32768 would be expected.

    Comment by Michael — 2008-11-22 @ 11:19

  3. > > And just a known fact – from a part of LFSR generator sequence you can restore its
    > > parameters (IIRC, Berlekamp provided an algorithms back in 60-70s),

    > yes, and berlekamp is faster than the gaussian elimination iam using but later is more
    > flexible as it also works for cases where the “predicted from” bits arent exactly the
    > next “predicted” bits.

    Btw, TestU01 according to the paper does use berlekamp-massey as one of their tests

    Comment by Michael — 2008-11-22 @ 11:24

  4. Can you classify and test this PRNG?

    Comment by Ribeiro Alvo — 2009-08-27 @ 10:52

  5. Hi. Can you point me to code for your tests, as well as a discussion of the statistical theory of the tests? I would like to understand them better.
    Thanks

    Comment by Paul Leopardi — 2010-01-05 @ 00:47

  6. Hi Paul.

    The Diehard battery of tests:
    http://stat.fsu.edu/pub/diehard/

    The ENT program:
    http://www.fourmilab.ch/random/

    About the discussion of the statistical theory of the tests, you can e-mail me: ribeiroalvo(at)sapo.pt

    Or if you prefer start a tread at sci.crypt forum.

    Thanks

    At this moment, the algorithm is in analisys by Peter Helekallek form the Plab project : http://random.mat.sbg.ac.at/

    Comment by Ribeiro Alvo — 2010-01-14 @ 02:08

  7. Did you ever publish the code of your 3 “gaussian-like” tests?
    Any test even more stringent than BigCrush is of great interest to me – and I daresay to many others as well.

    Comment by Johannes Baagoe — 2010-04-11 @ 18:38

  8. > Did you ever publish the code of your 3 “gaussian-like” tests?

    ive just commited what i found on my disk to
    svn://svn.mplayerhq.hu/michael/trunk/randi

    I dont know for certain though if this was exactly the code i used or the last version …

    Comment by Michael — 2010-04-14 @ 19:31

  9. Thanks a lot. I have checked it out.

    I intend to try to turn it into a more general tool that simply reads its standard input, which is assumed to be the output of any arbitrary PRNG. One just has to specify whether the input is bits or IEEE doubles in the range [0, 1[.

    That is what I use with TestU01, Diehard and others. I find it a lot handier than incorporating one generator after another into the code, compiling, etc.

    Do you see any problems with that approach?

    Comment by Johannes Baagoe — 2010-04-18 @ 01:51

  10. From a part of LFSR generator sequence you can restore its parameters (IIRC, Berlekamp provided an algorithms back in 60-70s), for LGC it’s even more obvious. Thebest thing for you is too look at cryptographic generators.

    Comment by Clarence Reardon — 2011-03-26 @ 22:34

RSS feed for comments on this post.

Leave a comment

Powered by WordPress