Lair Of The Multimedia Guru

2024-03-17

Correlations in randomly initialized PCG-DXSM

In the previous blog entry I showed many bad correlations in PCG-DXSM, but the cases had always at least 2 values non random either a seed and an increment or 2 increments. And even though one could create more than a septillion of these for each seed/increment. Thats not how PCG-DXSM should be used.

So the natural question is, can this happen if PCG-DXSM is used how its supposed to, that is initialized with strongly random seed+increment.

The short awnser is, given the right circumstances, yes.
The following text is structured like this:

Some basic and known concepts about LCGs to better illustrate their structure
Summary of what we currently know about the correlations
Some actual tests and software to quantify these specific correlations
A hypothetical example showing that this can be a real issue under the right circumstances
A workaround
A Solution
Some final words
More issues

Some basic and known concepts about LCGs to better illustrate their Structure

In the algebraic structure of LCGs with identical multipliers and operations mod 2^m
LCG(v₀, increment) can be seen as a vector whos elements are defined by v_i = v_i-1 * multiplier + increment (mod 2^m) we can add 2 vectors and thus LCGs. LCG(seed1, increment1) + LCG(seed2, increment2) = LCG(seed2+seed1, increment2+increment1) from this we can immediately see that the associative and kommutative law holds and that there is a inverse and identity element.
Thus LCGs with identical multipliers form an abelian group. The identity element is LCG(0,0)_i = 0. For each constant vector v_i = C theres a corresponding LCG(C, C*(1 – multiplier)) the increment of such “bad” LCGs is even when the multiplier is odd. It is thus common to disallow even increments. None the less even when they are disallowed in an implementation, they do not disappear from the algebraic structure. Thats because if one allows only odd numbers
the difference between 2 odd numbers is still even. If we looked at LCGs as polynomials instead of vectors then we could furthermore introduce a multiplication by x mod x^{2^M}-1 to represent a cyclic shift of all the elements of one vector. This would allow even more powerful tools to be used. But we don’t do this currently in this text. What we now have is basically a very small set of “primitive” LCGs that can then be offseted by a constant
LCG(C,I + C*(1 – multiplier)) = LCG(0,I) + LCG(C, C*(1 – multiplier)) the number of these primitive LCGs depend on the number of factors of 2 in 1 – multiplier. for PCG-DXSM we have 2 factors of 2 but odd increments are disallowed so we have 2 such primitive vectors which is all the diversity that exists in this LCG, all other vectors are just offseted by a constant from them. Looking at this you might ask if there are 2 LCG outputs that have a 128bit constant added how do these 2¹²⁹ cases fit in a 127bit “odd” increment. They do because 4 live as cyclic shifts in each vector.

Summary of what we currently know about the correlations

In the previous section we have shown that we can offset the LCG output by a constant and that all LCG outputs basically are offseted from one of 2 sequences by a constant. Or if one considered the 4 cyclic shifts too then really only half would be unique, its a matter of viewpoint.

Offsets in the low half

In DXSM the LCG output is split in 2 64bit words the low one or-ed by 1 so its always odd and the high one passed through xorshift, multiply, xorshift
before being multiplied with the low.
If we look at 2 LCGs that are offseted by D, so that only the low half is offsetted but the high half stays the same. This can be represented (in C style notation) as:
(L[i] | 1)*H[i] vs. ((L[i] + D) | 1)*H[i]
both (L[i] | 1) and ((L[i] + D) | 1) are odd as they have explicitly an 1 or-ed in, this can be written also as
(2X[i] + 1) * H[i] vs. (2Y[i] + 1) * H[i]
their difference
(2X[i] + 1)*H[i] – (2Y[i] + 1)*H[i] = (2X[i] + 1 – 2Y[i] – 1)*H[i] = 2(X[i] – Y[i])*H[i]
as can be seen their difference is even thus their lowest bit is correlated, that is not to say higher bits are not
Here we now have established that instead of 2¹²⁷ uncorrelated PCG-DXSM streams there are at best 2⁶⁴

Collisions in the first xor

The first xor mixes the high 32bit of the high half into the low 32bit of the high half, next the high half is multiplied by the constant multiplier and the high quarter xored into the low quarter before the final multiplication with the low half of the LCG. It is important to recognize that if, after the first XOR the low 49bits of the high side of the LCG match between 2 generators then we will have identical low bits after DXSM. Consider
MSB [15bit random][17bit 0][15bit random][17bit 0][64 bit random] LSB
The first xor stage will effectively do nothing to this. The correlation after that then depends on the low 15 random bits which is the only difference there is between the two LCGs which can affect the LSB from the high half. While the low half is orred by 1 so it is bascially a constant. If this on its own doesnt already fail statistical tests. Then this can be made more likely by observing that if we misalign 2 LCGs by a multiple of 2^128-b the only affected output will be in the highest b bits. So we can introduce entropy in the highest bits at will to tweak the high quarter and bring the 2 into a bad correlated state. The high quarter itself is outside the 49bits affecting the LSB output other than by the xor. It is also observed that increasing the amount of random bits beyond 15 does not fix this issue. It just requires more effort to find correlations. For example using 19bits, 65536 streams, 1024 stream alignments and 4 billion samples per stream, we still find statistical anomalies.

Collisions in the 2nd xor

It is possible to introduce a delta in the low half of the high half of the LCG, this will be unaffected by the first xor and then multiplied by the multiplier, this operation is algebraically well understood and it is possible to introduce deltas here which tend to produce poor statistics after the 2nd XOR on the low side. Such a case was also demonstrated toward teh end of my last blog entry, but no attempt was made to exploit this in the software here

Some actual tests and software to quantify these specific correlations

Combining the 2 cases from above with 19bits instead of 15 we can take 16k random increment,seed pairs and find a pair of vectors that are correlated if they are aligned. I cannot show correlation directly on fully random values due to myself only having my home computer for this. Also I do take 64k instead so we have over a dozend pairs to work with. At this point we align one of the seeds to the other, then shift it by a random 2^128-19 multiple and check the correlation for 1024 shifts. For each shift we generate a stream of 2³² values. Each tested stream will have its least significant bits interleaved or xored and the occurrence of each even 2-18 bit pattern checked based against the binomial distribution. Both the xor and interleaving sometimes catch separate cases so either alone could be used for slightly fewer cases found. Also for XOR mainly 6 and 8 bits matter while for interleaving the larger ones catch things. In the test run we see a pair that showed correlations at 15 different alignments of the 1024 tested. Given these 1024 represent a random 1/512 of the searchspace we consider. We expect ~8000 in the whole space for that pair. Other pairs had similar numbers so theres no reason to believe this was a outlier. Combining all this there seems to be a maximum of ~29 bits of diversity in the increments and about 115 bits in the seed. We can also see that many of the found correlations have multiple independent patters that are failing tests
Source code is here

Stage 1, filling set with random increments
set size:65536, stream_length:4294967296, top bits:19
Stage 2, Sorting streams
Stage 3, Comparing streams
pairs to consider: 37
checking pair with DELTA: 0xE5B05FFFA33E1FFF ED6E8B7EA831BA19 si0: 0x4B1D7EBE754531A6 03D07FC29934A58B,0xEC1A098AE7F3EE1C F0C3F7E28050E655 si1: 0x656D1EBED20711A6 1661F443F102EB72,0xCFC64EE4226C353F 911E2DDFC6D557E9
clean table 32 18
fill table 32 18
clean table 32 16
fill table 32 16
clean table 32 14
fill table 32 14
clean table 32 12
fill table 32 12
clean table 32 10
fill table 32 10
clean table 32 8
fill table 32 8
clean table 32 6
fill table 32 6
clean table 32 4
fill table 32 4
clean table 32 2
fill table 32 2
 3B8:  DELTA: 0xE3DF5FFFA33E1FFF ED6E8B7EA831BA19 -6    6(  67044071)0.000000008
checking pair with DELTA: 0x2FC51FFF99D95FFF B54A758B34402CFA si0: 0x6AF7AD68373F1ECB 388F5B99D012EDC8,0x3E41C67901EE75A1 DDAC6EB90EFCDF53 si1: 0x3B328D689D65BECB 8344E60E9BD2C0CE,0xD540FAF60FD83D88 C4D70A7957646F1B
checking pair with DELTA: 0x069F600086C96000 0EB818571D7DCAD0 si0: 0x75FED5AD724E4D47 09756BA7A9BB3730,0x11E1922B5FBB1E8E 4F7311398EE8460F si1: 0x6F5F75ACEB84ED46 FABD53508C3D6C60,0x9AAEFF3A5B76A3D6 3EC2E876DEA2604F
checking pair with DELTA: 0xA6DBFFFFA9631FFF ECCA24CFB00F1167 si0: 0x642B9E142D73D47D 758CA1A36D0AED70,0xEA0B9F70E0FE5C84 59DF4DCDE2F4727F si1: 0xBD4F9E148410B47D 88C27CD3BCFBDC09,0x893A4A4B113A5A3C 205C596A9D7316EB
checking pair with DELTA: 0xA4107FFF2C3EC000 26C54851149E853A si0: 0xE76E9BF9B19A825A 6A986886330B2129,0x1090288CC2863F81 D609776DF7909853 si1: 0x435E1BFA855BC25A 43D320351E6C9BEF,0xCD67E2DED5A42A3C 1905C3FB9FE4351B
  AE:  DELTA: 0xEB3EBFFF2C3EC000 26C54851149E853A 12  310(   1057180)0.000000015  12  290(   1056853)0.000000211  12  f10(   1056627)0.000001252  12  1a0(   1040569)0.000001519  12  6e0(   1056277)0.000017950  12  690(   1041143)0.000113557  12  560(   1055904)0.000270395  12  da0(   1041278)0.000299401
 32E:  DELTA: 0x5DEEBFFF2C3EC000 26C54851149E853A 12  5c0(   1038624)0.000000000  12  1c0(   1057931)0.000000000  12  ec0(   1057333)0.000000004  12  ac0(   1040326)0.000000223  12  dc0(   1056653)0.000001023  12  6c0(   1041181)0.000149446  12  2c0(   1055845)0.000410364  12  9c0(   1041636)0.003602710
 2AE:  DELTA: 0x46FEBFFF2C3EC000 26C54851149E853A 12  4d0(   1060953)0.000000000  12  610(   1062490)0.000000000  12  760(   1060287)0.000000000  12  7a0(   1058928)0.000000000  12  8d0(   1063077)0.000000000  12  a10(   1062087)0.000000000  12  b60(   1062943)0.000000000  12  ba0(   1061416)0.000000000
 22E:  DELTA: 0x300EBFFF2C3EC000 26C54851149E853A 12  670(   1057881)0.000000000  12  740(   1039293)0.000000000  12  580(   1040335)0.000000239  -6    c(  67168834)0.000000783  12  570(   1056351)0.000010320  -6   2c(  67053163)0.000034817  12  f40(   1056076)0.000078670  12  d70(   1041141)0.000111924
checking pair with DELTA: 0x50007FFF5C726000 B744241DE30B69D1 si0: 0x6B787EBE72991EEA 8289751974A2A05F,0x0BBD42A86319A4D6 1545348A20D29269 si1: 0x1B77FEBF1626BEE9 CB4550FB9197368E,0xA41D07785110511F 469B72942EA5D15D
checking pair with DELTA: 0x4AC8DFFF76E85FFF 5317B9650AE16599 si0: 0x87FB5321F119AC7A C56465440DFD6BBB,0x1BC3CB3CF5D7B874 DACCF5D5C924B5AF si1: 0x3D3273227A314C7B 724CABDF031C0622,0x07EE1FA4549F8160 7A648121E3A1BD43
checking pair with DELTA: 0x0D78BFFFF0FCC000 9F272846B576856A si0: 0x114FA1E4F4E52E77 6CF38EB5E44CC81B,0x79DA1F7A3E715F59 889363DD36B59C95 si1: 0x03D6E1E503E86E76 CDCC666F2ED642B1,0x4998AE764CF9EA36 A31FDF6D2069DB1D
 2F9:  DELTA: 0xF300DFFFF0FCC000 9F272846B576856A 1834a80(     17652)0.000000000  16  540(     68674)0.000000000  16  a80(     68749)0.000000000  16 1740(     68320)0.000000000  16 1a40(     68301)0.000000000  16 2740(     68147)0.000000000  16 2a40(     68350)0.000000000  16 3540(     68046)0.000000000
 3F9:  DELTA: 0xA420DFFFF0FCC000 9F272846B576856A 14 2000(    266373)0.000000215  16 3000(     63697)0.002678040  14    0(    258663)0.011878566  16 8000(     63794)0.041411834  14 1000(    265545)0.041734621
checking pair with DELTA: 0xBBE320004FF76000 82454C2874A21EF3 si0: 0x7BBAE3DB04A0B4DE 5D7DAF8022BF2D34,0xB6D14DEA456CF025 AF3C8E5BC500FBDD si1: 0xBFD7C3DAB4A954DD DB386357AE1D0E41,0x2E36E70296EAF807 09177110A56946B9
 109:  DELTA: 0xA338C0004FF76000 82454C2874A21EF3 -6   24(  67168303)0.000001275  -6    4(  67058944)0.003937258
 309:  DELTA: 0x0478C0004FF76000 82454C2874A21EF3 16 1000(     68200)0.000000000  16 6000(     67912)0.000000000  16 9000(     68422)0.000000000  16 a000(     68296)0.000000000  16 e000(     68322)0.000000000  14 1000(    271534)0.000000000  14 2000(    272358)0.000000000  14    0(    247561)0.000000000
 189:  DELTA: 0x3B88C0004FF76000 82454C2874A21EF3 14 3523(    266414)0.000000110  14  523(    266255)0.000001424  14 17a3(    266247)0.000001616  14 3e23(    266166)0.000005727  14  52f(    266127)0.000010440  -8   e8(  16806671)0.000011400  14 34a3(    266115)0.000012545  14  223(    266025)0.000048894
 389:  DELTA: 0x9CC8C0004FF76000 82454C2874A21EF3 14 239c(    265963)0.000122652  -8   f8(  16749417)0.000200331  14  f9c(    265926)0.000210908  14 3cac(    265900)0.000307758  -8   38(  16803599)0.002137644  -8   78(  16751532)0.006391691  14   ac(    258625)0.007053808  14 1c6c(    265614)0.016667751
checking pair with DELTA: 0x61F81FFFD0A1A000 0EDB22A63E0AD996 si0: 0x1429D0EE8B94E764 E8B02AFD0694AD82,0x17D8D7C4708639FC F80DBC31B69114A5 si1: 0xB231B0EEBAF34764 D9D50856C889D3EC,0x79A6C972B0378F58 9787503C7F7BA21D
 12C:  DELTA: 0x41639FFFD0A1A000 0EDB22A63E0AD996 14  82c(    267011)0.000000000  14 1bdc(    257159)0.000000000  14 282c(    257273)0.000000000  14 275c(    257468)0.000000000  14 342c(    266806)0.000000000  14 182c(    257848)0.000000050  14 27dc(    257918)0.000000159  14  75c(    266276)0.000001021
  2C:  DELTA: 0xBCC39FFFD0A1A000 0EDB22A63E0AD996 14  65c(    257898)0.000000114  14 19ac(    266391)0.000000161  14 15ac(    266117)0.000012167  14 29ac(    265981)0.000094048  14 332c(    258649)0.009809622  14 365c(    258698)0.019104299  14 3a5c(    258709)0.022160153
 2EC:  DELTA: 0x697B9FFFD0A1A000 0EDB22A63E0AD996 14 202b(    265734)0.003239245
 32C:  DELTA: 0x4AA39FFFD0A1A000 0EDB22A63E0AD996 14 20ac(    266407)0.000000124  14 2fec(    258349)0.000135828  14 3e6c(    258354)0.000146280  14 10dc(    258396)0.000271643  14 30dc(    265864)0.000517196  14 2f2c(    258539)0.002125153  14  aec(    265729)0.003471746  14 3fec(    265693)0.005703141
 2AC:  DELTA: 0x88539FFFD0A1A000 0EDB22A63E0AD996 16 2c00(     67872)0.000000001  16 ec00(     67853)0.000000001  16 9c00(     63245)0.000000001  16 ac00(     63262)0.000000002  16 6c00(     63316)0.000000014  16 5c00(     63339)0.000000031  16 1c00(     67614)0.000003283  16 dc00(     67613)0.000003389
 22C:  DELTA: 0xC6039FFFD0A1A000 0EDB22A63E0AD996 14  56c(    266846)0.000000000  14  3ec(    266323)0.000000482  14  96c(    266299)0.000000708  14 105c(    258276)0.000045522  14 13dc(    258308)0.000073692  14 3a9c(    265995)0.000076435  14  c1c(    265955)0.000137963  14  a9c(    265801)0.001268171
checking pair with DELTA: 0xAEE04000493CBFFF BD570161F1BB9245 si0: 0x3232CB36E1EE4D2A A6E5C02AEC015417,0x5D4AADB8CE05CB16 60BFF633BE1922B7 si1: 0x83528B3698B18D2A E98EBEC8FA45C1D2,0x7023209E9838B9CB EDC7AD26E9D4B33B
  FA:  DELTA: 0x91C80000493CBFFF BD570161F1BB9245 16    0(     67945)0.000000000  14 1270(    267234)0.000000000  12   80(   1061964)0.000000000  12  1b0(   1059972)0.000000000  12  270(   1062824)0.000000000  12  34c(   1059060)0.000000000  12  40f(   1058356)0.000000000  12  540(   1062857)0.000000000
checking pair with DELTA: 0xF1602000180CDFFF DDB3EEC70873FC23 si0: 0xCD641FA73D4FFD4D C1BFB916CB540340,0xD46851900454DC91 BE71D8204442D5E5 si1: 0xDC03FFA725431D4D E40BCA4FC2E0071D,0xDD28B58BCA390DD0 200B20AFEAB32681
 149:  DELTA: 0x46ACC000180CDFFF DDB3EEC70873FC23 -8   40(  16748835)0.000073773  -8   48(  16804298)0.000682072  14 161c(    265780)0.001704525  12  e5c(   1041888)0.019305183
  49:  DELTA: 0xBF0CC000180CDFFF DDB3EEC70873FC23 12  1ac(   1058326)0.000000000  12  dac(   1059512)0.000000000  -6    0(  67180422)0.000000000  -6   20(  67037873)0.000000000  12  5ac(   1039647)0.000000001  12  e5c(   1056936)0.000000108  12  9ac(   1040522)0.000001052  12  25c(   1056402)0.000007026
 249:  DELTA: 0xCE4CC000180CDFFF DDB3EEC70873FC23 12  16c(   1039873)0.000000005  12  69c(   1056832)0.000000250  12  920(   1040498)0.000000872  12  750(   1040800)0.000008950  12  56c(   1055931)0.000223162  12  f50(   1055478)0.005111395  12  e9c(   1041846)0.014654417  12  350(   1055194)0.033027892
  C9:  DELTA: 0x82DCC000180CDFFF DDB3EEC70873FC23 16    0(     67311)0.024882147
 209:  DELTA: 0xEC64C000180CDFFF DDB3EEC70873FC23 14  edb(    258766)0.047457856
 3C9:  DELTA: 0x19BCC000180CDFFF DDB3EEC70873FC23 12  e7c(   1059061)0.000000000  12  a7c(   1039130)0.000000000  12  98c(   1039273)0.000000000  12  58c(   1039733)0.000000002  12  880(   1057062)0.000000039  12  c80(   1040191)0.000000075  12  27c(   1056979)0.000000077  12  18c(   1056842)0.000000230
 1C9:  DELTA: 0x0A7CC000180CDFFF DDB3EEC70873FC23 12   b0(   1058704)0.000000000  12  870(   1058321)0.000000000  12  cb0(   1058044)0.000000000  12   70(   1039197)0.000000000  12  380(   1057886)0.000000000  12  3b0(   1039348)0.000000000  12  470(   1057815)0.000000000  12  770(   1039485)0.000000000
checking pair with DELTA: 0x4BAEDFFF3117BFFF 2C6645F93365ECEA si0: 0xE40A9D2B0465523F 42BC3B3B97D36E8D,0xBB93AB3A34C78A9A 64593536E5C52775 si1: 0x985BBD2BD34D9240 1655F542646D81A3,0x3AB79415CB425F3D 57648C6C00E22BFD
 194:  DELTA: 0x51A95FFF3117BFFF 2C6645F93365ECEA 12  d39(   1041947)0.028354029
checking pair with DELTA: 0x98E3E00009FCA000 0A0A41967884F5D6 si0: 0xDF60BAED6D489376 8C513233A6ED5E42,0x3955813EEBEECC4A 641A81300E22E3A5 si1: 0x467CDAED634BF376 8246F09D2E68686C,0x5ACFED56DDFFBB1B A8D2EF5151DF4E1D
  FF:  DELTA: 0xBD3F400009FCA000 0A0A41967884F5D6 12  507(   1041478)0.001219788
 11F:  DELTA: 0xC5D3400009FCA000 0A0A41967884F5D6 12  5fe(   1041919)0.023635555
 31F:  DELTA: 0x4F13400009FCA000 0A0A41967884F5D6 -6   33(  67163521)0.000085788
  9F:  DELTA: 0xA383400009FCA000 0A0A41967884F5D6 12  eee(   1041253)0.000250523  12  d1e(   1055609)0.002107609  12  dee(   1041717)0.006220077
 37F:  DELTA: 0x68CF400009FCA000 0A0A41967884F5D6 12   db(   1058017)0.000000000  12   e7(   1058942)0.000000000  12  7db(   1059479)0.000000000  12  827(   1059615)0.000000000  12  ce7(   1058144)0.000000000  12  f27(   1058944)0.000000000  -8   66(  16826582)0.000000000  -8   e6(  16822338)0.000000000
 27F:  DELTA: 0x242F400009FCA000 0A0A41967884F5D6 14 34a7(    267131)0.000000000  12   5b(   1064451)0.000000000  12   bb(   1060051)0.000000000  12  137(   1064641)0.000000000  12  14b(   1059721)0.000000000  12  1cb(   1063913)0.000000000  12  227(   1058434)0.000000000  12  237(   1062931)0.000000000
  7F:  DELTA: 0x9AEF400009FCA000 0A0A41967884F5D6 -6   12(  67240251)0.000000000  -6   32(  66978036)0.000000000  -8   52(  16812266)0.000000000  -8   b2(  16742497)0.000000000  12  997(   1057305)0.000000005  -8   72(  16743940)0.000000008  -8   12(  16809826)0.000000030  12  d6b(   1040123)0.000000043
 29F:  DELTA: 0x2CC3400009FCA000 0A0A41967884F5D6 12  61e(   1041497)0.001391186  12  729(   1055557)0.003001539  12  91e(   1041758)0.008181266  12  f29(   1041977)0.034431422
 2FF:  DELTA: 0x467F400009FCA000 0A0A41967884F5D6 -6    a(  67162084)0.000284374  12  c17(   1041969)0.032696417
  3F:  DELTA: 0x89C7400009FCA000 0A0A41967884F5D6 -6   10(  67060938)0.017930280
 21F:  DELTA: 0x0A73400009FCA000 0A0A41967884F5D6 12  4c1(   1055397)0.008769471
 1FF:  DELTA: 0x01DF400009FCA000 0A0A41967884F5D6 -6   12(  67158377)0.005435176  12  ceb(   1055462)0.005689278
  5F:  DELTA: 0x925B400009FCA000 0A0A41967884F5D6 12  8c6(   1055190)0.033889247
 3FF:  DELTA: 0x8B1F400009FCA000 0A0A41967884F5D6 -6    6(  67166263)0.000008006  -6   26(  67061803)0.033983055
 17F:  DELTA: 0xDF8F400009FCA000 0A0A41967884F5D6 12   17(   1059398)0.000000000  12   c7(   1058635)0.000000000  12   f7(   1058936)0.000000000  12  15b(   1059633)0.000000000  12  1a7(   1061491)0.000000000  12  25b(   1061656)0.000000000  12  2a7(   1061160)0.000000000  12  307(   1058587)0.000000000
checking pair with DELTA: 0x9C385FFFD517C000 306C5C1B4D5AB7C6 si0: 0x3FEAE99F258B5DE7 09F745C8569D55EC,0x7F3878FC16BE1A2D 2DD9E922BA431F8B si1: 0xA3B2899F50739DE6 D98AE9AD09429E26,0xAAC4D1CD87D69C2E 70FEC2F54B2666C3
 2D6:  DELTA: 0x2D609FFFD517C000 306C5C1B4D5AB7C6 14 3eb0(    266018)0.000054283
checking pair with DELTA: 0x886D80002EBC1FFF EBA31452485D63B9 si0: 0xBF85BAB7F7A88915 DD24DFCE63656BDD,0x1D5F2A9E89966A03 3142879BE63649DD si1: 0x37183AB7C8EC6915 F181CB7C1B080824,0xC92276BC926F7002 47E39578BCDCFFF1
checking pair with DELTA: 0x3043000012E7A000 4F55B89E15612A9D si0: 0xB52CE7856DE4DCAE 576D37344D0DE947,0x9055C7B98826164F 4FAAEB6EB2A3DC9F si1: 0x84E9E7855AFD3CAE 08177E9637ACBEAA,0xA6DE91805DB5FD1D B80EAD3EBF24CB03
checking pair with DELTA: 0xDB780000301CDFFF EFC16152F51E792F si0: 0x63684BE2D0185347 9CCA1B957F53D05A,0x80FA168A7F074369 F0A5BC2519C96861 si1: 0x87F04BE29FFB7347 AD08BA428A35572B,0x5B97EE447686113E FB2C8C166671C56D
  8B:  DELTA: 0x37BD6000301CDFFF EFC16152F51E792F 14 3f77(    257778)0.000000015  14 1f77(    266226)0.000002249  -8   f6(  16747303)0.000004836  14   8b(    258467)0.000761714  14  e4b(    258533)0.001952468  14 208b(    265599)0.020378990  -8   b6(  16801849)0.032928680  14 1e4b(    265538)0.045762519
 1CB:  DELTA: 0x1AE56000301CDFFF EFC16152F51E792F -8   90(  16809736)0.000000035  -8   d0(  16749917)0.000465123  -8   10(  16803470)0.002631082
 28B:  DELTA: 0x6FFD6000301CDFFF EFC16152F51E792F -8   3a(  16812939)0.000000000  -8   fa(  16744839)0.000000045  -8   7a(  16749044)0.000105850  -8   92(  16749408)0.000197290  -8   12(  16750130)0.000662933  -8   ba(  16802080)0.023187546  -8   52(  16802065)0.023723934
 3CB:  DELTA: 0x53256000301CDFFF EFC16152F51E792F 14  1ec(    266870)0.000000000  14  bec(    267106)0.000000000  14  e1c(    267279)0.000000000  14 149c(    266995)0.000000000  14 18dc(    267103)0.000000000  14 249c(    267132)0.000000000  14 3e1c(    267193)0.000000000  14  49c(    257280)0.000000000
 38B:  DELTA: 0x8C1D6000301CDFFF EFC16152F51E792F 14 3f07(    265916)0.000243974  14 1c0b(    265914)0.000251174  14  c0b(    258412)0.000343271  14 38b7(    265749)0.002629606  14 20fb(    258612)0.005894475  14 23f7(    265565)0.032043522  14 30fb(    265540)0.044574402
 18B:  DELTA: 0x53DD6000301CDFFF EFC16152F51E792F 14 3c67(    266192)0.000003825  14  39b(    266165)0.000005816  -8   d2(  16747988)0.000016635  14 2f9b(    266035)0.000042097  14 3067(    258273)0.000043503  14 2c67(    258349)0.000135828  14  7d7(    265938)0.000177004  -8   92(  16804377)0.000598385
 24B:  DELTA: 0xA8F56000301CDFFF EFC16152F51E792F 14 1480(    265808)0.001148715
checking pair with DELTA: 0xAC575FFFEABDA000 4916E783B7945A0F si0: 0xEA55F2E4610D825A C9C89FD187E1A389,0xA809E3872B037628 4612D5C116F50F8F si1: 0x3DFE92E4764FE25A 80B1B84DD04D497A,0x88A56FF680022BCF 6C25989E452C8A1B
   B:  DELTA: 0x363E7FFFEABDA000 4916E783B7945A0F -6   12(  67172158)0.000000033

real	519m55,784s
user	13633m42,794s
sys	8m12,787s

Stage 1, filling set with random increments
set size:16384, stream_length:4294967296, top bits:19
Stage 2, Sorting streams
Stage 3, Comparing streams
pairs to consider: 5
checking pair with DELTA: 0xBF6CFFFFE20A6000 1237BFB750ABC5D4 si0: 0x7B36DD275880DB17 CADB2D976848C718,0xF82B95907E57998B 00CC0B074806B4AB si1: 0xBBC9DD2776767B17 B8A36DE0179D0144,0x9B02AF6C89EFC1D7 EA92FCC9A4D2ADBB
clean table 32 18
fill table 32 18 to 4220518
clean table 32 16
fill table 32 16 to 4299161
clean table 32 14
fill table 32 14 to 4613734
clean table 32 12
fill table 32 12 to 5872025
clean table 32 10
fill table 32 10 to 10905190
clean table 32 8
fill table 32 8 to 31037849
clean table 32 6
fill table 32 6 to 111568486
clean table 32 4
fill table 32 4 to 433691033
clean table 32 2
fill table 32 2 to 1722181222
 10E:  DELTA: 0x3D063FFFE20A6000 1237BFB750ABC5D4 12  230(   1058325)0.000000000  12  530(   1057565)0.000000000  12  1f0(   1057502)0.000000000  12  130(   1040174)0.000000009  12  2f0(   1040192)0.000000010  12  930(   1056696)0.000000099  12  af0(   1056646)0.000000146  12  a30(   1040910)0.000002765
checking pair with DELTA: 0x80B2C00052319FFF 85F7EBD9A5CACFEE si0: 0xFE11F43F71412700 F6B7489194A2D4B7,0xE2AF29ED1B20D157 01C834D59668D06F si1: 0x7D5F343F1F0F8701 70BF5CB7EED804C9,0xD8493089E0552D49 D998F77D58F2D3C7
 1F4:  DELTA: 0x6FCE400052319FFF 85F7EBD9A5CACFEE -6    c(  67043292)0.000000000  -6   2c(  67161936)0.000043402
 3F4:  DELTA: 0x608E400052319FFF 85F7EBD9A5CACFEE 12   70(   1042089)0.009535608  12  f70(   1055012)0.014186071  12  3b0(   1042190)0.018029377
 2F4:  DELTA: 0xE82E400052319FFF 85F7EBD9A5CACFEE -6    4(  67198002)0.000000000  -6   24(  67032108)0.000000000  12  ec0(   1040797)0.000001182  -8   44(  16801938)0.003888811  -8   c4(  16801805)0.004755570  12  d0c(   1041987)0.004963123  12  6c0(   1054956)0.020123604  12  ac0(   1054907)0.027260483
  F4:  DELTA: 0xF76E400052319FFF 85F7EBD9A5CACFEE 12  800(   1056020)0.000015939
checking pair with DELTA: 0xB1E3C0007875E000 2C73576CA4DE67DE si0: 0xCB1F2D8DA9A5ACB8 079DA5BC62AB8EF8,0xF136A63903CFB057 99427D8A8F043A5D si1: 0x193B6D8D312FCCB7 DB2A4E4FBDCD271A,0x009C79F3B34D2CD3 22B43E5851BF9275
 1B4:  DELTA: 0x6B7C40007875E000 2C73576CA4DE67DE 14 2b00(    257580)0.000000000  14 1b00(    258140)0.000000760  14 3b00(    265893)0.000046023  14 1700(    258412)0.000046388  14  b00(    265785)0.000214713  14  700(    265705)0.000653519  14 3700(    265657)0.001259998  14 2700(    258731)0.004023711
checking pair with DELTA: 0x45728000B24DC000 1AFDFE5D4232E832 si0: 0x3456181E0244AE3D 9CFEF32202AB3CE3,0xC8CB61ECFAF163DF C41D4E73484C4E1D si1: 0xEEE3981D4FF6EE3D 8200F4C4C07854B1,0xDD8334AD194ECC83 F2A9D06A4F12C145
  46:  DELTA: 0x07CF4000B24DC000 1AFDFE5D4232E832 -6    c(  67065092)0.047214990
 146:  DELTA: 0xD5EF4000B24DC000 1AFDFE5D4232E832 14  700(    267561)0.000000000  14  b00(    267517)0.000000000  14 3700(    268055)0.000000000  14 3b00(    267320)0.000000000  12  700(   1065199)0.000000000  12  b00(   1065864)0.000000000  12  f00(   1031945)0.000000000  12  300(   1035611)0.000000000
 346:  DELTA: 0x722F4000B24DC000 1AFDFE5D4232E832 -6    4(  67216858)0.000000000  -6   24(  67010410)0.000000000  12  1c0(   1038151)0.000000000  12  2c0(   1038787)0.000000000  12  6c0(   1057954)0.000000000  12  5c0(   1057500)0.000000000  12  9c0(   1057053)0.000000006  12  ec0(   1040158)0.000000008

real	118m52,649s
user	2853m41,628s
sys	1m57,080s

A hypothetical example showing that this can be a real issue under the right/wrong circumstances

For research (in physics for example) 5 sigma is a requirement and that is for the experiment not for the PRNG so it would be unreasonable if the PRNG would consume more than 1% of that. If we start from where we left of, we have 115 bits in the seed to begin with, minus 32 as we will consume 2³² from each stream and minus 30 for the ~1% at 5 sigma. That leaves us with 53 bits in the seed, we now add 29 bits from the increment leaving us with 82 bits. These 82 bit would be expected to be affected by a collision if more than half are used (like in a birthday attack). That leaves us with being able to consume 2⁴¹ streams assuming they are perfectly randomly initialized. This compares to an ideal 2×128 bit seeded PRNG where we could consume 2⁹⁷ with the same 5 sigma and 2³² read constraints. How long would a supercomputer need to hit a correlated pair ? If we assume PCG-DXSM needs 10 CPU operations per output and we read 2³² per stream as in our code and a 1000 0000 Terra(fl)op/sec supercomputer is used. It would take about a day. 2⁴¹ * 2³² * 10 / ( 1000 0000 * 1000⁴ sec^-1 * 3600 sec/h * 24 h/day) = 1.09 … days The reader is here also reminded that we have not attempted to exploit the 2nd XOR in DXSM. Nor that the statistical tests used are very sophisticated. And that we are reading here 2³² values for each stream. A system favoring reading more on seemingly correlated sequences and less on others would need less data to run into problems

A workaround

The obvious one, is not to use PCG-DXSM with (increment based) streams. But if continued use of PCG-DXSM with streams is a goal, and if we assume the described issue here represents all the correlated streams (which certainly is not true as we ignored the 2nd XOR and likely have not found all correlation cases beyond that either and also only considered reasonable aligned streams, theres also other little details here and there I knowingly did not pay adequate attention to). Then one can construct increments instead of randomly in a systematic way so they always differ in bits that have been required to be identical in a collision. That can be achieved easily by using a counter fed through some bijective function for some extra pseudo randomness. The remaining bits could then still be set somewhat randomly. With that and 21 bits instead of 19 bits as in the attack removed (I have also found a collision with 2x21bits randomized) we have 127 – 63 – 2×21 = 22 bits left and thus ~4 million increments, given there are 2 odd values modulo 4 there are ~8 million non correlated streams of full 2¹²⁸ period (if we limit our self to what the implementation here can find) thus 2¹²⁸⁺²³. To clarify the differing bits we talk about are in the delta between aligned LCG outputs and not the increment values directly.

A solution

To solve this, first we need to understand why PCG has repeatedly been plagued by stream correlation issues, both the PCG “1.0” and now PCG 2.0 (DXSM). The reason appears to be because when using the increment and the seed and setting them randomly one inevitably challenges the mixer after the LCG with every constant difference. That is fundamentally very similar to feeding a mixer with a counter instead of a LCG. But the mixers seem designed to require more randomness than a counter provides.
The solution to this seems to be to give up on one of the assumptions leading to this, and these are 1. LCG, 2. random use of the increment, 3. a “simple” mixer.
Now if the simple mixer is replaced by a complex one than the LCG can be replaced by a counter more like in XSM64 because if the mixer is strong enough to work with a counter as input then the extra multiplication of the LCG no longer is useful and just slows it down.
I also do believe the solution is to spend more time analyzing the existing PRNGs instead of creating new ones. There are PRNGs that are fast and that have no known issues and seem to have received limited attention in terms of analyzing them. IMHO a new generator should only be created when its more optimal in some way than all other existing generators.

Final words

Iam curious if XSM64 has such issues too. From a quick glance it seemed designed with this sort of “correlation from offset” problem in mind, or at least i was too dumb to immediately produce a correlation the same way it works here. Sadly i did not have enough time to really investigate XSM64 so thats not meant as a recommendation, just a statement of myself being curious about …
I hope this whole blog entry is helpful to someone

More issues

After writing above, it came to my mind that as i already have the code to test for correlations between streams, it would be very easy to test this within the same period / LCG / stream / increment. After all we see correlations between streams that are misaligned by multiplies of 2¹⁰⁹ with different increments. So what about within teh same stream/increment? Testing streams of 2²⁶ length and misaligned by all the 65536 possibilities of the top 16 bits of the 128 bit. The 3 pairs tested found 13409, 15950 and 15360 correlations. I also tried a random subset of the top 32bit instead of 16 but these found no additional correlations with my quick tests. Retesting this with practrand to be more sure my tests are not all faulty. And practrand seems better in detecting these.

    STATET inc1 = 12345;
    STATET inc2 = inc1;
    STATET state1 = 31415;
    STATET state2 = step(state1, inc1, ((STATET)1)<<109);

    int64_t stream_length = 1LL<<60;
    for(int64_t i = 0; i< stream_length; i++) {
        OUTT v1, v2;
        v1 = pcg_dxsm(&state1, multiplier, inc1);
        v2 = pcg_dxsm(&state2, multiplier, inc2);

        v1 ^= v2;
        fwrite(&v1, 8, 1, stdout);
    }

step 1<<109
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 3.6 seconds
  no anomalies in 196 test result(s)

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 8.4 seconds
  no anomalies in 213 test result(s)

rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 16.3 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 29.9 seconds
  no anomalies in 246 test result(s)

rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 52.3 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R=  +4.9  p =  1.8e-4   unusual
  ...and 262 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 94.4 seconds
  no anomalies in 279 test result(s)

rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 179 seconds
  no anomalies in 295 test result(s)

rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 342 seconds
  no anomalies in 311 test result(s)

rng=RNG_stdin64, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 663 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low16/64]FPF-14+6/16:all         R=  -5.9  p =1-2.3e-5   mildly suspicious
  ...and 324 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 64 gigabytes (2^36 bytes), time= 1298 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(15,14-3)             R=  +8.7  p =  2.3e-7   mildly suspicious
  FPF-14+6/16:all                   R=  +5.6  p =  1.0e-4   unusual
  TMFn(2+6):wl                      R= +25.9  p~=   1e-8    very suspicious
  [Low16/64]TMFn(2+4):wl            R= +28.2  p~=   7e-10   very suspicious
  [Low4/64]TMFn(2+2):wl             R= +25.1  p~=   3e-8    suspicious
  ...and 335 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 128 gigabytes (2^37 bytes), time= 2268 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(15,14-2)             R= +15.2  p =  4.3e-13   VERY SUSPICIOUS 
  FPF-14+6/16:all                   R= +10.3  p =  3.9e-9    VERY SUSPICIOUS 
  TMFn(2+6):wl                      R= +58.1  p~=   7e-28     FAIL !!        
  TMFn(2+7):wl                      R= +35.3  p~=   3e-14     FAIL           
  TMFn(2+8):wl                      R= +29.8  p~=   7e-11    VERY SUSPICIOUS 
  [Low16/64]FPF-14+6/16:cross       R=  +5.8  p =  3.2e-5   mildly suspicious
  [Low16/64]TMFn(2+4):wl            R= +58.1  p~=   6e-28     FAIL !!        
  [Low16/64]TMFn(2+5):wl            R= +36.2  p~=   1e-14     FAIL           
  [Low16/64]TMFn(2+6):wl            R= +31.1  p~=   1e-11    VERY SUSPICIOUS 
  [Low16/64]TMFn(2+7):wl            R= +19.3  p~=   1e-5    unusual          
  [Low4/64]TMFn(2+2):wl             R= +59.6  p~=   9e-29     FAIL !!        
  [Low4/64]TMFn(2+3):wl             R= +34.9  p~=   6e-14     FAIL           
  [Low4/64]TMFn(2+4):wl             R= +31.0  p~=   1e-11    VERY SUSPICIOUS 
  ...and 342 test result(s) without anomalies


step 1<<110
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.4 seconds
  no anomalies in 196 test result(s)

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 6.5 seconds
  no anomalies in 213 test result(s)

rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 14.6 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 26.1 seconds
  no anomalies in 246 test result(s)

rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 49.2 seconds
  no anomalies in 263 test result(s)

rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 94.8 seconds
  no anomalies in 279 test result(s)

rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 180 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(16,14-6)             R=  +7.8  p =  5.1e-6   unusual
  ...and 294 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 325 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(16,14-5)             R= +12.5  p =  3.4e-10  very suspicious
  [Low16/64]FPF-14+6/16:all         R=  +4.9  p =  4.5e-4   unusual
  ...and 309 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 599 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(16,14-5)             R= +25.2  p =  8.7e-21    FAIL !
  ...and 324 test result(s) without anomalies


step 1<<111
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.6 seconds
  no anomalies in 196 test result(s)

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 7.7 seconds
  no anomalies in 213 test result(s)

rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 15.1 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 28.4 seconds
  no anomalies in 246 test result(s)

rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 52.3 seconds
  no anomalies in 263 test result(s)

rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 92.2 seconds
  no anomalies in 279 test result(s)

rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 172 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(17,14-7)             R=  +9.1  p =  4.5e-7   mildly suspicious
  [Low16/64]FPF-14+6/16:cross       R=  +7.8  p =  6.4e-7   suspicious
  ...and 293 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 337 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(17,14-6)             R= +16.4  p =  1.4e-12   VERY SUSPICIOUS
  FPF-14+6/16:(20,14-8)             R= +16.1  p =  1.1e-11   VERY SUSPICIOUS
  FPF-14+6/16:cross                 R=  +6.5  p =  8.5e-6   mildly suspicious
  [Low16/64]FPF-14+6/16:cross       R= +18.2  p =  1.2e-15    FAIL !
  ...and 307 test result(s) without anomalies


step 1<<112
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.8 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R=  +7.9  p =  5.5e-7   suspicious
  ...and 195 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 7.9 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R= +16.0  p =  7.3e-14    FAIL
  ...and 212 test result(s) without anomalies


step 1<<115
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.9 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R= +67.0  p =  9.4e-57    FAIL !!!!
  TMFn(2+0):wl                      R= +23.5  p~=   1e-7    suspicious
  [Low16/64]FPF-14+6/16:(4,14-3)    R= +13.9  p =  6.3e-12   VERY SUSPICIOUS
  [Low16/64]FPF-14+6/16:(5,14-4)    R=  +8.1  p =  1.4e-6   unusual
  [Low16/64]FPF-14+6/16:all         R=  +6.4  p =  1.5e-5   mildly suspicious
  [Low16/64]FPF-14+6/16:cross       R=  +6.5  p =  1.3e-5   mildly suspicious
  ...and 190 test result(s) without anomalies


step 1<<120
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.6 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+13,13-9,T)                 R= +26.0  p =  5.7e-7   mildly suspicious
  Gap-16:A                          R=  +6.9  p =  2.5e-5   mildly suspicious
  FPF-14+6/16:(0,14-0)              R=  +7.5  p =  1.6e-6   unusual
  FPF-14+6/16:(1,14-0)              R= +18.0  p =  2.8e-16    FAIL
  FPF-14+6/16:(2,14-0)              R= +16.0  p =  1.8e-14    FAIL
  FPF-14+6/16:(3,14-1)              R= +18.7  p =  2.4e-16    FAIL
  FPF-14+6/16:(4,14-2)              R=  +7.5  p =  2.2e-6   unusual
  FPF-14+6/16:(5,14-2)              R= +10.7  p =  3.7e-9   very suspicious
  FPF-14+6/16:(6,14-3)              R= +10.7  p =  3.8e-9   very suspicious
  FPF-14+6/16:(7,14-4)              R= +25.3  p =  1.4e-20    FAIL !
  FPF-14+6/16:all                   R= +94.7  p =  2.4e-88    FAIL !!!!!
  FPF-14+6/16:cross                 R= +2465  p =  3e-2074    FAIL !!!!!!!!
  TMFn(2+0):wl                      R=+968.3  p~=   6e-576    FAIL !!!!!!!
  TMFn(2+1):wl                      R=+487.2  p~=   2e-286    FAIL !!!!!!
  TMFn(2+2):wl                      R=+232.7  p~=   5e-133    FAIL !!!!!
  [Low16/64]BCFN(2+1,13-4,T)        R= +10.9  p =  1.0e-4   unusual


step 1<<127
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.5 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+0,13-3,T)                  R=+24986  p = 0           FAIL !!!!!!!!
  BCFN(2+1,13-3,T)                  R=+24275  p = 0           FAIL !!!!!!!!
  BCFN(2+2,13-3,T)                  R=+24133  p = 0           FAIL !!!!!!!!
  BCFN(2+3,13-3,T)                  R=+23873  p = 0           FAIL !!!!!!!!
  BCFN(2+4,13-4,T)                  R=+30456  p = 0           FAIL !!!!!!!!
  BCFN(2+5,13-5,T)                  R=+37691  p = 0           FAIL !!!!!!!!
  BCFN(2+6,13-5,T)                  R=+36654  p = 0           FAIL !!!!!!!!
  BCFN(2+7,13-6,T)                  R=+43148  p = 0           FAIL !!!!!!!!
  BCFN(2+8,13-6,T)                  R=+39020  p = 0           FAIL !!!!!!!!
  BCFN(2+9,13-7,T)                  R=+39135  p = 0           FAIL !!!!!!!!
  BCFN(2+10,13-8,T)                 R=+33354  p =  8e-8466    FAIL !!!!!!!!
  BCFN(2+11,13-8,T)                 R=+19688  p =  1e-4997    FAIL !!!!!!!!
  BCFN(2+12,13-9,T)                 R=+11650  p =  7e-2619    FAIL !!!!!!!!
  BCFN(2+13,13-9,T)                 R= +5830  p =  3e-1311    FAIL !!!!!!!!
  DC6-9x1Bytes-1                    R= +2564  p =  1e-1574    FAIL !!!!!!!!
  Gap-16:A                          R=+362.0  p =  3.5e-308   FAIL !!!!!!
  Gap-16:B                          R=+576.2  p =  2.4e-461   FAIL !!!!!!!
  FPF-14+6/16:(2,14-0)              R=+663.1  p =  2.7e-613   FAIL !!!!!!!
...

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 14:27

Comments (0)

2024-01-24

How correlated is pcg-dxsm output?

Correlations based on the lower half of the LCG

Lets try picking some correlated combinations with (using my very simple pcg-dxsm implementation)

Multiplier	15750249268501108917	15750249268501108917	15750249268501108917
Increment	123456789	15750249268624565705	31500498537125674621
Seed	31415	31414	31413
Output 1	0x563B0DB76F45EEC2	0xDEE2CC4990CDBE1E	0xDEE2CC4990CDBE1E
Output 2	0x064C612B5E0A4853	0x064C612B5E0A4853	0x704228ECB09FC1D5
Output 3	0x4727428C93C1A285	0x5A7E3EA39E71C29F	0x5A7E3EA39E71C29F
Output 4	0x2745482D3BDBC1C0	0x2745482D3BDBC1C0	0x15D15980AE293340
Output 5	0x6055DF5FCE178AB2	0xC5DC9DC897A78F1E	0xC5DC9DC897A78F1E
Output 6	0x05DDF19C395C342F	0x05DDF19C395C342F	0x51705B299006C1C1
Output 7	0xD7AFDA4C1654626B	0x89A204161A53CA99	0x89A204161A53CA99
Output 8	0x2C1FCD6649DB8016	0x2C1FCD6649DB8016	0x338710EA7C027382
Output 9	0xEDE28080A228A6DC	0xED272B75717128A4	0xED272B75717128A4
Output 10	0xB05EFD2651E59BC7	0xB05EFD2651E59BC7	0x20911E0B5D6AE551
Output 11	0xD6045D380A566F2D	0xE6584BDE2C470387	0xE6584BDE2C470387
Output 12	0x1BF91549D870370F	0x1BF91549D870370F	0xF434399E25ED24B5
Output 13	0x1D9EB55CB071B60A	0x1DA3A160F1C93F66	0x1DA3A160F1C93F66
Output 14	0x892A16096BF94286	0x892A16096BF94286	0x8F5678078367E87A
Output 15	0xA7292EEE7AF4777F	0x281913D2D3F2DFC5	0x281913D2D3F2DFC5
Output 16	0x08AB29210A0B6FF1	0x08AB29210A0B6FF1	0xC0DD5B88BF6885A3
Output 17	0x20E581A9EA8BF155	0xB17BBB3957418DBB	0xB17BBB3957418DBB
Output 18	0x768E7BD7B5223C36	0x768E7BD7B5223C36	0x086B7F8F81A5CEDA
Output 19	0xF67313E1E5454A96	0xD3026BC096FEDDE2	0xD3026BC096FEDDE2
Output 20	0xAB091F6C4C169476	0xAB091F6C4C169476	0xF29E461BBD7C6952
Output 21	0xB18D1BC34F78E4D1	0x205033BF4C77D237	0x205033BF4C77D237
Output 22	0x5BBCD31929B10ED3	0x5BBCD31929B10ED3	0xFDCB8D05F8C862FD
Output 23	0xC64AFD5DFC7654A7	0x54C875057AAA9DCD	0x54C875057AAA9DCD
Output 24	0x34866A3DF2054E13	0x34866A3DF2054E13	0x4C9D903AB9738D79
Output 25	0x29CA92D156E93DD4	0x1174B427B597C82C	0x1174B427B597C82C
Output 26	0x46B0D6BFFE3BD1C3	0x46B0D6BFFE3BD1C3	0x18A4BCCC15242755
Output 27	0x74DF62C8A26F65CD	0xF5CAC48CFC7B4D47	0xF5CAC48CFC7B4D47
Output 28	0xE737155A33FC916F	0xE737155A33FC916F	0x2A8E83BF16B3C6F5
Output 29	0x0E0E87DB2724235B	0x82EA039F9945EE0D	0x82EA039F9945EE0D
Output 30	0x2367AB4D5FA3ABBE	0x2367AB4D5FA3ABBE	0x274FC0D71A60FF82

Besides the exactly equal ones, the more careful reader will have noticed that each line either has even or odd entries, so the 3 seed/increment pairs generate equal results in the lowest bit.

You might ask why does this all happen? Well, the identical numbers happen because the lowest bit is ored away in DXSM, that is forced to be 1. So any cases differing only by that bit at the LCG stage become identical in later stages. The other similarities happen because the mixing isnt as good as it needs to be to remove all differences from very similar input and LCGs can produce such very similar input when one picks the “wrong” parameters.

Did i hand pick the worst ? Is that the only case ?
lets see

Multiplier	15750249268501108917		15750249268501108917
Increment	123456789		1134925067785824341058664979018534165
Seed	31415		340282366920938463463302549837730314935
	0x56	3B0DB76F45EEC2	0x04	3B0DB76F45EEC2
	0x06	4C612B5E0A4853	0xC7	4C612B5E0A4853
	0x47	27428C93C1A285	0x54	27428C93C1A285
	0x27	45482D3BDBC1C0	0xE7	45482D3BDBC1C0
	0x60	55DF5FCE178AB2	0x96	55DF5FCE178AB2
	0x05	DDF19C395C342F	0xCE	DDF19C395C342F
	0xD7	AFDA4C1654626B	0xEE	AFDA4C1654626B
	0x2C	1FCD6649DB8016	0xE2	1FCD6649DB8016
	0xED	E28080A228A6DC	0xD1	E28080A228A6DC
	0xB0	5EFD2651E59BC7	0x75	5EFD2651E59BC7
	0xD6	045D380A566F2D	0x03	045D380A566F2D
	0x1B	F91549D870370F	0xEE	F91549D870370F
	0x1D	9EB55CB071B60A	0xCB	9EB55CB071B60A
	0x89	2A16096BF94286	0x83	2A16096BF94286
	0xA7	292EEE7AF4777F	0xCA	292EEE7AF4777F
	0x08	AB29210A0B6FF1	0xE1	AB29210A0B6FF1
	0x20	E581A9EA8BF155	0x53	E581A9EA8BF155
	0x76	8E7BD7B5223C36	0xC8	8E7BD7B5223C36
	0xF6	7313E1E5454A96	0x9C	7313E1E5454A96
	0xAB	091F6C4C169476	0x19	091F6C4C169476
	0xB1	8D1BC34F78E4D1	0x64	8D1BC34F78E4D1
	0x5B	BCD31929B10ED3	0x70	BCD31929B10ED3
	0xC6	4AFD5DFC7654A7	0x59	4AFD5DFC7654A7
	0x34	866A3DF2054E13	0xE7	866A3DF2054E13
	0x29	CA92D156E93DD4	0x55	CA92D156E93DD4
	0x46	B0D6BFFE3BD1C3	0x0F	B0D6BFFE3BD1C3
	0x74	DF62C8A26F65CD	0x31	DF62C8A26F65CD
	0xE7	37155A33FC916F	0xAA	37155A33FC916F
	0x0E	0E87DB2724235B	0x67	0E87DB2724235B
	0x23	67AB4D5FA3ABBE	0x05	67AB4D5FA3ABBE

Here 7 out of 8 bytes match apparently for each. Is that the worst ? no we could make part of the first byte match too, it just becomes harder to color the table to match that.

Correlations based on the upper half of the LCG

Update 2024-03-02, After thynson noticed that the above 2 classes of correlations, all have multiplier-1 as a coresidual of the increment. Ive looked more and found more classes of correlations.
The above 2 cases produce correlations because they introduce 2 specific classes of differences in the lower half of the underlying LCG, we can also introduce other specific differences in the upper half to create correlations. I first thought the mixing stage would make this difficult, but it doesnt seem that difficult.

Multiplier	15750249268501108917		15750249268501108917		15750249268501108917		15750249268501108917
Increment	123456789		188175727042032768542098945617541451029		94087863521016384271049472808832453909		94087863521016384271049472808832453909
Seed	31415		170141183500083312988819472512656112311		85070591750041656494409736256328071863		255211775210510888226097039972212177591
Output1	0x563B0DB76F45E	EC2	0x4EAC9CA5AB826	EC2	0x5273D52E8D642	EC2	0x8BA757339BF3B	0C4
Output2	0x064C612B5E0A4	853	0x507FC7CA7B00C	853	0x7CF75FC4DBFEB	B3C	0x6132ADDBCF8F0	853
Output3	0x4727428C93C1A	285	0xE8681791D88ED	60B	0x821B65AAAE9D3	074	0x56EBE12BED519	60B
Output4	0x2745482D3BDBC	1C0	0x7D02EE9F8B954	1C0	0x7C6674F413FF0	1C0	0x064DB7AF79614	8E3
Output5	0x6055DF5FCE178	AB2	0x5284D85445650	AB2	0xCB9C54CE810BC	AB2	0x89BF5C086BDE9	447
Output6	0x05DDF19C395C3	42F	0xDA6CAC69AB735	98E	0x979601E7DBD3B	9D1	0x41C4B15A978E7	42F
Output7	0xD7AFDA4C16546	26B	0xD22DEF5B1E600	21E	0x81EE59290B934	21E	0xED6327A2A5747	2A1
Output8	0x2C1FCD6649DB8	016	0x3CD4A9AF3862F	39A	0x12F825A58C46C	679	0x798CE3702E873	39A
Output9	0xEDE28080A228A	6DC	0x8443929D63A67	C1C	0x02CE2330E2663	F1F	0xE422EEFF6C846	6DC
Output10	0xB05EFD2651E59	BC7	0x5BB46EBB794BC	378	0x9EC419F1B9850	378	0x3DD8F2BA415FE	2CE
Output11	0xD6045D380A566	F2D	0x909115D376D6E	F2D	0xB34AB985C096A	F2D	0xA49969ED9AE37	EBF
Output12	0x1BF91549D8703	70F	0x8B81A72468CE6	2BB	0x05043537DB072	5E2	0x20046693FD23A	2BB
Output13	0x1D9EB55CB071B	60A	0x264C5A17BA703	60A	0x458D7586D42BD	7B7	0x21F587BA3570F	60A
Output14	0x892A16096BF94	286	0x97EA766824D8C	286	0x908A4638C8690	286	0x8B8230900DC31	894
Output15	0xA7292EEE7AF47	77F	0x45B865A22102D	671	0xED910677E00EA	244	0xB9F1AF9C302C3	77F
Output16	0x08AB29210A0B6	FF1	0x66F131DB875BA	123	0xF5F9A395C5C1A	FF1	0x035C3B9F6264E	A8C
Output17	0x20E581A9EA8BF	155	0x2B7CBD378A1FB	44A	0x835FF11C3FF63	155	0xD6C0C0A58FD86	96A
Output18	0x768E7BD7B5223	C36	0x8282FAEF4390B	C36	0x7C88BB637C597	C36	0x4E30CF973257E	263
Output19	0xF67313E1E5454	A96	0xBCADB3276FBCA	D92	0x6A4F6314C50F9	11D	0x0DCC176A07468	A96
Output20	0xAB091F6C4C169	476	0x7C6C880401881	476	0x651E3C4FDC40D	476	0x87103D0FEAEBB	5A9
Output21	0xB18D1BC34F78E	4D1	0x182EF712B7CC9	6ED	0x632712BFD8439	536	0x7D1586E9295E2	4D1
Output22	0x5BBCD31929B10	ED3	0x177CCDEF531D3	167	0xC86CF8BD98EEF	167	0x6A7959BA3EC3B	1E5
Output23	0xC64AFD5DFC765	4A7	0xDBFE7B848484D	4A7	0x5124BC71407D9	4A7	0x7317F810CA105	929
Output24	0x34866A3DF2054	E13	0x2B7F37E87E70E	BA7	0x87E4C62AA01E0	E13	0xA9077B94F6670	644
Output25	0x29CA92D156E93	DD4	0x3AE2C3E7ABF63	199	0xBCF4EEF02BA57	199	0x190804CAE2166	D48
Output26	0x46B0D6BFFE3BD	1C3	0x714BB6B1E3FC4	96F	0x5E459BFFC60D8	F95	0xB9A7262288870	96F
Output27	0x74DF62C8A26F6	5CD	0x25B37CDE46C28	A09	0x4111E2836E4B0	090	0x01289C6FC845C	A09
Output28	0xE737155A33FC9	16F	0x8CB957BC0E0A1	16F	0xAEB26C1EB1942	EC4	0x1475F42946F5D	16F
Output29	0x0E0E87DB27242	35B	0xF94895647AA13	EF5	0xCA2ACE10F845E	35B	0x5234DF04DB0C2	7DE
Output30	0x2367AB4D5FA3A	BBE	0x56F8654413BC2	BBE	0xE69CD0B7EE3B4	D08	0x899F4E5205976	BBE

Above attacks the first stage of the mixer, we can also attack the 2nd stage, it looks less impressive but none the less instead of 1 identical byte in 256 we see alot more

Multiplier	15750249268501108917		15750249268501108917		15750249268501108917
Increment	123456789		271162317658692578196538916254707862805		152861171144907760414978380921711611157
Seed	31415		39621334812049856571820243639		9905333703012464142955084471
Output1	0x563B0DB76F45EE	C2	0x0083130D590C82	C2	0x56399A48F66C29	C2
Output2	0x064C612B5E0A48	53	0xAFFAF7C625B7A0	55	0x426A227A69AC8B	55
Output3	0x4727428C93C1A2	85	0x3BFE265FC469DD	34	0xDF23B38901C67B	5C
Output4	0x2745482D3BDBC1	C0	0x28B8CF040CE59E	23	0xF2C12F43D3C5BC	98
Output5	0x6055DF5FCE178A	B2	0xEB4BAAA8C260D7	AD	0x52224A76A9AB37	DF
Output6	0x05DDF19C395C34	2F	0xD65E464613ADA6	A2	0xCF042916ADA3DB	F9
Output7	0xD7AFDA4C165462	6B	0x9FE6DE769FFF8E	1E	0xB816350D4D04B3	1E
Output8	0x2C1FCD6649DB80	16	0x754DC19986C65F	9A	0xCFC556D81AB400	D2
Output9	0xEDE28080A228A6	DC	0xD249BACC4FF71F	15	0xC03E7BF3E7423B	A8
Output10	0xB05EFD2651E59B	C7	0x794EA34DECBCC4	B4	0x9448CB72344E34	D5
Output11	0xD6045D380A566F	2D	0x2BB1BB3BA4DDCF	4A	0xC2956A532B866C	AB
Output12	0x1BF91549D87037	0F	0x8483638585B29A	C1	0xE085893BCA2906	D5
Output13	0x1D9EB55CB071B6	0A	0xB54C78D8826582	0A	0x11B567A953D0E2	2F
Output14	0x892A16096BF942	86	0x073360B2311E2E	86	0x0DD1F9F1A63F27	80
Output15	0xA7292EEE7AF477	7F	0x47C425E350061C	EF	0x06263999BE6442	AE
Output16	0x08AB29210A0B6F	F1	0xB6894B4E36C7FD	52	0xD38C235183235A	F9
Output17	0x20E581A9EA8BF1	55	0x08854CD5C6D98B	F9	0x0CD106F83B232A	55
Output18	0x768E7BD7B5223C	36	0xA1DDA8C4BE25F5	25	0xD0B32BB399C62F	36
Output19	0xF67313E1E5454A	96	0xCEFA2E1F1711E0	59	0x31AF5438CE9EA2	5F
Output20	0xAB091F6C4C1694	76	0x67110439B8E660	76	0x5FD39652AC17B6	A9
Output21	0xB18D1BC34F78E4	D1	0xBB66DD962205EA	F1	0x4BE1680D18C928	B1
Output22	0x5BBCD31929B10E	D3	0xFC0D37393C8893	EA	0xEB5BAA64A129E1	D3
Output23	0xC64AFD5DFC7654	A7	0x87EC987780AF59	FA	0xB31BA2E5123C18	CF
Output24	0x34866A3DF2054E	13	0xDB102ADD9EA7D7	0A	0x8885FA71C030D2	E7
Output25	0x29CA92D156E93D	D4	0xDCDD9AA636484C	7A	0xDD95314A5E5CA0	8D
Output26	0x46B0D6BFFE3BD1	C3	0x1669656599FB38	9E	0xF0C25ACE4573A6	35
Output27	0x74DF62C8A26F65	CD	0x72DFFFE5F48C19	5B	0xC494B0FF934BE3	46
Output28	0xE737155A33FC91	6F	0xBE560D45FF9E8B	C9	0x2D8821EB759B1F	50
Output29	0x0E0E87DB272423	5B	0xCB29008A08DC1A	DC	0xD712484F90D6E4	72
Output30	0x2367AB4D5FA3AB	BE	0x58E6BE59AE99D8	4B	0x7C289593C90335	F1

We can also use this to generate more than 8bit matches, but they are spread out more so it doesnt show up nicely in the first 30 output values.
For example ./pcg-dxsm-64 15750249268501108917 164824157741578258584156320684186520853 3046549733826659956002159287 1000000 16 and ./pcg-dxsm-64 15750249268501108917 123456789 31415 1000000 16 produce a million values, we expect the low 20 bits to match in one case in a million on average but these have over a thousand such matches

Update 2024-03-03: Verifying that our PCG64-DXSM implementation is correct

./pcg-dxsm-64  15750249268501108917 123456789 31415 10 10
 6213575192384761538
  453844501346797651
 5127139872326394501
 2829747299987210688
 6941700003234024114
  422759593322230831
15541880859405869675
 3179485701156339734
17141404421690271452
12708873539510705095

python3
from randomgen import PCG64
g = PCG64(2**128+31415-123456789,123456788/2,variant="dxsm",mode="legacy"); 
for i in range(10): print(g.random_raw())
6213575192384761538
453844501346797651
5127139872326394501
2829747299987210688
6941700003234024114
422759593322230831
15541880859405869675
3179485701156339734
17141404421690271452
12708873539510705095

Update 2024-03-04: Add correlations with equal seeds

Correlations with equal seeds

For the case of equal seeds it must be first said that different pcg64-dxsm implementations mangle the seed in different ways. Thus whats equal in one is not equal in another. We show examples of equal seeds at the LCG level first, this is how the previous examples defined the seed.

Multiplier	15750249268501108917	15750249268501108917
Increment	340282366920937968669293837469431615317	340282366920937968685044086737932724233
Seed	31415	31415
Output1	0x0000000000000000	0x0000000000000000
Output2	0x0000000000000000	0x5542869E099E356E
Output3	0xCA1A90080FC41D1A	0x7CC2BF122332851C
Output4	0x7CC2BF122332851C	0xCB8D59E2E4C1218A
Output5	0x443C52CD5384C766	0x4E16E6CE9CA5478E
Output6	0x4E16E6CE9CA5478E	0x79339B1DB09444E9
Output7	0x6F763957F4353DDB	0x10EE4E400E4F428E
Output8	0x10EE4E400E4F428E	0x5F9CFCB33A3A60EF
Output9	0x4418DDB3F1FE88B1	0x2B907893DCD59873
Output10	0x2B907893DCD59873	0xC984F2B06253372B
Output11	0xA38FA95F35AB6739	0x3E6AE114E83A3619
Output12	0x3E6AE114E83A3619	0x52946D7362EEAA86
Output13	0x0E4B57A85E8670EA	0x04769CEADA5CC799
Output14	0x04769CEADA5CC799	0x2E71EE38F3683F6C
Output15	0xED4F32C1AF264384	0x54C64408D6FABE08
Output16	0x54C64408D6FABE08	0x6B65E69BB07E56F6
Output17	0xDAD0B79C5C6EFC2A	0xAB06E3184A632E04
Output18	0xAB06E3184A632E04	0x88E36EABEA546B63
Output19	0xE563E5FD7697D091	0x2709664DC434EDE3
Output20	0x2709664DC434EDE3	0x6277238F88A1D8D6
Output21	0x74B88A9EDBCF94FA	0xF4B4B8A667253240
Output22	0xF4B4B8A667253240	0x2B0DF01AF7F68886
Output23	0x23906462E660A212	0x9591B3C5A81F1E01
Output24	0x9591B3C5A81F1E01	0x5A67803048FA88EC
Output25	0x8C3FDB1C4F568714	0xEA48F0F9B2EE760B
Output26	0xEA48F0F9B2EE760B	0xA5785DE15372C0A0
Output27	0x6FB95CE223AFFEE0	0xE7F4EEB0FD473A91
Output28	0xE7F4EEB0FD473A91	0x83886CE76BF8FCDA
Output29	0xAE3D319DFBE64AF6	0xEBA429DA1AF5EE38
Output30	0xEBA429DA1AF5EE38	0x2E10BCE8F08A03B7

And another example:

Multiplier	15750249268501108917		15750249268501108917
Increment	18158801084572694679761040473092597769		340282366920937968670446758974038462293
Seed	31415		31415
Output1	0x2	8C57634CF3ADEA3	0x0	000000000000000
Output2	0x4	5E72764CBAC7D8A	0x1	8C57634CF3ADEA3
Output3	0x0	9F5D8F29B240248	0x0	9804DEEE6EB9D0E
Output4	0x4	8BD93FA680EE36C	0x8	9F5D8F29B240248
Output5	0x8	F2CF0B4671A5D87	0x4	2B52509EB19E6B4
Output6	0xE	57EE9B1AB50A947	0x7	F2CF0B4671A5D87
Output7	0x6	E92305B0D002F61	0xE	02E81E3F013B2B5
Output8	0x2	D513919164C9F11	0x5	E92305B0D002F61
Output9	0x2	FCAB132E105A9F2	0x1	C902225001A774F
Output10	0x6	E8A48BD7E37EE0C	0xC	FCAB132E105A9F2
Output11	0x9	A9240D1A1572F73	0x9	64DE7D8E859E804
Output12	0x6	EF0158EC1476E59	0x2	A9240D1A1572F73
Output13	0x9	34A4391489DB1D4	0x4	B66D2865E881D4F
Output14	0xB	634A42C2394888A	0xD	34A4391489DB1D4
Output15	0x8	E03CA0BB984E16B	0x5	5CC7B5271F792FE
Output16	0x5	5F6ACB2C66FDED7	0x5	E03CA0BB984E16B
Output17	0x5	CBEC3EC2C5C03DE	0x3	BF46F487533BDF9
Output18	0x7	C46F2BD428796EA	0xB	CBEC3EC2C5C03DE
Output19	0x0	39A4A7C2A084281	0xA	6A3E2D52E24926E
Output20	0x8	1A17AD6D8142476	0xB	39A4A7C2A084281
Output21	0x1	9223A7740BACC82	0x3	5A8AF9350D8645A
Output22	0x8	8DCB82922B55F8E	0x3	9223A7740BACC82
Output23	0x2	FB541FDDC8E5A86	0xE	DB016F19FE29D2A
Output24	0xC	61E8DBF6577A704	0xC	FB541FDDC8E5A86
Output25	0x8	2733F8F3CD82976	0x8	05A2C8C4F2E88FC
Output26	0x5	5A5329E50934786	0x6	2733F8F3CD82976
Output27	0xA	342EF7724025501	0x2	EBD1F90D6ADC342
Output28	0x1	A8635E212C32A2D	0xD	342EF7724025501
Output29	0x3	F87E671A19911B2	0x1	2B4DEB4213B90C8
Output30	0x7	C8BAC3F018FD61D	0x5	F87E671A19911B2

Now python randomgen + numpy

from randomgen import PCG64
g = PCG64(31415,(310562926860152171542224643668532814649>>1),variant="dxsm",mode="legacy"); g2 = PCG64(31415,(310562926860152171526474394400031705733>>1),variant="dxsm",mode="legacy");
for i in range(30): print("0x" + format(g.random_raw(), "016X") + " 0x" + format(g2.random_raw(), "016X"))

Multiplier	15750249268501108917	15750249268501108917
Increment	310562926860152171526474394400031705733>>1	310562926860152171542224643668532814649>>1
Seed	31415	31415
Output1	0xEE846300E83A27DC	0x7256D1D22864FB31
Output2	0x62B36CDF71DD1743	0x1C2239BDA2F793C6
Output3	0x1C2239BDA2F793C6	0xA25FB4606E9F9955
Output4	0x2C21FF9FA58D879B	0xA90BC860E5BC5E65
Output5	0xA90BC860E5BC5E65	0xD9A217425A7117AF
Output6	0x9B802E5C51C811D5	0x31A30A053121F011
Output7	0x31A30A053121F011	0xDF91893796BE498C
Output8	0xB6F6FCF3CC23CF14	0x27FFF0C3DFDAE842
Output9	0x27FFF0C3DFDAE842	0x5653E5F3D78910B6
Output10	0x043FFA9868AB2B22	0xC77C03211C0201CE
Output11	0xC77C03211C0201CE	0x1A2DC56C688DBDE8
Output12	0x1F3C5871B8AA1618	0x0C12CBDC34DFABB0
Output13	0x0C12CBDC34DFABB0	0xDFD2D074C5127C15
Output14	0xC7FEC3AED543EDE7	0x127120DC1B75A20E
Output15	0x127120DC1B75A20E	0xAA580CD3C44B571D
Output16	0x800C8F5DBF2D5C4B	0xD0D231D10F85BB11
Output17	0xD0D231D10F85BB11	0xB433A15370448948
Output18	0xB678DA426A582A58	0x4E5DC360D27E4857
Output19	0x4E5DC360D27E4857	0x52B4BF17EE5FB665
Output20	0x91A4127B348A8BAB	0xAD7A55AF7A1E62B2
Output21	0xAD7A55AF7A1E62B2	0xAAE73D4A877AEB9E
Output22	0xE6F6BA90233BFF2A	0x537BA3170D10A98F
Output23	0x537BA3170D10A98F	0x71778DBD0A1F1858
Output24	0x22D07B79FDA625E8	0xE03DFB31EDF18C5A
Output25	0xE03DFB31EDF18C5A	0x610809577FF8326B
Output26	0x82BA957BB887F8A1	0x5C430AB7C74B61D9
Output27	0x5C430AB7C74B61D9	0x2C63932EC021D830
Output28	0x0FD2B2B66AF18DD0	0x779C96BF20FAAC65
Output29	0x779C96BF20FAAC65	0x93C6F9F9FA2026BD
Output30	0xFCB0833058A7183F	0x5397DA646711A886

g = PCG64(31415,(328721727944725361000316204835460899641>>1),variant="dxsm",mode="legacy"); g2 = PCG64(31415,(310562926860152171527627315904638552709>>1),variant="dxsm",mode="legacy");
for i in range(30): print("0x" + format(g.random_raw(), "016X") + " 0x" + format(g2.random_raw(), "016X"))

Multiplier	15750249268501108917		15750249268501108917
Increment	328721727944725361000316204835460899641>>1		310562926860152171527627315904638552709>>1
Seed	31415		31415
Output1	0x0	70E695F12FAC872	0x3	847B59A89C311E2
Output2	0x6	C673BB08B64385F	0xB	565D0C976C04FB6
Output3	0x8	D353EED117D7BC2	0xF	C673BB08B64385F
Output4	0x9	EA14B4ED2206371	0x0	06B1E786214329E
Output5	0xB	FE0FB0D66377C3C	0xE	EA14B4ED2206371
Output6	0x8	F48FEB9E21CF90C	0xB	C9512DDD19B53D4
Output7	0x9	FE6F7ED18A90B3E	0xC	F48FEB9E21CF90C
Output8	0x8	554B31E0632BF9F	0x2	82E54E6060CFF52
Output9	0xB	D2444268D86C2E0	0xF	554B31E0632BF9F
Output10	0x6	A77908E23FC25DB	0xE	BE330DEB6411006
Output11	0x7	F5401944A241692	0xB	A77908E23FC25DB
Output12	0x3	78E7CFCA99985EC	0x6	1E2C802A1208A6E
Output13	0xE	B3C49C3FC1B80BD	0xF	78E7CFCA99985EC
Output14	0x0	CE124E21F2F325C	0xA	F5923BC63DFE91F
Output15	0xF	2DAAFA5A3C0D13C	0x4	CE124E21F2F325C
Output16	0x1	4B999F605D4C93B	0xD	00E469E490FE6A4
Output17	0x3	4CAE614E3FBF077	0xC	4B999F605D4C93B
Output18	0x3	BA4BA78F570240F	0xE	5AFD46A52CC2195
Output19	0xD	935DBA5BB7508AA	0x6	84B89E8D66E1609
Output20	0xE	B3D1F1A9262C709	0x7	8C2397180142176
Output21	0xB	7AEF7982643BADC	0xB	B3D1F1A9262C709
Output22	0x1	C258C8417669CD1	0x2	542033E81E98834
Output23	0x4	81A95DC0F79E1A6	0xC	C258C8417669CD1
Output24	0x1	227C217600F6ECC	0xF	3E4B6B862CC276A
Output25	0x6	F0860C9B8E5E021	0xD	227C217600F6ECC
Output26	0xC	72BDAFE8A09D355	0x5	3D268E5FE4D6083
Output27	0x2	69759DA86213109	0x7	72BDAFE8A09D355
Output28	0x8	46C4CD6EC69A47D	0xB	D9032549CBE8217
Output29	0xE	3FC3F0A98C99CF1	0x1	46C4CD6EC69A47D
Output30	0xC	7C7EC7A33D0292F	0x3	4D9727EA1FBECFB

An example of the upper part of the LCG in PCG64-DXSM being modified with identical seed in numpy/randomgen

g = PCG64(31415,(292528383318202715988945170563770901305>>1),variant="dxsm",mode="legacy"); g2 = PCG64(31415,(140421743439297021051919259480919575173>>1),variant="dxsm",mode="legacy");
for i in range(30): print("0x" + format(g.random_raw(), "016X") + " 0x" + format(g2.random_raw(), "016X"))

Multiplier	15750249268501108917		15750249268501108917
Increment	292528383318202715988945170563770901305>>1		140421743439297021051919259480919575173>>1
Seed	31415		31415
Output1	0x97F9738805252	FBF	0x3A48F167D2DF3	6B7
Output2	0x964B11CABAA4E	501	0x88DECF43C2B52	E8D
Output3	0xB6AC157505409	594	0x73D73CEDB2296	501
Output4	0xCEC2A042E362F	2A4	0xE36B6340C1CDC	FC3
Output5	0xD0A2096F73ED9	41A	0x28A909ED3D7C7	2A4
Output6	0x7127C22885CC5	4A3	0x2F3751079E58B	CB4
Output7	0x90A13F2459C9B	6CE	0x5EB33E911942C	F94
Output8	0x496FE1695C009	C55	0xE05801B1AA7F2	AF6
Output9	0x80126568D4AA5	FDF	0x8204715BB013E	169
Output10	0x85BC9BD98E5DE	FA4	0x53456ECD3C705	21D
Output11	0xA7380BD92D338	178	0x901391E7BC83E	69A
Output12	0x3E324902274F1	0C2	0xC1698DE771445	A88
Output13	0xA5584406CB702	00A	0x54EC4E180FAC9	0C2
Output14	0x736CBD359BEE5	8B5	0xF43324BCA3F47	26E
Output15	0x7F72C7D6A6C6B	778	0x48C873AED7047	0E2
Output16	0x7CAD149EE868F	197	0x6F3A19369FE36	048
Output17	0x30B0B3F452E1F	475	0x31B0038A96D97	197
Output18	0x36514492F155F	FD8	0x2586DFDD55D21	2EF
Output19	0x0CE19FA5F24DA	C42	0x88058C6DBD897	FD8
Output20	0xE5C9DECAE4A12	C0D	0x5040EC24F3386	95E
Output21	0xE43F8183F92C1	450	0x064D651C1466D	7C3
Output22	0x4E84BB152C42A	D6C	0x36D952B70BFA0	870
Output23	0xA4CD58BA22302	BB5	0x3C74E5804A742	D6C
Output24	0x360F2AB984C9C	065	0x33E5305BEC81D	936
Output25	0x50B420A99712E	267	0x0E3141A4260BF	B79
Output26	0xD5F0C6071AE99	AEB	0xA9AEDA617F487	C03
Output27	0x8354DDF7179B0	638	0x4027BB1478F1E	C11
Output28	0xCE21CF50C0F0E	D3F	0xAA188AB00050F	05E
Output29	0xD08A6974E1276	809	0x21370D1D12353	313
Output30	0x2D33EC6185398	A0C	0x5F439A9086378	B2C

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 18:08

Comments (5)

Breaking pcg-dxsm (no implementation)

All PCG implementations seem to allow more or less easy computation of the seed, but a new PCG implementation has appeared in 2019 or so.
PCG-DXSM, the source code tells us “This is a new, more powerful output permutation” and “This latter aspect means that the scrambling applied to the high bits depends on the low bits, and makes it (to my eye) impractical to back out the permutation without having the low-order bits.”
Ok. lets try
Heres for reference, what this is about:
multiplier is a odd 64bit constant, SHR1 is half the state bits, SHR2 is quarter the state bits and SHR3 is 3 eights of the state so 64, 32, 48 for the 128/64 bit variant and 32, 16, 24 for the 64/32bit one. inc is forced to be odd too.

    state = state * multiplier + inc;

    l1 = state | 1;
    h1 = state >> SHR1;
    h2 = h1 ^ (h1 >> SHR2);
    h3 = h2 * multiplier;
    h4 = h3 ^ (h3 >> SHR3);
    return h4 * l1;

l1 is forced to be odd as we can see, so what do we know if pcg_dxsm returns 0 ? When is X * (2Y+1) = 0 ? Yes when X is 0 so h4 has to be 0 (yes i jump over some details of rings) and what can h3 be ? only 0, and h2 ? again only 0 and h1 ? yes its 0.
So when pcg_dxsm() return 0 you just learned half its internal state.
This can be extended a bit, for example when it returns a odd number h4 must be odd, when it returns a number that contains a factor of 2^x then so must h4.

Can we compute the seed and increment from this ? I did not try, but i think so.
Just consider this: we decide how many bits we want to look for in the data and we brute force the rest. (there quite possibly could be an easier way, thats just the obvious one)
if we want 16bits to be from the data we would look for about 100k output data values, look for the values that are multiplies of 65536, these points would have h4 a multiple of 65536 and thus the low 17bits of h4 would be known. If we now brutefoce the 32-17 that is 15 low bits of the state and increment, these together fully determine, first all the low 15bits of all states and together with that the cases where we know the low 17 bits of h4 we now can invert the multiply and find the rest of h4. For each known h4 its trivial to find the upper half of the state. And the rest is identical to finding the state and increment of a truncated LCG and after the upper bits are found the lower can be trivially found from the h4*l1.
As i have not implemented this it is very possible that i have missed something or made some mistake.

Btw i think PCG and other PRNGs are really great little math exercises. They are also good PRNGs as long as one understands their limitations.

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 02:44

Comments (0)

2024-01-12

SFC64

As i was looking for the best PRNG ;) i found SFC64, its fast, seems to pass all tests, and it was fun to find a way to compute the seed from a few output values.
But the more one looks the more one finds, there are many other great PRNGs. I would stay away from hyped up generators, like the mersene twister was in the distant past.

Now about SFC64, heres a simple implementation. It can also compute values backward.

And for fun, heres an implementation that will turn 10+ values back into the 3 seeds and counter used to create them.
The way this works is that first we guess the 16-19 MSB of the a or b register. Next we assume that the corresponding MSB of the counter are 0. Because its 64bits, thats reasonable and pretty difficult to be otherwise. From this and a = b ^ (b >> 11) and tmp = a +b + counter++; we can estimate teh MSBs of the future and past a and b registers (they slowly become less exact due to the counter but thats insignificant over the distance we need them)
For the 8 rotation states we then invert b = c + (c << 3); for the MSBs we have (which has 9 solutions and iterating over them gives us 3 extra bits knowledge of c). The 8 rotation states are iterated over in 2 forward steps of 3 followed by a backward step of 5. This way we minimize the distance we move away from the start point which limits effects of the unknown LSBs and also limits the needed amount of data. In addition the 3,3,-5 pattern splits the previously determined part of the c register so that the lower part of the MSB becomes shifted into the MSB. This allows us to check the lower part of our data from the c register together with the MSB of the b register at that point and the known output data. This is crucial to limit the otherwise exponential increase of computations with each part of c that is iterated over. Towards the end of the c register, we then compute the a,b and counter from it and the known output and if it matches, backstep to counter =1 to output the seed at the start. It would also be possible to use a heuristic in guessing the initial a/b values, so as to not search the 17-19 bit space but it seems the bad ones get rejected naturally quickly so this lead to little gain and its already quite fast.
I do realize after writing above that this description is probably hard to understand. I should expand this and draw diagramms, but ATM i have no time so please read the source :)

Example: of finding the seeds after 1000 rounds

./sfc64 0x123456785 0x978653 0xCCAADDEE0F 1000 30 16 
0x8A9912C022EA7402
0x26971AE36A1CEACA
0x97B4CF575274B9F4
0xC9212C12EEE58E49
0x9B45979745BB2E20
0x857B5EEB83507D6D
0x243E719FF5356874
0xE6EB4E22DDDEDCA6
0xAF0DC4AF0234A763
0x9946B8BD21C71BE9
0x14D42990A607161B
0x66A19FF0771F930C
0xE3E580CA96D08FD3
0xDAED73C40D44E397
0x90A0E96731283123
0x183075440F78DA33
0xE46ACCC1AECD9C6C
0x1CA82D6053C47933
0xC707C633E34377CA
0xF60FC6DB95DE2631
0x0B2342552B1B4912
0xC32F27E766B148FD
0x055FDF9DC44345CF
0xADFD892D38628512
0x5AA6881F0C6E3ADE
0x72FD99BE0899B07C
0x6F31D2773AB970F6
0xC07850CC6EFD7868
0x5EC7E6C020A407CA
0xF048FCEDEA87BE0A


time ./sfc64-breach  0x8A9912C022EA7402 0x26971AE36A1CEACA 0x97B4CF575274B9F4 0xC9212C12EEE58E49 0x9B45979745BB2E20 0x857B5EEB83507D6D  0x243E719FF5356874 0xE6EB4E22DDDEDCA6 0xAF0DC4AF0234A763 0x9946B8BD21C71BE9 0x14D42990A607161B 0x66A19FF0771F930C 0xE3E580CA96D08FD3 0xDAED73C40D44E397 0x90A0E96731283123  0x183075440F78DA33 0xE46ACCC1AECD9C6C 0x1CA82D6053C47933 0xC707C633E34377CA 0xF60FC6DB95DE2631 0x0B2342552B1B4912 0xC32F27E766B148FD 0x055FDF9DC44345CF 0xADFD892D38628512 0x5AA6881F0C6E3ADE 0x72FD99BE0899B07C 0x6F31D2773AB970F6 0xC07850CC6EFD7868 0x5EC7E6C020A407CA 0xF048FCEDEA87BE0A
attempt 0
Found (a=0x0000000123456785 b=0x0000000000978653 c=0x000000CCAADDEE0F counter=0x0000000000000001 original_counter=1002
step 5121675

real	0m0.083s
user	0m0.071s
sys	0m0.012s

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 02:09

Comments (0)

2024-01-08

Finding the increment of PCG/LCG/truncated LCGs and some (for me) surprising consequences about their structure.

For plain LCG the problem is trivial and an implementation shown in my last post. PCG falls in 2 categories, the PCG-XSL-RR which is based on an LCG, this was solved in the Paper by Bouillaguet, Martinez and Sauvage.
The remaining PCG generators are different and simpler, they are based on an truncated LCG. The problem of finding an increment and seed is marginally harder as it is for the corresponding truncated LCG. It can be done using Lattice reduction (even some half bastardized version seems to work). The problem and the surprise for me really is that the solution in general is not unique. I guess it was a surprise after reading in the original PCG paper how additional random streams can be created by changing the increment. This is really not true. Changing the increment does not really produce “different” random data, it just offsets it.

Let me first show an example with a plain LCG (using the multiplier 6364136223846793005 from PCG with 64bit state)

Seed	314159	314160
Increment	1442695040888963407	13525302890751722019
Output0	1758213515780721042	1758213515780721043
Output1	8369501526408240633	8369501526408240634
Output2	14286244372924288276	14286244372924288277
Output3	12908628925188209107	12908628925188209108
Output4	17643264774698309734	17643264774698309735
Output5	10761879962653451581	10761879962653451582
Output6	11615400545074122760	11615400545074122761
Output7	5243018629741398711	5243018629741398712
Output8	1055069346400300154	1055069346400300155
Output9	14628082432468752577	14628082432468752578

This is not a odd exception, no, for every choosen LCG and choosen offset (here it is +1) theres a seed and increment that will produce a offset sequence of random numbers.
You can use lcg-breach to find them by just asking it to find parameters for a modified sequence.
Now thats LCGs, What about truncated LCGs ? its worse because they discard the low bits. You get sequences that can be exactly equal for MANY values even with different increment and seed. And that is what the problem is in determining the increment for truncated LCGs and the corresponding PCGs. For PCG-XSL-RR the low LCG bits are XORed in so there remains a effect on the output and its possible to determine the increment but for 128/64 and 64/32 PCG-XSH-RR the low bits of the state do not affect the output as long as their effect stays limited to these low 27bit (for 64/32 bit) in future states they cannot be computed from the output. Or in other words the output from 2 PCG-XSH-RR with differing increments can be identical for a long time.

On top of that, there is also a set of seed,increment pairs that produce constant output.
consider this:

LCG         : out[i+1] = out[i] * m + a mod n
what we want: C        = C      * m + a mod n

just solve it:

C - C * m = a mod n

choose the constant C you want and this will tell you the increment a that will keep the generator at that constant when its started on it.

and how do we now create a offseted LCG ?
simply add it with the constant above

out[i+1]     =  out[i]      * m + A     mod n
           C =           C  * m +     a mod n
out[i+1] + C = (out[i] + C) * m + A + a mod n

And yes LCGs of the same multipler and modulo seem to form an abelian group with addition.

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 18:06

Comments (0)

2024-01-06

Breaking LCGs

Just for fun, heres a program to generate random numbers using a Linear Congruential Generator from a specified multiplier, increment, seed and number of bits.
And a second one that will tell you the multiplier, increment, seed for a given 3 output numbers from above.

Have fun encrypting important messages

./lcg 64 6364136223846793005 1442695040888963407 314159 5 1758213515780721042 8369501526408240633 14286244372924288276 12908628925188209107 17643264774698309734
$ ./lcg-breach 64 1758213515780721042 8369501526408240633 14286244372924288276 Seed: 314159 0x000000000004CB2F Multiplier: 6364136223846793005 0x5851F42D4C957F2D Increment: 1442695040888963407 0x14057B7EF767814F

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 01:00

Comments (0)

2024-01-04

Breaking the PCG-XSH-RR PRNG

How did this start ? Someone found a bug or 2 in the libavutil/eval.c random(), so i proposed a patch replacing it by Marsaglias 64-bit KISS PRNG. Nicklas Haas then ask me why iam not using PCG yesterday. And i didnt know really what PCG is :) so i looked found the paper found a wegpage that was down the exact day i tried to look at it (yeah its up again now). And a wikipedia page taking about “It has been shown that it is practically possible (with a large computation) to recover the seed of the pseudo-random generator given 512 consecutive output bytes.”. That triggered me a bit ;)

The paper about PCG is 58 Pages so reading that is not an option. But we skim it and look at the pretty pictures, theres one that marks several PCG variants as insecure. Luckily the one listed at wikipedia (PCG-XSH-RR is not marked as insecure) In fact there are 2 PCG-XSH-RR, one with 64bit state and 32bit output and one with 128bit state and 64bit output. So i tried them. Theres also a paper about breaking PCG-XSL-RR which differs by some shifts it seems from XSH. That paper talks about also recovering the increment and that for that it needs “512 bytes of challenge input;in the worst case, the process takes 20 000 CPU hours” ugh, i dont have that. Luckily we know the increment as its not a variable in the implementation and with that, the paper tells us “breaking XSL-RR needs 23 core minutes”. But we took XSH-RR because it was on wikipedia :) and the original paper also recommends it

6.3.1 32-bit Output, 64-bit State: PCG-XSH-RR
Here the design goal is to be a good all-purpose random number generator. The intent
is to balance speed with statistical performance and reasonable security, charting a
middle-of-the-road path. (It’s the generator that I recommend for most users.)”

So we are not totally wasting our time but we can have fun exercising breaking a “not insecure” PRNG with “reasonable security”.

A few hours later, am i lucky ? am i smart ? or is the PCG-XSH-RR a insecure generator, i dont know, I think i maybe picked the easier one but still. These following need 3-4 output values from the PCG-XSH-RR to find and verify its state, from which all future and past output can be computed.
For the 64/32bit variant we compute all tables at runtime, for the 128/64 variant we hardcode the 152 entry table needed as it needs a few minutes to be computed. But the table doesnt depend on the seed nor on the increment, just a new multiplier would need a new table.
The actual execution for both including loading the program, generating the random numbers finding the seed again and printing it all together takes less than 100ms
Code is here

time ./pcg-xsh-rr-64-32-breach $((0x1234578ACDEF11)) INV 0xC097EF87329E28A5 0x1 gcd=0x1 Seed = 0x1234578ACDEF11 PCG output 0 = 0x90DA0D79 PCG output 1 = 0xCF0EAEE6 PCG output 2 = 0x423D36BE PCG output 3 = 0xAE152565 PCG output 4 = 0x2C48B4D4 PCG output 5 = 0x9EE6CC50 PCG output 6 = 0x903FC56A PCG output 7 = 0xC85B3AED PCG output 8 = 0x87B29579 PCG output 9 = 0xA486F671 Table 0 0x00000000000000005851F42D4C957F2D 0x0000000000000001 Table 1 0x000000000000000008F5DC87E5C07D87 0x0000000000000003 Table 2 0x00000000000000000148A921ACEF6819 0x000000000000001D Table 3 0x00000000000000000006C363D4CB5B28 0x00000000000000C8 Table 4 0x00000000000000000002BCFA0DFD0A8F 0x000000000000262B Table 5 0x00000000000000000001738A552BC485 0x00000000000071B9 Table 6 0x000000000000000000002A1A9C5A7E7B 0x000000000000BD47 Table 7 0x0000000000000000000007652A02ADCE 0x00000000000635C6 Table 8 0x0000000000000000000002445FB59459 0x000000000024855D Table 9 0x0000000000000000000001AC54D3A396 0x00000000008BDFAE Table 10 0x00000000000000000000011449F1B2D3 0x0000000000F339FF Table 11 0x00000000000000000000007C3F0FC210 0x00000000015A9450 Table 12 0x000000000000000000000060733D935D 0x00000000031C82F1 Table 13 0x000000000000000000000044A76B64AA 0x0000000004DE7192 Table 14 0x000000000000000000000028DB9935F7 0x0000000006A06033 Breaching spin:......................................................................................................................................................................................................................................................................................................................... Candidate 0x48EAFC5649EE9492 for 90DA0D79 CF0EAEE6 423D36BE found Seed = 0x1234578ACDEF11


real	0m0.074s

user	0m0.074s

sys	0m0.000s

And the PCG-XSH-RR-128/64 one:
time ./pcg-xsh-rr-128-64-breach $((0x1234578ACDEF11)) INV 0x98ABC8B0716EAC8D 0x1 gcd=0x1 Seed = 0x1234578ACDEF11 PCG output 0 = 0xE49F66304A892198 PCG output 1 = 0x58C80AA8C2F10A26 PCG output 2 = 0xD2B380859B808368 PCG output 3 = 0x60715FDB8EE68956 PCG output 4 = 0xB0D0AAF3073F2EDA PCG output 5 = 0xC5876DE56C99EDB3 PCG output 6 = 0x68371CCDC061F3D PCG output 7 = 0x7118EC9199028ED5 PCG output 8 = 0xAC0DAAF55D9A009C PCG output 9 = 0x709AD75080EDE32E


Breaching spin:.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Candidate  0x5928FB686BFAEF01 for E49F66304A892198 58C80AA8C2F10A26 D2B380859B808368...............................................................................................................................................................................................................................................................................................................................................................................................................................................

Breaching spin:...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

Candidate  0xF0BA68A1B616E6B1 for 58C80AA8C2F10A26 D2B380859B808368 60715FDB8EE68956

found Seed = 0x1234578ACDEF11

real 0m0.089s user 0m0.069s sys 0m0.020s

Maybe its a great PRNG, i dont know i did not look at that aspect, but its not secure if i can break it after a few hours work :)

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 21:58

Comments (0)

2023-05-18

Hardware password managers, accelerometers and random data

I was looking for a way to securely manage my passwords, anything storing my passwords with a 3rd party or all accessible to my computer fail IMHO. Also it had to be practical, something limited to 5 passwords is not. Be convenient, something where one would have to search for a password in a list by hand fails that. And it needed to have a screen so one knew what one approved. A simple one button USB stick does not qualify either as one does ultimately not know what one actually approved. The only device i could find was the mooltipass. So i bought one. It is a cute little thing storing all passwords both securely and also conveniently. And being a nerdy geeky person, i of course had to play with and analyze it. The first thing that sprang out to me was its random number generator

Can we exploit it ?

So i looked at its random number generator until i found something. It uses the 2 LSB of each of the 3 axis of an LIS2HH12 accelerometer. This generator was tested with the DIEHARDER battery of tests by its creator before me. I was at first not able to run the DIEHARDER tests because they needed more random data than the device could generate quickly. I tried various tests, simple things like simply trying to compress the data showed no anomalies. The first anomaly i found, i believe was through looking at the frequency spectrum in audacity. To my eye it looked like there was a bias between high and low frequencies. The next step was checking the correlation of various bits. And indeed when one looked at 2 bits and the previous 2 bits from the same channel They where equal 3% more often than they should be. I guess i could have stoped here, but i didn’t :) So i looked at the distribution of matching bits, (3% of these shouldn’t be there and we dont know which of 100 are the 3 bad ones). These 3 could be randomly distributed of course. But by now i had enough data to run some of DIEHARDER and while most tests passed, some of its tests failed for me. I have to say though little things in the command line of DIEHARDER can lead to unexpected results.

The original findings on the 3% anomaly

I simply counted the number of times each 2bit matches the previous
2bits of the same channel and this occurs about 3% more often than it should.

With 1 mb of data:
Channel 0 [1.026829 : 1.980986 : 0.992185]
Channel 1 [1.031329 : 1.978283 : 0.990388]
Channel 2 [1.039171 : 1.974176 : 0.986653]
Average   [1.032443 : 1.977815 : 0.989742]
All 3     [0.968019 1.010415 1.052310 1.111489]

/dev/random
Channel 0 [1.000765 : 1.998530 : 1.000705]
Channel 1 [0.997429 : 2.001644 : 1.000927]
Channel 2 [0.997357 : 2.003375 : 0.999268]
Average   [0.998517 : 2.001183 : 1.000300]
All 3     [1.001617 0.999233 0.997958 0.995425

This allows relatively reliably distingishing the mooltipass random numbers from /dev/random with 10kb of data

Periodic anomalies

Looking again at the randomdata. The 3% extra repeats within a channel occur 32 samples apart (that is 24 bytes in the stream or 192 bits). These locations sometimes shift around but preferably occur at indexes 31,63 and 95 of the 96 sample block. When such a run of anomalies breaks from teh same index, the new index tends to be the next in the set {95,63,31}. With these patterns it is possible to reliably distinguish as little as 100-200 bytes from random data. That said, the randomness of this is still plenty for a password and the average human would be way worse generating random data. Care though may need to be applied if this is used for other purposes than passwords. For example DSA signatures are notorious for being sensitive to biases in the used random number generator. I reported the anomalies in the RNG in January of 2023. It was fixed on Apr 18th 2023 with 49359dfc52cdfe743000ac512092085328d5f15b. Software to detect the specific pattern reliably as well as 2 small test samples is availble at randomtests

The original findings on perodic anomalies

Heres how this compares to /dev/random
dd if=/dev/random of=/dev/stdout count=1 bs=1000 status=none | ./mooltitestwalker
mooltiness:     0.89 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=10000 status=none | ./mooltitestwalker
mooltiness:     0.25 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=100000 status=none | ./mooltitestwalker
mooltiness:     0.04 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=1000000 status=none | ./mooltitestwalker
mooltiness:     2.99 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=10000000 status=none | ./mooltitestwalker
mooltiness:     0.08 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

and now the random data from mooltipass

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=1000 status=none | ./mooltitestwalker
mooltiness:     3.16 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=10000 status=none | ./mooltitestwalker
mooltiness:     4.21 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=100000 status=none | ./mooltitestwalker
mooltiness:    12.91 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=1000000 status=none | ./mooltitestwalker
mooltiness:    37.46 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7

dd if=~/mooltirandom.bin-copy5 of=/dev/stdout count=1 bs=10000000 status=none | ./mooltitestwalker
mooltiness:   115.95 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

As you can see with just 1000 bytes we are already more than
3 standard deviations away from random data. And after that it very quickly
becomes something that a random number generator would not generate in the lifetime of
the universe not even if you fill the whole universe with random number generators should
you see this sort of statistics once

Heres the test results for 100,200,300,400,500 bytes with mooltitestcycler

for i in `seq 100 100 500` ; do dd if=~/mooltirandom.bin-copy of=/dev/stdout bs=$i count=1 status=none | ./mooltitestcycler ; done
moolticycles:     3.00, this or larger is expected 0.27 % of the time in random data.
moolticycles:     4.58, this or larger is expected 0.00046 % of the time in random data.
moolticycles:     6.00, this or larger is expected 2E-07 % of the time in random data.
moolticycles:     6.93, this or larger is expected 4.3E-10 % of the time in random data.
moolticycles:     7.75, this or larger is expected 9.5E-13 % of the time in random data.

Same but with another testsample to make sure this is not a one off

for i in `seq 100 100 500` ; do dd if=~/mooltirandom2.bin of=/dev/stdout bs=$i count=1 status=none | ./mooltitestcycler ; done
moolticycles:     3.00, this or larger is expected 0.27 % of the time in random data.
moolticycles:     4.58, this or larger is expected 0.00046 % of the time in random data.
moolticycles:     6.00, this or larger is expected 2E-07 % of the time in random data.
moolticycles:     6.93, this or larger is expected 4.3E-10 % of the time in random data.
moolticycles:     7.75, this or larger is expected 9.5E-13 % of the time in random data.

In fact i notice now the results are exactly the same, interresting
cmp  ~/mooltirandom2.bin ~/mooltirandom.bin-copy
/home/michael/mooltirandom2.bin /home/michael/mooltirandom.bin-copy differ: byte 1, line 1


heres the control test with /dev/random
for i in `seq 100 100 500` ; do dd if=/dev/random  of=/dev/stdout bs=$i count=1 status=none | ./mooltitestcycler ; done
moolticycles:     0.00, this or larger is expected 1E+02 % of the time in random data.
moolticycles:     0.31, this or larger is expected 76 % of the time in random data.
moolticycles:     0.58, this or larger is expected 56 % of the time in random data.
moolticycles:     0.44, this or larger is expected 66 % of the time in random data.
moolticycles:     0.77, this or larger is expected 44 % of the time in random data.

and a bigger sample to show the runs:

dd if=~/mooltirandom.bin-copy5 of=/dev/stdout bs=2000000 count=1 status=none | ./mooltitestcycler
Run 1731 at phase 31
Run 7496 at phase 95
Run 3323 at phase 63
Run  196 at phase 31
Run  195 at phase 95
Run 3540 at phase 63
Run 2459 at phase 31
Run  777 at phase 95
Run  775 at phase 63
Run  195 at phase 31
Run  583 at phase 95
Run  778 at phase 63
Run 1585 at phase 31
Run 1126 at phase 95
Run  971 at phase 63
Run 5047 at phase 31
Run 7141 at phase 95
Run 1741 at phase 63
Run 4643 at phase 31
Run  197 at phase 95
Run  577 at phase 63
Run  197 at phase 31
Run 2907 at phase 95
Run 4040 at phase 63
Run 1572 at phase 31
Run  196 at phase 95
Run 2903 at phase 63
Run 1355 at phase 31
Run  195 at phase 95
Run  196 at phase 63
Run  584 at phase 31
Run  143 at phase 95
Run   51 at phase 63
Run  195 at phase 31
Run 3099 at phase 95
Run  196 at phase 63
Run 1164 at phase 31
Run 2445 at phase 95
Run 2589 at phase 63
Run 4553 at phase 31
Run 7145 at phase 95
moolticycles:   499.67, this or larger is expected 0 % of the time in random data.

Shaken not Stirred

A 2nd issue i found and reported on 7th feb 2023 is that when the device is violently shaken, the accelerometer saturates and clips at 2g, this makes the 2 LSB of the affected channel(s) be 00 or 11 more often than expected in a random stream of data. So please dont use the mooltipass while doing high G maneuvers in a fighter jet or any other activity that subjects it to high g forces. Also with some effort one can shake a 3 axis accelerometer so that all 3 axis clip at the same time. There also may be a delay between the generation and use of random data. Ideally when the full 16bit sample clips it should be discarded and not used. While discarding one kind of sample that was equally frequent, introduces a bias, the clipping cases are not equally frequent. They either do not occur at all in a still environment or occur disproportionally frequent.

Random now?

Except these, i found no further flaws in the random data. Personally i would recommend to use some sort of hash or other cryptography to mix up the accelerometer bits. Heating or cooling (in my freezer with long USB cable) the device did not introduce measurable bias and also feeding ~60 mega byte stream of data into the PRNG NIST-Statistical-Test-Suite did not show any anomalies after the fix. Of course one can only find flaws in such data streams, never proof the absence of flaws. Also it must be said that i could not find any reference by the manufacturer of the chip to randomness of the LSBs. So while AFAIK many devices use accelerometer data as source of random data, there seems no guarantee that a future revision of that chip doesn't produce less random bits. What i can say, is thus just about the device i looked at and for that, the random data is much better than what a person would generate by "randomly" pressing keys. People are very biased in what characters they use in passwords even when they try to be random. Also passwords are generally too short for this anomaly to allow distinguishing 1 password from another with a unbiased RNG. A password that one believed contained n*192 bits of entropy really only contained n*190 before the fix. All this assumes the device is not violently shaken while used.

Audacity spectra

Took me a bit to find and restore the original spectrum i saw. While doing that i also noticed that using signed 8bit results in a spectrum biased the other way. The original ones are unsigned 8bit.


mooltipass pre fix spectrum	/dev/random control spectrum	mooltipass post fix spectrum

Audacity spectra with more data and in signed 8bit ints


recreated mooltipass pre fix spectrum	/dev/random spectrum	mooltipass post fix spectrum

Dieharder testsuite

Ill write a separate article about dieharder later, the thing is finicky and this is already quite complex

Updated this article multiple times (last on 2023-05-24) to include more details, pictures and minor corrections
Full disclosure, i was offered a reward for finding and reporting the bug.

Filed under: Cryptanalysis,Hardware,Off Topic,Security — Michael @ 22:52

Comments (0)

2008-11-30

Pseudo random number generators 2

After the little bit of analysing PRNGs in the last blog post, which PRNGs exactly seem good?

Fastest PRNGs

If one either needs a very fast PRNG or one has some means to easily test that the output is good enough (like looking to see if a noise filter produces no vissible aritifacts) then folllowing could be considered.
Linear congruential generators, Additative lagged fibbonacci generators or even just an array of random numbers that gets used repeatedly, …
Each of these has its serious flaws, LCGs have very poorly performing LSBs, that is the lowest n bits never have a period larger than 2ⁿ. ALFGs contain number triplets (the ones spaced like the taps used in the PRNG) that when used as coordinates in a cube would all fall in a single plane. Still these generators are often good enough, and have been used for a long time and are widespread, that is all C libs ive seen use either of them for rand() for examle.

Good and fast PRNGs

If one wants a generator free of big flaws and that is still reasonably fast then the number of choices goes down, first there is
4 tap additative LFG (x_i = x_i-55 + x_i-119 + x_i-179 + x_i-256 mod 2ⁿ) this one passed all of testu01 but has a obvious linear dependancy between its values.
Combined generators, these generators combine the output of several simple generators and generally pass all tests, many of them though are slow like CLCG4 which combines 4 LCGs, combined generators like f_i = g_i + h_i also have the flaw that f_i – f_i+p = h_i – h_i+p when p is the period of g. This also means that f_i – f_i+p – (f_i+q – f_i+p+q) = 0 when p and q are the periods of h and g. Thus all combined generators which are made of constituent generators with “bruteforceable periods” are cryptographically likely very weak. A combined generator that is reasonable fast is KISS99.
Multiplicative lagged fibonacci generators, like for example (x_i = x_i-55 * x_i-24 mod 2ⁿ) these generators also tend to pass all tests as long as only the most significant bits are used, that is for example the top 32 bit of a 64 bit generator. The least significant bit of such a generator is always 1, its next bit behaves like a LFSR identically to the least significant of an additative or xor based LFG. One trick to get rid of the least significant bit without ending up with just 31 or 63 bits is to use 2*a*b+a+b instead of a*b, this works because a and b are (have to be) odd in a normal MLFG and (2*A+1)(2*B+1) = 4AB + 2A + 2B + 1

Best but slow PRNGs

“Anything” based on AES, SHA-1, MD5, …

Bad PRNGs

These are PRNGs that are both slower and worse in terms of their output than others, examples are
Mersene twister, becuase its output is purely based on the XOR of previous bits thus being not nearly as good as some pretend it to be while it also isnt nearly as fast compared to other properly implemented PRNGs.
Combined generators with lots of modulo by non power of 2, these are slow …
Weak generators from which many output values are thrown away, again these are slow …

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 02:08

Comments (13)

2008-11-22

Pseudo random number generators

I guess most developers have once or more than once run into the question of, “which PRNG should one use”. The awnser, or more precissely my awnser to this, would be it really depends on for what one plans to use it.
There is no perfect generator, there are fast ones, and there are good ones, but the best arent the fastest. One has to choose depending on ones needs

If one wants to generate white noise for a human listener then the best generator simply is the fastest that produces perfectly sounding white noise, it really doesnt matter if it has some statistical defects or not.
If one wants to use the random numbers for cryptographic purposes, a perfect generator that has been extensively tested and no flaws found is needed.
If one wants to test a scientific theory by simulation, one needs a generator that doesnt cause a wrong result from the simulation, but as the correct result isnt known, one cant easily pick based on this criteria. Thus one has little alternative to picking a generator that has few statistical defects and is fast enough for the amount of numbers needed. Or better even run the simulation 2-3 times with very different generators

So which generators are there and what defects do they have? This actually is rather easy to awnser, or then maybe not ;) well, there is George Marsaglias Diehard and Pierre L’Ecuyers
TestU01 both contain code to test PRNGs, later also contains a paper describing the results of these tests for most recent and popular PRNGs.

In an ideal world one would just have to look at the TestU01 paper And pick a generator that has the amount of defects and speed one wants. At least thats what i thought before testing a few generators myself, more precissely i took a few of the generators that passed all tests in TestU01 and run 2 (to me obvious) tests against them

Use a gaussian like algorithm to find out if any bits in the output are just a linear (mod 2) combination of previous output bits
Use a gaussian like algorithm to find out if any scalars in the output are just a linear (mod maxoutput+1) combination of previous output scalars
Use a gaussian like algorithm to find out if any bits (only considering the least significant of each scalar) in the output are just a linear (mod 2) combination of previous such output bits

The first 2 of these tests in the way i implemented them only consider outputs surrounding a scalar with 20 zero LSBs, though as far as i tested it this 20bit trick only made a difference for the additative LFG with only the 32MSB used, but i did not test all without this 20bit check

Failing either of these tests makes the PRNG at least unsuitable for linear algebra under the same modulo, so its certainly a statistical defect. All linear congruential generators have to fail at least the test that works in their own modulo, after all they are linear, but actually all i tested failed both tests. Lagged fibonacci generators based on addition or xor similarly must fail and do. All multiplicative Lagged fibonacci generators fail as well due to linear dependancies in their lower bits. All linear feedback shift register based generators like the mersene twister very obviously have to fail too, i did not test any of them though as none passed TestU01.

The actual generators that passed TestU01 and failed mine where:
superduper64, Marsa-lfib4, DX-47-3, MRGk5_93, Lfib(2^64,55,24,*), brent-xor4096s

The actual generators that passed TestU01 and passed mine where:
~~brent-xor4096s,~~ MRG31k3p, CombMRG96, ran2, CLCG4, KISS99

Note1, there are more generators that passed TestU01 but i did not test them

Note2, The first 2 tests where only searching for linear dependancies within 512 consecutive scalars, that is 32kbit for a 64bit PRNG, the 3rd test was considering 32kbit of LSBs

In addition to that, limiting the output to just 32MSB of 64 made the multiplicative LFG 2^64,55,24,* pass mine while the same trick with a additative one was not helping, KISS99 also passed when SHR3 or CONG was droped but not when both or MWC alone was droped

No source code this time though because i randomly copied PRNGs from various places, but if there is interrest i could throw all but my own code out and post that

To be continued when iam less tired … ;)

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 01:56

Comments (10)

2024-03-17

Some basic and known concepts about LCGs to better illustrate their Structure

Summary of what we currently know about the correlations

Offsets in the low half

Collisions in the first xor

Collisions in the 2nd xor

Some actual tests and software to quantify these specific correlations

A hypothetical example showing that this can be a real issue under the right/wrong circumstances

A workaround

A solution

Final words

More issues

2024-01-24

Correlations based on the lower half of the LCG

Correlations based on the upper half of the LCG

Correlations with equal seeds

Now python randomgen + numpy

An example of the upper part of the LCG in PCG64-DXSM being modified with identical seed in numpy/randomgen

2024-01-12

2024-01-08

2024-01-06

2024-01-04

2023-05-18

Can we exploit it ?

The original findings on the 3% anomaly

Periodic anomalies

The original findings on perodic anomalies

Shaken not Stirred

Random now?

Audacity spectra

Audacity spectra with more data and in signed 8bit ints

Dieharder testsuite

2008-11-30

Fastest PRNGs

Good and fast PRNGs

Best but slow PRNGs

Bad PRNGs

2008-11-22

Links