Lair Of The Multimedia Guru

2024-03-17

Correlations in randomly initialized PCG-DXSM

In the previous blog entry I showed many bad correlations in PCG-DXSM, but the cases had always at least 2 values non random either a seed and an increment or 2 increments. And even though one could create more than a septillion of these for each seed/increment. Thats not how PCG-DXSM should be used.

So the natural question is, can this happen if PCG-DXSM is used how its supposed to, that is initialized with strongly random seed+increment.

The short awnser is, given the right circumstances, yes.
The following text is structured like this:

  1. Some basic and known concepts about LCGs to better illustrate their structure
  2. Summary of what we currently know about the correlations
  3. Some actual tests and software to quantify these specific correlations
  4. A hypothetical example showing that this can be a real issue under the right circumstances
  5. A workaround
  6. A Solution
  7. Some final words
  8. More issues

Some basic and known concepts about LCGs to better illustrate their Structure

In the algebraic structure of LCGs with identical multipliers and operations mod 2m
LCG(v0, increment) can be seen as a vector whos elements are defined by vi = vi-1 * multiplier + increment (mod 2m) we can add 2 vectors and thus LCGs. LCG(seed1, increment1) + LCG(seed2, increment2) = LCG(seed2+seed1, increment2+increment1) from this we can immediately see that the associative and kommutative law holds and that there is a inverse and identity element.
Thus LCGs with identical multipliers form an abelian group. The identity element is LCG(0,0)i = 0. For each constant vector vi = C theres a corresponding LCG(C, C*(1 – multiplier)) the increment of such “bad” LCGs is even when the multiplier is odd. It is thus common to disallow even increments. None the less even when they are disallowed in an implementation, they do not disappear from the algebraic structure. Thats because if one allows only odd numbers
the difference between 2 odd numbers is still even. If we looked at LCGs as polynomials instead of vectors then we could furthermore introduce a multiplication by x mod x2^M-1 to represent a cyclic shift of all the elements of one vector. This would allow even more powerful tools to be used. But we don’t do this currently in this text. What we now have is basically a very small set of “primitive” LCGs that can then be offseted by a constant
LCG(C,I + C*(1 – multiplier)) = LCG(0,I) + LCG(C, C*(1 – multiplier)) the number of these primitive LCGs depend on the number of factors of 2 in 1 – multiplier. for PCG-DXSM we have 2 factors of 2 but odd increments are disallowed so we have 2 such primitive vectors which is all the diversity that exists in this LCG, all other vectors are just offseted by a constant from them. Looking at this you might ask if there are 2 LCG outputs that have a 128bit constant added how do these 2129 cases fit in a 127bit “odd” increment. They do because 4 live as cyclic shifts in each vector.

Summary of what we currently know about the correlations

In the previous section we have shown that we can offset the LCG output by a constant and that all LCG outputs basically are offseted from one of 2 sequences by a constant. Or if one considered the 4 cyclic shifts too then really only half would be unique, its a matter of viewpoint.

Offsets in the low half

In DXSM the LCG output is split in 2 64bit words the low one or-ed by 1 so its always odd and the high one passed through xorshift, multiply, xorshift
before being multiplied with the low.
If we look at 2 LCGs that are offseted by D, so that only the low half is offsetted but the high half stays the same. This can be represented (in C style notation) as:
(L[i] | 1)*H[i] vs. ((L[i] + D) | 1)*H[i]
both (L[i] | 1) and ((L[i] + D) | 1) are odd as they have explicitly an 1 or-ed in, this can be written also as
(2X[i] + 1) * H[i] vs. (2Y[i] + 1) * H[i]
their difference
(2X[i] + 1)*H[i] – (2Y[i] + 1)*H[i] = (2X[i] + 1 – 2Y[i] – 1)*H[i] = 2(X[i] – Y[i])*H[i]
as can be seen their difference is even thus their lowest bit is correlated, that is not to say higher bits are not
Here we now have established that instead of 2127 uncorrelated PCG-DXSM streams there are at best 264

Collisions in the first xor

The first xor mixes the high 32bit of the high half into the low 32bit of the high half, next the high half is multiplied by the constant multiplier and the high quarter xored into the low quarter before the final multiplication with the low half of the LCG. It is important to recognize that if, after the first XOR the low 49bits of the high side of the LCG match between 2 generators then we will have identical low bits after DXSM. Consider
MSB [15bit random][17bit 0][15bit random][17bit 0][64 bit random] LSB
The first xor stage will effectively do nothing to this. The correlation after that then depends on the low 15 random bits which is the only difference there is between the two LCGs which can affect the LSB from the high half. While the low half is orred by 1 so it is bascially a constant. If this on its own doesnt already fail statistical tests. Then this can be made more likely by observing that if we misalign 2 LCGs by a multiple of 2128-b the only affected output will be in the highest b bits. So we can introduce entropy in the highest bits at will to tweak the high quarter and bring the 2 into a bad correlated state. The high quarter itself is outside the 49bits affecting the LSB output other than by the xor. It is also observed that increasing the amount of random bits beyond 15 does not fix this issue. It just requires more effort to find correlations. For example using 19bits, 65536 streams, 1024 stream alignments and 4 billion samples per stream, we still find statistical anomalies.

Collisions in the 2nd xor

It is possible to introduce a delta in the low half of the high half of the LCG, this will be unaffected by the first xor and then multiplied by the multiplier, this operation is algebraically well understood and it is possible to introduce deltas here which tend to produce poor statistics after the 2nd XOR on the low side. Such a case was also demonstrated toward teh end of my last blog entry, but no attempt was made to exploit this in the software here

Some actual tests and software to quantify these specific correlations

Combining the 2 cases from above with 19bits instead of 15 we can take 16k random increment,seed pairs and find a pair of vectors that are correlated if they are aligned. I cannot show correlation directly on fully random values due to myself only having my home computer for this. Also I do take 64k instead so we have over a dozend pairs to work with. At this point we align one of the seeds to the other, then shift it by a random 2128-19 multiple and check the correlation for 1024 shifts. For each shift we generate a stream of 232 values. Each tested stream will have its least significant bits interleaved or xored and the occurrence of each even 2-18 bit pattern checked based against the binomial distribution. Both the xor and interleaving sometimes catch separate cases so either alone could be used for slightly fewer cases found. Also for XOR mainly 6 and 8 bits matter while for interleaving the larger ones catch things. In the test run we see a pair that showed correlations at 15 different alignments of the 1024 tested. Given these 1024 represent a random 1/512 of the searchspace we consider. We expect ~8000 in the whole space for that pair. Other pairs had similar numbers so theres no reason to believe this was a outlier. Combining all this there seems to be a maximum of ~29 bits of diversity in the increments and about 115 bits in the seed. We can also see that many of the found correlations have multiple independent patters that are failing tests
Source code is here

Stage 1, filling set with random increments
set size:65536, stream_length:4294967296, top bits:19
Stage 2, Sorting streams
Stage 3, Comparing streams
pairs to consider: 37
checking pair with DELTA: 0xE5B05FFFA33E1FFF ED6E8B7EA831BA19 si0: 0x4B1D7EBE754531A6 03D07FC29934A58B,0xEC1A098AE7F3EE1C F0C3F7E28050E655 si1: 0x656D1EBED20711A6 1661F443F102EB72,0xCFC64EE4226C353F 911E2DDFC6D557E9
clean table 32 18
fill table 32 18
clean table 32 16
fill table 32 16
clean table 32 14
fill table 32 14
clean table 32 12
fill table 32 12
clean table 32 10
fill table 32 10
clean table 32 8
fill table 32 8
clean table 32 6
fill table 32 6
clean table 32 4
fill table 32 4
clean table 32 2
fill table 32 2
 3B8:  DELTA: 0xE3DF5FFFA33E1FFF ED6E8B7EA831BA19 -6    6(  67044071)0.000000008
checking pair with DELTA: 0x2FC51FFF99D95FFF B54A758B34402CFA si0: 0x6AF7AD68373F1ECB 388F5B99D012EDC8,0x3E41C67901EE75A1 DDAC6EB90EFCDF53 si1: 0x3B328D689D65BECB 8344E60E9BD2C0CE,0xD540FAF60FD83D88 C4D70A7957646F1B
checking pair with DELTA: 0x069F600086C96000 0EB818571D7DCAD0 si0: 0x75FED5AD724E4D47 09756BA7A9BB3730,0x11E1922B5FBB1E8E 4F7311398EE8460F si1: 0x6F5F75ACEB84ED46 FABD53508C3D6C60,0x9AAEFF3A5B76A3D6 3EC2E876DEA2604F
checking pair with DELTA: 0xA6DBFFFFA9631FFF ECCA24CFB00F1167 si0: 0x642B9E142D73D47D 758CA1A36D0AED70,0xEA0B9F70E0FE5C84 59DF4DCDE2F4727F si1: 0xBD4F9E148410B47D 88C27CD3BCFBDC09,0x893A4A4B113A5A3C 205C596A9D7316EB
checking pair with DELTA: 0xA4107FFF2C3EC000 26C54851149E853A si0: 0xE76E9BF9B19A825A 6A986886330B2129,0x1090288CC2863F81 D609776DF7909853 si1: 0x435E1BFA855BC25A 43D320351E6C9BEF,0xCD67E2DED5A42A3C 1905C3FB9FE4351B
  AE:  DELTA: 0xEB3EBFFF2C3EC000 26C54851149E853A 12  310(   1057180)0.000000015  12  290(   1056853)0.000000211  12  f10(   1056627)0.000001252  12  1a0(   1040569)0.000001519  12  6e0(   1056277)0.000017950  12  690(   1041143)0.000113557  12  560(   1055904)0.000270395  12  da0(   1041278)0.000299401
 32E:  DELTA: 0x5DEEBFFF2C3EC000 26C54851149E853A 12  5c0(   1038624)0.000000000  12  1c0(   1057931)0.000000000  12  ec0(   1057333)0.000000004  12  ac0(   1040326)0.000000223  12  dc0(   1056653)0.000001023  12  6c0(   1041181)0.000149446  12  2c0(   1055845)0.000410364  12  9c0(   1041636)0.003602710
 2AE:  DELTA: 0x46FEBFFF2C3EC000 26C54851149E853A 12  4d0(   1060953)0.000000000  12  610(   1062490)0.000000000  12  760(   1060287)0.000000000  12  7a0(   1058928)0.000000000  12  8d0(   1063077)0.000000000  12  a10(   1062087)0.000000000  12  b60(   1062943)0.000000000  12  ba0(   1061416)0.000000000
 22E:  DELTA: 0x300EBFFF2C3EC000 26C54851149E853A 12  670(   1057881)0.000000000  12  740(   1039293)0.000000000  12  580(   1040335)0.000000239  -6    c(  67168834)0.000000783  12  570(   1056351)0.000010320  -6   2c(  67053163)0.000034817  12  f40(   1056076)0.000078670  12  d70(   1041141)0.000111924
checking pair with DELTA: 0x50007FFF5C726000 B744241DE30B69D1 si0: 0x6B787EBE72991EEA 8289751974A2A05F,0x0BBD42A86319A4D6 1545348A20D29269 si1: 0x1B77FEBF1626BEE9 CB4550FB9197368E,0xA41D07785110511F 469B72942EA5D15D
checking pair with DELTA: 0x4AC8DFFF76E85FFF 5317B9650AE16599 si0: 0x87FB5321F119AC7A C56465440DFD6BBB,0x1BC3CB3CF5D7B874 DACCF5D5C924B5AF si1: 0x3D3273227A314C7B 724CABDF031C0622,0x07EE1FA4549F8160 7A648121E3A1BD43
checking pair with DELTA: 0x0D78BFFFF0FCC000 9F272846B576856A si0: 0x114FA1E4F4E52E77 6CF38EB5E44CC81B,0x79DA1F7A3E715F59 889363DD36B59C95 si1: 0x03D6E1E503E86E76 CDCC666F2ED642B1,0x4998AE764CF9EA36 A31FDF6D2069DB1D
 2F9:  DELTA: 0xF300DFFFF0FCC000 9F272846B576856A 1834a80(     17652)0.000000000  16  540(     68674)0.000000000  16  a80(     68749)0.000000000  16 1740(     68320)0.000000000  16 1a40(     68301)0.000000000  16 2740(     68147)0.000000000  16 2a40(     68350)0.000000000  16 3540(     68046)0.000000000
 3F9:  DELTA: 0xA420DFFFF0FCC000 9F272846B576856A 14 2000(    266373)0.000000215  16 3000(     63697)0.002678040  14    0(    258663)0.011878566  16 8000(     63794)0.041411834  14 1000(    265545)0.041734621
checking pair with DELTA: 0xBBE320004FF76000 82454C2874A21EF3 si0: 0x7BBAE3DB04A0B4DE 5D7DAF8022BF2D34,0xB6D14DEA456CF025 AF3C8E5BC500FBDD si1: 0xBFD7C3DAB4A954DD DB386357AE1D0E41,0x2E36E70296EAF807 09177110A56946B9
 109:  DELTA: 0xA338C0004FF76000 82454C2874A21EF3 -6   24(  67168303)0.000001275  -6    4(  67058944)0.003937258
 309:  DELTA: 0x0478C0004FF76000 82454C2874A21EF3 16 1000(     68200)0.000000000  16 6000(     67912)0.000000000  16 9000(     68422)0.000000000  16 a000(     68296)0.000000000  16 e000(     68322)0.000000000  14 1000(    271534)0.000000000  14 2000(    272358)0.000000000  14    0(    247561)0.000000000
 189:  DELTA: 0x3B88C0004FF76000 82454C2874A21EF3 14 3523(    266414)0.000000110  14  523(    266255)0.000001424  14 17a3(    266247)0.000001616  14 3e23(    266166)0.000005727  14  52f(    266127)0.000010440  -8   e8(  16806671)0.000011400  14 34a3(    266115)0.000012545  14  223(    266025)0.000048894
 389:  DELTA: 0x9CC8C0004FF76000 82454C2874A21EF3 14 239c(    265963)0.000122652  -8   f8(  16749417)0.000200331  14  f9c(    265926)0.000210908  14 3cac(    265900)0.000307758  -8   38(  16803599)0.002137644  -8   78(  16751532)0.006391691  14   ac(    258625)0.007053808  14 1c6c(    265614)0.016667751
checking pair with DELTA: 0x61F81FFFD0A1A000 0EDB22A63E0AD996 si0: 0x1429D0EE8B94E764 E8B02AFD0694AD82,0x17D8D7C4708639FC F80DBC31B69114A5 si1: 0xB231B0EEBAF34764 D9D50856C889D3EC,0x79A6C972B0378F58 9787503C7F7BA21D
 12C:  DELTA: 0x41639FFFD0A1A000 0EDB22A63E0AD996 14  82c(    267011)0.000000000  14 1bdc(    257159)0.000000000  14 282c(    257273)0.000000000  14 275c(    257468)0.000000000  14 342c(    266806)0.000000000  14 182c(    257848)0.000000050  14 27dc(    257918)0.000000159  14  75c(    266276)0.000001021
  2C:  DELTA: 0xBCC39FFFD0A1A000 0EDB22A63E0AD996 14  65c(    257898)0.000000114  14 19ac(    266391)0.000000161  14 15ac(    266117)0.000012167  14 29ac(    265981)0.000094048  14 332c(    258649)0.009809622  14 365c(    258698)0.019104299  14 3a5c(    258709)0.022160153
 2EC:  DELTA: 0x697B9FFFD0A1A000 0EDB22A63E0AD996 14 202b(    265734)0.003239245
 32C:  DELTA: 0x4AA39FFFD0A1A000 0EDB22A63E0AD996 14 20ac(    266407)0.000000124  14 2fec(    258349)0.000135828  14 3e6c(    258354)0.000146280  14 10dc(    258396)0.000271643  14 30dc(    265864)0.000517196  14 2f2c(    258539)0.002125153  14  aec(    265729)0.003471746  14 3fec(    265693)0.005703141
 2AC:  DELTA: 0x88539FFFD0A1A000 0EDB22A63E0AD996 16 2c00(     67872)0.000000001  16 ec00(     67853)0.000000001  16 9c00(     63245)0.000000001  16 ac00(     63262)0.000000002  16 6c00(     63316)0.000000014  16 5c00(     63339)0.000000031  16 1c00(     67614)0.000003283  16 dc00(     67613)0.000003389
 22C:  DELTA: 0xC6039FFFD0A1A000 0EDB22A63E0AD996 14  56c(    266846)0.000000000  14  3ec(    266323)0.000000482  14  96c(    266299)0.000000708  14 105c(    258276)0.000045522  14 13dc(    258308)0.000073692  14 3a9c(    265995)0.000076435  14  c1c(    265955)0.000137963  14  a9c(    265801)0.001268171
checking pair with DELTA: 0xAEE04000493CBFFF BD570161F1BB9245 si0: 0x3232CB36E1EE4D2A A6E5C02AEC015417,0x5D4AADB8CE05CB16 60BFF633BE1922B7 si1: 0x83528B3698B18D2A E98EBEC8FA45C1D2,0x7023209E9838B9CB EDC7AD26E9D4B33B
  FA:  DELTA: 0x91C80000493CBFFF BD570161F1BB9245 16    0(     67945)0.000000000  14 1270(    267234)0.000000000  12   80(   1061964)0.000000000  12  1b0(   1059972)0.000000000  12  270(   1062824)0.000000000  12  34c(   1059060)0.000000000  12  40f(   1058356)0.000000000  12  540(   1062857)0.000000000
checking pair with DELTA: 0xF1602000180CDFFF DDB3EEC70873FC23 si0: 0xCD641FA73D4FFD4D C1BFB916CB540340,0xD46851900454DC91 BE71D8204442D5E5 si1: 0xDC03FFA725431D4D E40BCA4FC2E0071D,0xDD28B58BCA390DD0 200B20AFEAB32681
 149:  DELTA: 0x46ACC000180CDFFF DDB3EEC70873FC23 -8   40(  16748835)0.000073773  -8   48(  16804298)0.000682072  14 161c(    265780)0.001704525  12  e5c(   1041888)0.019305183
  49:  DELTA: 0xBF0CC000180CDFFF DDB3EEC70873FC23 12  1ac(   1058326)0.000000000  12  dac(   1059512)0.000000000  -6    0(  67180422)0.000000000  -6   20(  67037873)0.000000000  12  5ac(   1039647)0.000000001  12  e5c(   1056936)0.000000108  12  9ac(   1040522)0.000001052  12  25c(   1056402)0.000007026
 249:  DELTA: 0xCE4CC000180CDFFF DDB3EEC70873FC23 12  16c(   1039873)0.000000005  12  69c(   1056832)0.000000250  12  920(   1040498)0.000000872  12  750(   1040800)0.000008950  12  56c(   1055931)0.000223162  12  f50(   1055478)0.005111395  12  e9c(   1041846)0.014654417  12  350(   1055194)0.033027892
  C9:  DELTA: 0x82DCC000180CDFFF DDB3EEC70873FC23 16    0(     67311)0.024882147
 209:  DELTA: 0xEC64C000180CDFFF DDB3EEC70873FC23 14  edb(    258766)0.047457856
 3C9:  DELTA: 0x19BCC000180CDFFF DDB3EEC70873FC23 12  e7c(   1059061)0.000000000  12  a7c(   1039130)0.000000000  12  98c(   1039273)0.000000000  12  58c(   1039733)0.000000002  12  880(   1057062)0.000000039  12  c80(   1040191)0.000000075  12  27c(   1056979)0.000000077  12  18c(   1056842)0.000000230
 1C9:  DELTA: 0x0A7CC000180CDFFF DDB3EEC70873FC23 12   b0(   1058704)0.000000000  12  870(   1058321)0.000000000  12  cb0(   1058044)0.000000000  12   70(   1039197)0.000000000  12  380(   1057886)0.000000000  12  3b0(   1039348)0.000000000  12  470(   1057815)0.000000000  12  770(   1039485)0.000000000
checking pair with DELTA: 0x4BAEDFFF3117BFFF 2C6645F93365ECEA si0: 0xE40A9D2B0465523F 42BC3B3B97D36E8D,0xBB93AB3A34C78A9A 64593536E5C52775 si1: 0x985BBD2BD34D9240 1655F542646D81A3,0x3AB79415CB425F3D 57648C6C00E22BFD
 194:  DELTA: 0x51A95FFF3117BFFF 2C6645F93365ECEA 12  d39(   1041947)0.028354029
checking pair with DELTA: 0x98E3E00009FCA000 0A0A41967884F5D6 si0: 0xDF60BAED6D489376 8C513233A6ED5E42,0x3955813EEBEECC4A 641A81300E22E3A5 si1: 0x467CDAED634BF376 8246F09D2E68686C,0x5ACFED56DDFFBB1B A8D2EF5151DF4E1D
  FF:  DELTA: 0xBD3F400009FCA000 0A0A41967884F5D6 12  507(   1041478)0.001219788
 11F:  DELTA: 0xC5D3400009FCA000 0A0A41967884F5D6 12  5fe(   1041919)0.023635555
 31F:  DELTA: 0x4F13400009FCA000 0A0A41967884F5D6 -6   33(  67163521)0.000085788
  9F:  DELTA: 0xA383400009FCA000 0A0A41967884F5D6 12  eee(   1041253)0.000250523  12  d1e(   1055609)0.002107609  12  dee(   1041717)0.006220077
 37F:  DELTA: 0x68CF400009FCA000 0A0A41967884F5D6 12   db(   1058017)0.000000000  12   e7(   1058942)0.000000000  12  7db(   1059479)0.000000000  12  827(   1059615)0.000000000  12  ce7(   1058144)0.000000000  12  f27(   1058944)0.000000000  -8   66(  16826582)0.000000000  -8   e6(  16822338)0.000000000
 27F:  DELTA: 0x242F400009FCA000 0A0A41967884F5D6 14 34a7(    267131)0.000000000  12   5b(   1064451)0.000000000  12   bb(   1060051)0.000000000  12  137(   1064641)0.000000000  12  14b(   1059721)0.000000000  12  1cb(   1063913)0.000000000  12  227(   1058434)0.000000000  12  237(   1062931)0.000000000
  7F:  DELTA: 0x9AEF400009FCA000 0A0A41967884F5D6 -6   12(  67240251)0.000000000  -6   32(  66978036)0.000000000  -8   52(  16812266)0.000000000  -8   b2(  16742497)0.000000000  12  997(   1057305)0.000000005  -8   72(  16743940)0.000000008  -8   12(  16809826)0.000000030  12  d6b(   1040123)0.000000043
 29F:  DELTA: 0x2CC3400009FCA000 0A0A41967884F5D6 12  61e(   1041497)0.001391186  12  729(   1055557)0.003001539  12  91e(   1041758)0.008181266  12  f29(   1041977)0.034431422
 2FF:  DELTA: 0x467F400009FCA000 0A0A41967884F5D6 -6    a(  67162084)0.000284374  12  c17(   1041969)0.032696417
  3F:  DELTA: 0x89C7400009FCA000 0A0A41967884F5D6 -6   10(  67060938)0.017930280
 21F:  DELTA: 0x0A73400009FCA000 0A0A41967884F5D6 12  4c1(   1055397)0.008769471
 1FF:  DELTA: 0x01DF400009FCA000 0A0A41967884F5D6 -6   12(  67158377)0.005435176  12  ceb(   1055462)0.005689278
  5F:  DELTA: 0x925B400009FCA000 0A0A41967884F5D6 12  8c6(   1055190)0.033889247
 3FF:  DELTA: 0x8B1F400009FCA000 0A0A41967884F5D6 -6    6(  67166263)0.000008006  -6   26(  67061803)0.033983055
 17F:  DELTA: 0xDF8F400009FCA000 0A0A41967884F5D6 12   17(   1059398)0.000000000  12   c7(   1058635)0.000000000  12   f7(   1058936)0.000000000  12  15b(   1059633)0.000000000  12  1a7(   1061491)0.000000000  12  25b(   1061656)0.000000000  12  2a7(   1061160)0.000000000  12  307(   1058587)0.000000000
checking pair with DELTA: 0x9C385FFFD517C000 306C5C1B4D5AB7C6 si0: 0x3FEAE99F258B5DE7 09F745C8569D55EC,0x7F3878FC16BE1A2D 2DD9E922BA431F8B si1: 0xA3B2899F50739DE6 D98AE9AD09429E26,0xAAC4D1CD87D69C2E 70FEC2F54B2666C3
 2D6:  DELTA: 0x2D609FFFD517C000 306C5C1B4D5AB7C6 14 3eb0(    266018)0.000054283
checking pair with DELTA: 0x886D80002EBC1FFF EBA31452485D63B9 si0: 0xBF85BAB7F7A88915 DD24DFCE63656BDD,0x1D5F2A9E89966A03 3142879BE63649DD si1: 0x37183AB7C8EC6915 F181CB7C1B080824,0xC92276BC926F7002 47E39578BCDCFFF1
checking pair with DELTA: 0x3043000012E7A000 4F55B89E15612A9D si0: 0xB52CE7856DE4DCAE 576D37344D0DE947,0x9055C7B98826164F 4FAAEB6EB2A3DC9F si1: 0x84E9E7855AFD3CAE 08177E9637ACBEAA,0xA6DE91805DB5FD1D B80EAD3EBF24CB03
checking pair with DELTA: 0xDB780000301CDFFF EFC16152F51E792F si0: 0x63684BE2D0185347 9CCA1B957F53D05A,0x80FA168A7F074369 F0A5BC2519C96861 si1: 0x87F04BE29FFB7347 AD08BA428A35572B,0x5B97EE447686113E FB2C8C166671C56D
  8B:  DELTA: 0x37BD6000301CDFFF EFC16152F51E792F 14 3f77(    257778)0.000000015  14 1f77(    266226)0.000002249  -8   f6(  16747303)0.000004836  14   8b(    258467)0.000761714  14  e4b(    258533)0.001952468  14 208b(    265599)0.020378990  -8   b6(  16801849)0.032928680  14 1e4b(    265538)0.045762519
 1CB:  DELTA: 0x1AE56000301CDFFF EFC16152F51E792F -8   90(  16809736)0.000000035  -8   d0(  16749917)0.000465123  -8   10(  16803470)0.002631082
 28B:  DELTA: 0x6FFD6000301CDFFF EFC16152F51E792F -8   3a(  16812939)0.000000000  -8   fa(  16744839)0.000000045  -8   7a(  16749044)0.000105850  -8   92(  16749408)0.000197290  -8   12(  16750130)0.000662933  -8   ba(  16802080)0.023187546  -8   52(  16802065)0.023723934
 3CB:  DELTA: 0x53256000301CDFFF EFC16152F51E792F 14  1ec(    266870)0.000000000  14  bec(    267106)0.000000000  14  e1c(    267279)0.000000000  14 149c(    266995)0.000000000  14 18dc(    267103)0.000000000  14 249c(    267132)0.000000000  14 3e1c(    267193)0.000000000  14  49c(    257280)0.000000000
 38B:  DELTA: 0x8C1D6000301CDFFF EFC16152F51E792F 14 3f07(    265916)0.000243974  14 1c0b(    265914)0.000251174  14  c0b(    258412)0.000343271  14 38b7(    265749)0.002629606  14 20fb(    258612)0.005894475  14 23f7(    265565)0.032043522  14 30fb(    265540)0.044574402
 18B:  DELTA: 0x53DD6000301CDFFF EFC16152F51E792F 14 3c67(    266192)0.000003825  14  39b(    266165)0.000005816  -8   d2(  16747988)0.000016635  14 2f9b(    266035)0.000042097  14 3067(    258273)0.000043503  14 2c67(    258349)0.000135828  14  7d7(    265938)0.000177004  -8   92(  16804377)0.000598385
 24B:  DELTA: 0xA8F56000301CDFFF EFC16152F51E792F 14 1480(    265808)0.001148715
checking pair with DELTA: 0xAC575FFFEABDA000 4916E783B7945A0F si0: 0xEA55F2E4610D825A C9C89FD187E1A389,0xA809E3872B037628 4612D5C116F50F8F si1: 0x3DFE92E4764FE25A 80B1B84DD04D497A,0x88A56FF680022BCF 6C25989E452C8A1B
   B:  DELTA: 0x363E7FFFEABDA000 4916E783B7945A0F -6   12(  67172158)0.000000033

real	519m55,784s
user	13633m42,794s
sys	8m12,787s
Stage 1, filling set with random increments
set size:16384, stream_length:4294967296, top bits:19
Stage 2, Sorting streams
Stage 3, Comparing streams
pairs to consider: 5
checking pair with DELTA: 0xBF6CFFFFE20A6000 1237BFB750ABC5D4 si0: 0x7B36DD275880DB17 CADB2D976848C718,0xF82B95907E57998B 00CC0B074806B4AB si1: 0xBBC9DD2776767B17 B8A36DE0179D0144,0x9B02AF6C89EFC1D7 EA92FCC9A4D2ADBB
clean table 32 18
fill table 32 18 to 4220518
clean table 32 16
fill table 32 16 to 4299161
clean table 32 14
fill table 32 14 to 4613734
clean table 32 12
fill table 32 12 to 5872025
clean table 32 10
fill table 32 10 to 10905190
clean table 32 8
fill table 32 8 to 31037849
clean table 32 6
fill table 32 6 to 111568486
clean table 32 4
fill table 32 4 to 433691033
clean table 32 2
fill table 32 2 to 1722181222
 10E:  DELTA: 0x3D063FFFE20A6000 1237BFB750ABC5D4 12  230(   1058325)0.000000000  12  530(   1057565)0.000000000  12  1f0(   1057502)0.000000000  12  130(   1040174)0.000000009  12  2f0(   1040192)0.000000010  12  930(   1056696)0.000000099  12  af0(   1056646)0.000000146  12  a30(   1040910)0.000002765
checking pair with DELTA: 0x80B2C00052319FFF 85F7EBD9A5CACFEE si0: 0xFE11F43F71412700 F6B7489194A2D4B7,0xE2AF29ED1B20D157 01C834D59668D06F si1: 0x7D5F343F1F0F8701 70BF5CB7EED804C9,0xD8493089E0552D49 D998F77D58F2D3C7
 1F4:  DELTA: 0x6FCE400052319FFF 85F7EBD9A5CACFEE -6    c(  67043292)0.000000000  -6   2c(  67161936)0.000043402
 3F4:  DELTA: 0x608E400052319FFF 85F7EBD9A5CACFEE 12   70(   1042089)0.009535608  12  f70(   1055012)0.014186071  12  3b0(   1042190)0.018029377
 2F4:  DELTA: 0xE82E400052319FFF 85F7EBD9A5CACFEE -6    4(  67198002)0.000000000  -6   24(  67032108)0.000000000  12  ec0(   1040797)0.000001182  -8   44(  16801938)0.003888811  -8   c4(  16801805)0.004755570  12  d0c(   1041987)0.004963123  12  6c0(   1054956)0.020123604  12  ac0(   1054907)0.027260483
  F4:  DELTA: 0xF76E400052319FFF 85F7EBD9A5CACFEE 12  800(   1056020)0.000015939
checking pair with DELTA: 0xB1E3C0007875E000 2C73576CA4DE67DE si0: 0xCB1F2D8DA9A5ACB8 079DA5BC62AB8EF8,0xF136A63903CFB057 99427D8A8F043A5D si1: 0x193B6D8D312FCCB7 DB2A4E4FBDCD271A,0x009C79F3B34D2CD3 22B43E5851BF9275
 1B4:  DELTA: 0x6B7C40007875E000 2C73576CA4DE67DE 14 2b00(    257580)0.000000000  14 1b00(    258140)0.000000760  14 3b00(    265893)0.000046023  14 1700(    258412)0.000046388  14  b00(    265785)0.000214713  14  700(    265705)0.000653519  14 3700(    265657)0.001259998  14 2700(    258731)0.004023711
checking pair with DELTA: 0x45728000B24DC000 1AFDFE5D4232E832 si0: 0x3456181E0244AE3D 9CFEF32202AB3CE3,0xC8CB61ECFAF163DF C41D4E73484C4E1D si1: 0xEEE3981D4FF6EE3D 8200F4C4C07854B1,0xDD8334AD194ECC83 F2A9D06A4F12C145
  46:  DELTA: 0x07CF4000B24DC000 1AFDFE5D4232E832 -6    c(  67065092)0.047214990
 146:  DELTA: 0xD5EF4000B24DC000 1AFDFE5D4232E832 14  700(    267561)0.000000000  14  b00(    267517)0.000000000  14 3700(    268055)0.000000000  14 3b00(    267320)0.000000000  12  700(   1065199)0.000000000  12  b00(   1065864)0.000000000  12  f00(   1031945)0.000000000  12  300(   1035611)0.000000000
 346:  DELTA: 0x722F4000B24DC000 1AFDFE5D4232E832 -6    4(  67216858)0.000000000  -6   24(  67010410)0.000000000  12  1c0(   1038151)0.000000000  12  2c0(   1038787)0.000000000  12  6c0(   1057954)0.000000000  12  5c0(   1057500)0.000000000  12  9c0(   1057053)0.000000006  12  ec0(   1040158)0.000000008

real	118m52,649s
user	2853m41,628s
sys	1m57,080s

A hypothetical example showing that this can be a real issue under the right/wrong circumstances

For research (in physics for example) 5 sigma is a requirement and that is for the experiment not for the PRNG so it would be unreasonable if the PRNG would consume more than 1% of that. If we start from where we left of, we have 115 bits in the seed to begin with, minus 32 as we will consume 2³² from each stream and minus 30 for the ~1% at 5 sigma. That leaves us with 53 bits in the seed, we now add 29 bits from the increment leaving us with 82 bits. These 82 bit would be expected to be affected by a collision if more than half are used (like in a birthday attack). That leaves us with being able to consume 241 streams assuming they are perfectly randomly initialized. This compares to an ideal 2×128 bit seeded PRNG where we could consume 297 with the same 5 sigma and 232 read constraints. How long would a supercomputer need to hit a correlated pair ? If we assume PCG-DXSM needs 10 CPU operations per output and we read 232 per stream as in our code and a 1000 0000 Terra(fl)op/sec supercomputer is used. It would take about a day. 241 * 232 * 10 / ( 1000 0000 * 10004 sec-1 * 3600 sec/h * 24 h/day) = 1.09 … days The reader is here also reminded that we have not attempted to exploit the 2nd XOR in DXSM. Nor that the statistical tests used are very sophisticated. And that we are reading here 232 values for each stream. A system favoring reading more on seemingly correlated sequences and less on others would need less data to run into problems

A workaround

The obvious one, is not to use PCG-DXSM with (increment based) streams. But if continued use of PCG-DXSM with streams is a goal, and if we assume the described issue here represents all the correlated streams (which certainly is not true as we ignored the 2nd XOR and likely have not found all correlation cases beyond that either and also only considered reasonable aligned streams, theres also other little details here and there I knowingly did not pay adequate attention to). Then one can construct increments instead of randomly in a systematic way so they always differ in bits that have been required to be identical in a collision. That can be achieved easily by using a counter fed through some bijective function for some extra pseudo randomness. The remaining bits could then still be set somewhat randomly. With that and 21 bits instead of 19 bits as in the attack removed (I have also found a collision with 2x21bits randomized) we have 127 – 63 – 2×21 = 22 bits left and thus ~4 million increments, given there are 2 odd values modulo 4 there are ~8 million non correlated streams of full 2128 period (if we limit our self to what the implementation here can find) thus 2128+23. To clarify the differing bits we talk about are in the delta between aligned LCG outputs and not the increment values directly.

A solution

To solve this, first we need to understand why PCG has repeatedly been plagued by stream correlation issues, both the PCG “1.0” and now PCG 2.0 (DXSM). The reason appears to be because when using the increment and the seed and setting them randomly one inevitably challenges the mixer after the LCG with every constant difference. That is fundamentally very similar to feeding a mixer with a counter instead of a LCG. But the mixers seem designed to require more randomness than a counter provides.
The solution to this seems to be to give up on one of the assumptions leading to this, and these are 1. LCG, 2. random use of the increment, 3. a “simple” mixer.
Now if the simple mixer is replaced by a complex one than the LCG can be replaced by a counter more like in XSM64 because if the mixer is strong enough to work with a counter as input then the extra multiplication of the LCG no longer is useful and just slows it down.
I also do believe the solution is to spend more time analyzing the existing PRNGs instead of creating new ones. There are PRNGs that are fast and that have no known issues and seem to have received limited attention in terms of analyzing them. IMHO a new generator should only be created when its more optimal in some way than all other existing generators.

Final words

Iam curious if XSM64 has such issues too. From a quick glance it seemed designed with this sort of “correlation from offset” problem in mind, or at least i was too dumb to immediately produce a correlation the same way it works here. Sadly i did not have enough time to really investigate XSM64 so thats not meant as a recommendation, just a statement of myself being curious about …
I hope this whole blog entry is helpful to someone

More issues

After writing above, it came to my mind that as i already have the code to test for correlations between streams, it would be very easy to test this within the same period / LCG / stream / increment. After all we see correlations between streams that are misaligned by multiplies of 2109 with different increments. So what about within teh same stream/increment? Testing streams of 226 length and misaligned by all the 65536 possibilities of the top 16 bits of the 128 bit. The 3 pairs tested found 13409, 15950 and 15360 correlations. I also tried a random subset of the top 32bit instead of 16 but these found no additional correlations with my quick tests. Retesting this with practrand to be more sure my tests are not all faulty. And practrand seems better in detecting these.

    STATET inc1 = 12345;
    STATET inc2 = inc1;
    STATET state1 = 31415;
    STATET state2 = step(state1, inc1, ((STATET)1)<<109);

    int64_t stream_length = 1LL<<60;
    for(int64_t i = 0; i< stream_length; i++) {
        OUTT v1, v2;
        v1 = pcg_dxsm(&state1, multiplier, inc1);
        v2 = pcg_dxsm(&state2, multiplier, inc2);

        v1 ^= v2;
        fwrite(&v1, 8, 1, stdout);
    }
step 1<<109
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 3.6 seconds
  no anomalies in 196 test result(s)

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 8.4 seconds
  no anomalies in 213 test result(s)

rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 16.3 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 29.9 seconds
  no anomalies in 246 test result(s)

rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 52.3 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R=  +4.9  p =  1.8e-4   unusual
  ...and 262 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 94.4 seconds
  no anomalies in 279 test result(s)

rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 179 seconds
  no anomalies in 295 test result(s)

rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 342 seconds
  no anomalies in 311 test result(s)

rng=RNG_stdin64, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 663 seconds
  Test Name                         Raw       Processed     Evaluation
  [Low16/64]FPF-14+6/16:all         R=  -5.9  p =1-2.3e-5   mildly suspicious
  ...and 324 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 64 gigabytes (2^36 bytes), time= 1298 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(15,14-3)             R=  +8.7  p =  2.3e-7   mildly suspicious
  FPF-14+6/16:all                   R=  +5.6  p =  1.0e-4   unusual
  TMFn(2+6):wl                      R= +25.9  p~=   1e-8    very suspicious
  [Low16/64]TMFn(2+4):wl            R= +28.2  p~=   7e-10   very suspicious
  [Low4/64]TMFn(2+2):wl             R= +25.1  p~=   3e-8    suspicious
  ...and 335 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 128 gigabytes (2^37 bytes), time= 2268 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(15,14-2)             R= +15.2  p =  4.3e-13   VERY SUSPICIOUS 
  FPF-14+6/16:all                   R= +10.3  p =  3.9e-9    VERY SUSPICIOUS 
  TMFn(2+6):wl                      R= +58.1  p~=   7e-28     FAIL !!        
  TMFn(2+7):wl                      R= +35.3  p~=   3e-14     FAIL           
  TMFn(2+8):wl                      R= +29.8  p~=   7e-11    VERY SUSPICIOUS 
  [Low16/64]FPF-14+6/16:cross       R=  +5.8  p =  3.2e-5   mildly suspicious
  [Low16/64]TMFn(2+4):wl            R= +58.1  p~=   6e-28     FAIL !!        
  [Low16/64]TMFn(2+5):wl            R= +36.2  p~=   1e-14     FAIL           
  [Low16/64]TMFn(2+6):wl            R= +31.1  p~=   1e-11    VERY SUSPICIOUS 
  [Low16/64]TMFn(2+7):wl            R= +19.3  p~=   1e-5    unusual          
  [Low4/64]TMFn(2+2):wl             R= +59.6  p~=   9e-29     FAIL !!        
  [Low4/64]TMFn(2+3):wl             R= +34.9  p~=   6e-14     FAIL           
  [Low4/64]TMFn(2+4):wl             R= +31.0  p~=   1e-11    VERY SUSPICIOUS 
  ...and 342 test result(s) without anomalies


step 1<<110
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.4 seconds
  no anomalies in 196 test result(s)

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 6.5 seconds
  no anomalies in 213 test result(s)

rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 14.6 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 26.1 seconds
  no anomalies in 246 test result(s)

rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 49.2 seconds
  no anomalies in 263 test result(s)

rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 94.8 seconds
  no anomalies in 279 test result(s)

rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 180 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(16,14-6)             R=  +7.8  p =  5.1e-6   unusual
  ...and 294 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 325 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(16,14-5)             R= +12.5  p =  3.4e-10  very suspicious
  [Low16/64]FPF-14+6/16:all         R=  +4.9  p =  4.5e-4   unusual
  ...and 309 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 32 gigabytes (2^35 bytes), time= 599 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(16,14-5)             R= +25.2  p =  8.7e-21    FAIL !
  ...and 324 test result(s) without anomalies


step 1<<111
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.6 seconds
  no anomalies in 196 test result(s)

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 7.7 seconds
  no anomalies in 213 test result(s)

rng=RNG_stdin64, seed=unknown
length= 512 megabytes (2^29 bytes), time= 15.1 seconds
  no anomalies in 229 test result(s)

rng=RNG_stdin64, seed=unknown
length= 1 gigabyte (2^30 bytes), time= 28.4 seconds
  no anomalies in 246 test result(s)

rng=RNG_stdin64, seed=unknown
length= 2 gigabytes (2^31 bytes), time= 52.3 seconds
  no anomalies in 263 test result(s)

rng=RNG_stdin64, seed=unknown
length= 4 gigabytes (2^32 bytes), time= 92.2 seconds
  no anomalies in 279 test result(s)

rng=RNG_stdin64, seed=unknown
length= 8 gigabytes (2^33 bytes), time= 172 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(17,14-7)             R=  +9.1  p =  4.5e-7   mildly suspicious
  [Low16/64]FPF-14+6/16:cross       R=  +7.8  p =  6.4e-7   suspicious
  ...and 293 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 16 gigabytes (2^34 bytes), time= 337 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:(17,14-6)             R= +16.4  p =  1.4e-12   VERY SUSPICIOUS
  FPF-14+6/16:(20,14-8)             R= +16.1  p =  1.1e-11   VERY SUSPICIOUS
  FPF-14+6/16:cross                 R=  +6.5  p =  8.5e-6   mildly suspicious
  [Low16/64]FPF-14+6/16:cross       R= +18.2  p =  1.2e-15    FAIL !
  ...and 307 test result(s) without anomalies


step 1<<112
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.8 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R=  +7.9  p =  5.5e-7   suspicious
  ...and 195 test result(s) without anomalies

rng=RNG_stdin64, seed=unknown
length= 256 megabytes (2^28 bytes), time= 7.9 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R= +16.0  p =  7.3e-14    FAIL
  ...and 212 test result(s) without anomalies


step 1<<115
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.9 seconds
  Test Name                         Raw       Processed     Evaluation
  FPF-14+6/16:cross                 R= +67.0  p =  9.4e-57    FAIL !!!!
  TMFn(2+0):wl                      R= +23.5  p~=   1e-7    suspicious
  [Low16/64]FPF-14+6/16:(4,14-3)    R= +13.9  p =  6.3e-12   VERY SUSPICIOUS
  [Low16/64]FPF-14+6/16:(5,14-4)    R=  +8.1  p =  1.4e-6   unusual
  [Low16/64]FPF-14+6/16:all         R=  +6.4  p =  1.5e-5   mildly suspicious
  [Low16/64]FPF-14+6/16:cross       R=  +6.5  p =  1.3e-5   mildly suspicious
  ...and 190 test result(s) without anomalies


step 1<<120
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.6 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+13,13-9,T)                 R= +26.0  p =  5.7e-7   mildly suspicious
  Gap-16:A                          R=  +6.9  p =  2.5e-5   mildly suspicious
  FPF-14+6/16:(0,14-0)              R=  +7.5  p =  1.6e-6   unusual
  FPF-14+6/16:(1,14-0)              R= +18.0  p =  2.8e-16    FAIL
  FPF-14+6/16:(2,14-0)              R= +16.0  p =  1.8e-14    FAIL
  FPF-14+6/16:(3,14-1)              R= +18.7  p =  2.4e-16    FAIL
  FPF-14+6/16:(4,14-2)              R=  +7.5  p =  2.2e-6   unusual
  FPF-14+6/16:(5,14-2)              R= +10.7  p =  3.7e-9   very suspicious
  FPF-14+6/16:(6,14-3)              R= +10.7  p =  3.8e-9   very suspicious
  FPF-14+6/16:(7,14-4)              R= +25.3  p =  1.4e-20    FAIL !
  FPF-14+6/16:all                   R= +94.7  p =  2.4e-88    FAIL !!!!!
  FPF-14+6/16:cross                 R= +2465  p =  3e-2074    FAIL !!!!!!!!
  TMFn(2+0):wl                      R=+968.3  p~=   6e-576    FAIL !!!!!!!
  TMFn(2+1):wl                      R=+487.2  p~=   2e-286    FAIL !!!!!!
  TMFn(2+2):wl                      R=+232.7  p~=   5e-133    FAIL !!!!!
  [Low16/64]BCFN(2+1,13-4,T)        R= +10.9  p =  1.0e-4   unusual


step 1<<127
RNG_test using PractRand version 0.95
RNG = RNG_stdin64, seed = unknown
test set = core, folding = standard (64 bit)

rng=RNG_stdin64, seed=unknown
length= 128 megabytes (2^27 bytes), time= 2.5 seconds
  Test Name                         Raw       Processed     Evaluation
  BCFN(2+0,13-3,T)                  R=+24986  p = 0           FAIL !!!!!!!!
  BCFN(2+1,13-3,T)                  R=+24275  p = 0           FAIL !!!!!!!!
  BCFN(2+2,13-3,T)                  R=+24133  p = 0           FAIL !!!!!!!!
  BCFN(2+3,13-3,T)                  R=+23873  p = 0           FAIL !!!!!!!!
  BCFN(2+4,13-4,T)                  R=+30456  p = 0           FAIL !!!!!!!!
  BCFN(2+5,13-5,T)                  R=+37691  p = 0           FAIL !!!!!!!!
  BCFN(2+6,13-5,T)                  R=+36654  p = 0           FAIL !!!!!!!!
  BCFN(2+7,13-6,T)                  R=+43148  p = 0           FAIL !!!!!!!!
  BCFN(2+8,13-6,T)                  R=+39020  p = 0           FAIL !!!!!!!!
  BCFN(2+9,13-7,T)                  R=+39135  p = 0           FAIL !!!!!!!!
  BCFN(2+10,13-8,T)                 R=+33354  p =  8e-8466    FAIL !!!!!!!!
  BCFN(2+11,13-8,T)                 R=+19688  p =  1e-4997    FAIL !!!!!!!!
  BCFN(2+12,13-9,T)                 R=+11650  p =  7e-2619    FAIL !!!!!!!!
  BCFN(2+13,13-9,T)                 R= +5830  p =  3e-1311    FAIL !!!!!!!!
  DC6-9x1Bytes-1                    R= +2564  p =  1e-1574    FAIL !!!!!!!!
  Gap-16:A                          R=+362.0  p =  3.5e-308   FAIL !!!!!!
  Gap-16:B                          R=+576.2  p =  2.4e-461   FAIL !!!!!!!
  FPF-14+6/16:(2,14-0)              R=+663.1  p =  2.7e-613   FAIL !!!!!!!
...
Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 14:27

2024-01-24

How correlated is pcg-dxsm output?

Correlations based on the lower half of the LCG

Lets try picking some correlated combinations with (using my very simple pcg-dxsm implementation)

Multiplier 15750249268501108917 15750249268501108917 15750249268501108917
Increment 123456789 15750249268624565705 31500498537125674621
Seed 31415 31414 31413
Output 1 0x563B0DB76F45EEC2 0xDEE2CC4990CDBE1E 0xDEE2CC4990CDBE1E
Output 2 0x064C612B5E0A4853 0x064C612B5E0A4853 0x704228ECB09FC1D5
Output 3 0x4727428C93C1A285 0x5A7E3EA39E71C29F 0x5A7E3EA39E71C29F
Output 4 0x2745482D3BDBC1C0 0x2745482D3BDBC1C0 0x15D15980AE293340
Output 5 0x6055DF5FCE178AB2 0xC5DC9DC897A78F1E 0xC5DC9DC897A78F1E
Output 6 0x05DDF19C395C342F 0x05DDF19C395C342F 0x51705B299006C1C1
Output 7 0xD7AFDA4C1654626B 0x89A204161A53CA99 0x89A204161A53CA99
Output 8 0x2C1FCD6649DB8016 0x2C1FCD6649DB8016 0x338710EA7C027382
Output 9 0xEDE28080A228A6DC 0xED272B75717128A4 0xED272B75717128A4
Output 10 0xB05EFD2651E59BC7 0xB05EFD2651E59BC7 0x20911E0B5D6AE551
Output 11 0xD6045D380A566F2D 0xE6584BDE2C470387 0xE6584BDE2C470387
Output 12 0x1BF91549D870370F 0x1BF91549D870370F 0xF434399E25ED24B5
Output 13 0x1D9EB55CB071B60A 0x1DA3A160F1C93F66 0x1DA3A160F1C93F66
Output 14 0x892A16096BF94286 0x892A16096BF94286 0x8F5678078367E87A
Output 15 0xA7292EEE7AF4777F 0x281913D2D3F2DFC5 0x281913D2D3F2DFC5
Output 16 0x08AB29210A0B6FF1 0x08AB29210A0B6FF1 0xC0DD5B88BF6885A3
Output 17 0x20E581A9EA8BF155 0xB17BBB3957418DBB 0xB17BBB3957418DBB
Output 18 0x768E7BD7B5223C36 0x768E7BD7B5223C36 0x086B7F8F81A5CEDA
Output 19 0xF67313E1E5454A96 0xD3026BC096FEDDE2 0xD3026BC096FEDDE2
Output 20 0xAB091F6C4C169476 0xAB091F6C4C169476 0xF29E461BBD7C6952
Output 21 0xB18D1BC34F78E4D1 0x205033BF4C77D237 0x205033BF4C77D237
Output 22 0x5BBCD31929B10ED3 0x5BBCD31929B10ED3 0xFDCB8D05F8C862FD
Output 23 0xC64AFD5DFC7654A7 0x54C875057AAA9DCD 0x54C875057AAA9DCD
Output 24 0x34866A3DF2054E13 0x34866A3DF2054E13 0x4C9D903AB9738D79
Output 25 0x29CA92D156E93DD4 0x1174B427B597C82C 0x1174B427B597C82C
Output 26 0x46B0D6BFFE3BD1C3 0x46B0D6BFFE3BD1C3 0x18A4BCCC15242755
Output 27 0x74DF62C8A26F65CD 0xF5CAC48CFC7B4D47 0xF5CAC48CFC7B4D47
Output 28 0xE737155A33FC916F 0xE737155A33FC916F 0x2A8E83BF16B3C6F5
Output 29 0x0E0E87DB2724235B 0x82EA039F9945EE0D 0x82EA039F9945EE0D
Output 30 0x2367AB4D5FA3ABBE 0x2367AB4D5FA3ABBE 0x274FC0D71A60FF82

Besides the exactly equal ones, the more careful reader will have noticed that each line either has even or odd entries, so the 3 seed/increment pairs generate equal results in the lowest bit.

You might ask why does this all happen? Well, the identical numbers happen because the lowest bit is ored away in DXSM, that is forced to be 1. So any cases differing only by that bit at the LCG stage become identical in later stages. The other similarities happen because the mixing isnt as good as it needs to be to remove all differences from very similar input and LCGs can produce such very similar input when one picks the “wrong” parameters.

Did i hand pick the worst ? Is that the only case ?
lets see

Multiplier 15750249268501108917 15750249268501108917
Increment 123456789 1134925067785824341058664979018534165
Seed 31415 340282366920938463463302549837730314935
0x56 3B0DB76F45EEC2 0x04 3B0DB76F45EEC2
0x06 4C612B5E0A4853 0xC7 4C612B5E0A4853
0x47 27428C93C1A285 0x54 27428C93C1A285
0x27 45482D3BDBC1C0 0xE7 45482D3BDBC1C0
0x60 55DF5FCE178AB2 0x96 55DF5FCE178AB2
0x05 DDF19C395C342F 0xCE DDF19C395C342F
0xD7 AFDA4C1654626B 0xEE AFDA4C1654626B
0x2C 1FCD6649DB8016 0xE2 1FCD6649DB8016
0xED E28080A228A6DC 0xD1 E28080A228A6DC
0xB0 5EFD2651E59BC7 0x75 5EFD2651E59BC7
0xD6 045D380A566F2D 0x03 045D380A566F2D
0x1B F91549D870370F 0xEE F91549D870370F
0x1D 9EB55CB071B60A 0xCB 9EB55CB071B60A
0x89 2A16096BF94286 0x83 2A16096BF94286
0xA7 292EEE7AF4777F 0xCA 292EEE7AF4777F
0x08 AB29210A0B6FF1 0xE1 AB29210A0B6FF1
0x20 E581A9EA8BF155 0x53 E581A9EA8BF155
0x76 8E7BD7B5223C36 0xC8 8E7BD7B5223C36
0xF6 7313E1E5454A96 0x9C 7313E1E5454A96
0xAB 091F6C4C169476 0x19 091F6C4C169476
0xB1 8D1BC34F78E4D1 0x64 8D1BC34F78E4D1
0x5B BCD31929B10ED3 0x70 BCD31929B10ED3
0xC6 4AFD5DFC7654A7 0x59 4AFD5DFC7654A7
0x34 866A3DF2054E13 0xE7 866A3DF2054E13
0x29 CA92D156E93DD4 0x55 CA92D156E93DD4
0x46 B0D6BFFE3BD1C3 0x0F B0D6BFFE3BD1C3
0x74 DF62C8A26F65CD 0x31 DF62C8A26F65CD
0xE7 37155A33FC916F 0xAA 37155A33FC916F
0x0E 0E87DB2724235B 0x67 0E87DB2724235B
0x23 67AB4D5FA3ABBE 0x05 67AB4D5FA3ABBE

Here 7 out of 8 bytes match apparently for each. Is that the worst ? no we could make part of the first byte match too, it just becomes harder to color the table to match that.

Correlations based on the upper half of the LCG

Update 2024-03-02, After thynson noticed that the above 2 classes of correlations, all have multiplier-1 as a coresidual of the increment. Ive looked more and found more classes of correlations.
The above 2 cases produce correlations because they introduce 2 specific classes of differences in the lower half of the underlying LCG, we can also introduce other specific differences in the upper half to create correlations. I first thought the mixing stage would make this difficult, but it doesnt seem that difficult.

Multiplier 15750249268501108917 15750249268501108917 15750249268501108917 15750249268501108917
Increment 123456789 188175727042032768542098945617541451029 94087863521016384271049472808832453909 94087863521016384271049472808832453909
Seed 31415 170141183500083312988819472512656112311 85070591750041656494409736256328071863 255211775210510888226097039972212177591
Output1 0x563B0DB76F45E EC2 0x4EAC9CA5AB826 EC2 0x5273D52E8D642 EC2 0x8BA757339BF3B 0C4
Output2 0x064C612B5E0A4 853 0x507FC7CA7B00C 853 0x7CF75FC4DBFEB B3C 0x6132ADDBCF8F0 853
Output3 0x4727428C93C1A 285 0xE8681791D88ED 60B 0x821B65AAAE9D3 074 0x56EBE12BED519 60B
Output4 0x2745482D3BDBC 1C0 0x7D02EE9F8B954 1C0 0x7C6674F413FF0 1C0 0x064DB7AF79614 8E3
Output5 0x6055DF5FCE178 AB2 0x5284D85445650 AB2 0xCB9C54CE810BC AB2 0x89BF5C086BDE9 447
Output6 0x05DDF19C395C3 42F 0xDA6CAC69AB735 98E 0x979601E7DBD3B 9D1 0x41C4B15A978E7 42F
Output7 0xD7AFDA4C16546 26B 0xD22DEF5B1E600 21E 0x81EE59290B934 21E 0xED6327A2A5747 2A1
Output8 0x2C1FCD6649DB8 016 0x3CD4A9AF3862F 39A 0x12F825A58C46C 679 0x798CE3702E873 39A
Output9 0xEDE28080A228A 6DC 0x8443929D63A67 C1C 0x02CE2330E2663 F1F 0xE422EEFF6C846 6DC
Output10 0xB05EFD2651E59 BC7 0x5BB46EBB794BC 378 0x9EC419F1B9850 378 0x3DD8F2BA415FE 2CE
Output11 0xD6045D380A566 F2D 0x909115D376D6E F2D 0xB34AB985C096A F2D 0xA49969ED9AE37 EBF
Output12 0x1BF91549D8703 70F 0x8B81A72468CE6 2BB 0x05043537DB072 5E2 0x20046693FD23A 2BB
Output13 0x1D9EB55CB071B 60A 0x264C5A17BA703 60A 0x458D7586D42BD 7B7 0x21F587BA3570F 60A
Output14 0x892A16096BF94 286 0x97EA766824D8C 286 0x908A4638C8690 286 0x8B8230900DC31 894
Output15 0xA7292EEE7AF47 77F 0x45B865A22102D 671 0xED910677E00EA 244 0xB9F1AF9C302C3 77F
Output16 0x08AB29210A0B6 FF1 0x66F131DB875BA 123 0xF5F9A395C5C1A FF1 0x035C3B9F6264E A8C
Output17 0x20E581A9EA8BF 155 0x2B7CBD378A1FB 44A 0x835FF11C3FF63 155 0xD6C0C0A58FD86 96A
Output18 0x768E7BD7B5223 C36 0x8282FAEF4390B C36 0x7C88BB637C597 C36 0x4E30CF973257E 263
Output19 0xF67313E1E5454 A96 0xBCADB3276FBCA D92 0x6A4F6314C50F9 11D 0x0DCC176A07468 A96
Output20 0xAB091F6C4C169 476 0x7C6C880401881 476 0x651E3C4FDC40D 476 0x87103D0FEAEBB 5A9
Output21 0xB18D1BC34F78E 4D1 0x182EF712B7CC9 6ED 0x632712BFD8439 536 0x7D1586E9295E2 4D1
Output22 0x5BBCD31929B10 ED3 0x177CCDEF531D3 167 0xC86CF8BD98EEF 167 0x6A7959BA3EC3B 1E5
Output23 0xC64AFD5DFC765 4A7 0xDBFE7B848484D 4A7 0x5124BC71407D9 4A7 0x7317F810CA105 929
Output24 0x34866A3DF2054 E13 0x2B7F37E87E70E BA7 0x87E4C62AA01E0 E13 0xA9077B94F6670 644
Output25 0x29CA92D156E93 DD4 0x3AE2C3E7ABF63 199 0xBCF4EEF02BA57 199 0x190804CAE2166 D48
Output26 0x46B0D6BFFE3BD 1C3 0x714BB6B1E3FC4 96F 0x5E459BFFC60D8 F95 0xB9A7262288870 96F
Output27 0x74DF62C8A26F6 5CD 0x25B37CDE46C28 A09 0x4111E2836E4B0 090 0x01289C6FC845C A09
Output28 0xE737155A33FC9 16F 0x8CB957BC0E0A1 16F 0xAEB26C1EB1942 EC4 0x1475F42946F5D 16F
Output29 0x0E0E87DB27242 35B 0xF94895647AA13 EF5 0xCA2ACE10F845E 35B 0x5234DF04DB0C2 7DE
Output30 0x2367AB4D5FA3A BBE 0x56F8654413BC2 BBE 0xE69CD0B7EE3B4 D08 0x899F4E5205976 BBE

Above attacks the first stage of the mixer, we can also attack the 2nd stage, it looks less impressive but none the less instead of 1 identical byte in 256 we see alot more

Multiplier 15750249268501108917 15750249268501108917 15750249268501108917
Increment 123456789 271162317658692578196538916254707862805 152861171144907760414978380921711611157
Seed 31415 39621334812049856571820243639 9905333703012464142955084471
Output1 0x563B0DB76F45EE C2 0x0083130D590C82 C2 0x56399A48F66C29 C2
Output2 0x064C612B5E0A48 53 0xAFFAF7C625B7A0 55 0x426A227A69AC8B 55
Output3 0x4727428C93C1A2 85 0x3BFE265FC469DD 34 0xDF23B38901C67B 5C
Output4 0x2745482D3BDBC1 C0 0x28B8CF040CE59E 23 0xF2C12F43D3C5BC 98
Output5 0x6055DF5FCE178A B2 0xEB4BAAA8C260D7 AD 0x52224A76A9AB37 DF
Output6 0x05DDF19C395C34 2F 0xD65E464613ADA6 A2 0xCF042916ADA3DB F9
Output7 0xD7AFDA4C165462 6B 0x9FE6DE769FFF8E 1E 0xB816350D4D04B3 1E
Output8 0x2C1FCD6649DB80 16 0x754DC19986C65F 9A 0xCFC556D81AB400 D2
Output9 0xEDE28080A228A6 DC 0xD249BACC4FF71F 15 0xC03E7BF3E7423B A8
Output10 0xB05EFD2651E59B C7 0x794EA34DECBCC4 B4 0x9448CB72344E34 D5
Output11 0xD6045D380A566F 2D 0x2BB1BB3BA4DDCF 4A 0xC2956A532B866C AB
Output12 0x1BF91549D87037 0F 0x8483638585B29A C1 0xE085893BCA2906 D5
Output13 0x1D9EB55CB071B6 0A 0xB54C78D8826582 0A 0x11B567A953D0E2 2F
Output14 0x892A16096BF942 86 0x073360B2311E2E 86 0x0DD1F9F1A63F27 80
Output15 0xA7292EEE7AF477 7F 0x47C425E350061C EF 0x06263999BE6442 AE
Output16 0x08AB29210A0B6F F1 0xB6894B4E36C7FD 52 0xD38C235183235A F9
Output17 0x20E581A9EA8BF1 55 0x08854CD5C6D98B F9 0x0CD106F83B232A 55
Output18 0x768E7BD7B5223C 36 0xA1DDA8C4BE25F5 25 0xD0B32BB399C62F 36
Output19 0xF67313E1E5454A 96 0xCEFA2E1F1711E0 59 0x31AF5438CE9EA2 5F
Output20 0xAB091F6C4C1694 76 0x67110439B8E660 76 0x5FD39652AC17B6 A9
Output21 0xB18D1BC34F78E4 D1 0xBB66DD962205EA F1 0x4BE1680D18C928 B1
Output22 0x5BBCD31929B10E D3 0xFC0D37393C8893 EA 0xEB5BAA64A129E1 D3
Output23 0xC64AFD5DFC7654 A7 0x87EC987780AF59 FA 0xB31BA2E5123C18 CF
Output24 0x34866A3DF2054E 13 0xDB102ADD9EA7D7 0A 0x8885FA71C030D2 E7
Output25 0x29CA92D156E93D D4 0xDCDD9AA636484C 7A 0xDD95314A5E5CA0 8D
Output26 0x46B0D6BFFE3BD1 C3 0x1669656599FB38 9E 0xF0C25ACE4573A6 35
Output27 0x74DF62C8A26F65 CD 0x72DFFFE5F48C19 5B 0xC494B0FF934BE3 46
Output28 0xE737155A33FC91 6F 0xBE560D45FF9E8B C9 0x2D8821EB759B1F 50
Output29 0x0E0E87DB272423 5B 0xCB29008A08DC1A DC 0xD712484F90D6E4 72
Output30 0x2367AB4D5FA3AB BE 0x58E6BE59AE99D8 4B 0x7C289593C90335 F1

We can also use this to generate more than 8bit matches, but they are spread out more so it doesnt show up nicely in the first 30 output values.
For example ./pcg-dxsm-64 15750249268501108917 164824157741578258584156320684186520853 3046549733826659956002159287 1000000 16 and ./pcg-dxsm-64 15750249268501108917 123456789 31415 1000000 16 produce a million values, we expect the low 20 bits to match in one case in a million on average but these have over a thousand such matches

Update 2024-03-03: Verifying that our PCG64-DXSM implementation is correct

./pcg-dxsm-64  15750249268501108917 123456789 31415 10 10
 6213575192384761538
  453844501346797651
 5127139872326394501
 2829747299987210688
 6941700003234024114
  422759593322230831
15541880859405869675
 3179485701156339734
17141404421690271452
12708873539510705095

python3
from randomgen import PCG64
g = PCG64(2**128+31415-123456789,123456788/2,variant="dxsm",mode="legacy"); 
for i in range(10): print(g.random_raw())
6213575192384761538
453844501346797651
5127139872326394501
2829747299987210688
6941700003234024114
422759593322230831
15541880859405869675
3179485701156339734
17141404421690271452
12708873539510705095

Update 2024-03-04: Add correlations with equal seeds

Correlations with equal seeds

For the case of equal seeds it must be first said that different pcg64-dxsm implementations mangle the seed in different ways. Thus whats equal in one is not equal in another. We show examples of equal seeds at the LCG level first, this is how the previous examples defined the seed.

Multiplier 15750249268501108917 15750249268501108917
Increment 340282366920937968669293837469431615317 340282366920937968685044086737932724233
Seed 31415 31415
Output1 0x0000000000000000 0x0000000000000000
Output2 0x0000000000000000 0x5542869E099E356E
Output3 0xCA1A90080FC41D1A 0x7CC2BF122332851C
Output4 0x7CC2BF122332851C 0xCB8D59E2E4C1218A
Output5 0x443C52CD5384C766 0x4E16E6CE9CA5478E
Output6 0x4E16E6CE9CA5478E 0x79339B1DB09444E9
Output7 0x6F763957F4353DDB 0x10EE4E400E4F428E
Output8 0x10EE4E400E4F428E 0x5F9CFCB33A3A60EF
Output9 0x4418DDB3F1FE88B1 0x2B907893DCD59873
Output10 0x2B907893DCD59873 0xC984F2B06253372B
Output11 0xA38FA95F35AB6739 0x3E6AE114E83A3619
Output12 0x3E6AE114E83A3619 0x52946D7362EEAA86
Output13 0x0E4B57A85E8670EA 0x04769CEADA5CC799
Output14 0x04769CEADA5CC799 0x2E71EE38F3683F6C
Output15 0xED4F32C1AF264384 0x54C64408D6FABE08
Output16 0x54C64408D6FABE08 0x6B65E69BB07E56F6
Output17 0xDAD0B79C5C6EFC2A 0xAB06E3184A632E04
Output18 0xAB06E3184A632E04 0x88E36EABEA546B63
Output19 0xE563E5FD7697D091 0x2709664DC434EDE3
Output20 0x2709664DC434EDE3 0x6277238F88A1D8D6
Output21 0x74B88A9EDBCF94FA 0xF4B4B8A667253240
Output22 0xF4B4B8A667253240 0x2B0DF01AF7F68886
Output23 0x23906462E660A212 0x9591B3C5A81F1E01
Output24 0x9591B3C5A81F1E01 0x5A67803048FA88EC
Output25 0x8C3FDB1C4F568714 0xEA48F0F9B2EE760B
Output26 0xEA48F0F9B2EE760B 0xA5785DE15372C0A0
Output27 0x6FB95CE223AFFEE0 0xE7F4EEB0FD473A91
Output28 0xE7F4EEB0FD473A91 0x83886CE76BF8FCDA
Output29 0xAE3D319DFBE64AF6 0xEBA429DA1AF5EE38
Output30 0xEBA429DA1AF5EE38 0x2E10BCE8F08A03B7

And another example:

Multiplier 15750249268501108917 15750249268501108917
Increment 18158801084572694679761040473092597769 340282366920937968670446758974038462293
Seed 31415 31415
Output1 0x2 8C57634CF3ADEA3 0x0 000000000000000
Output2 0x4 5E72764CBAC7D8A 0x1 8C57634CF3ADEA3
Output3 0x0 9F5D8F29B240248 0x0 9804DEEE6EB9D0E
Output4 0x4 8BD93FA680EE36C 0x8 9F5D8F29B240248
Output5 0x8 F2CF0B4671A5D87 0x4 2B52509EB19E6B4
Output6 0xE 57EE9B1AB50A947 0x7 F2CF0B4671A5D87
Output7 0x6 E92305B0D002F61 0xE 02E81E3F013B2B5
Output8 0x2 D513919164C9F11 0x5 E92305B0D002F61
Output9 0x2 FCAB132E105A9F2 0x1 C902225001A774F
Output10 0x6 E8A48BD7E37EE0C 0xC FCAB132E105A9F2
Output11 0x9 A9240D1A1572F73 0x9 64DE7D8E859E804
Output12 0x6 EF0158EC1476E59 0x2 A9240D1A1572F73
Output13 0x9 34A4391489DB1D4 0x4 B66D2865E881D4F
Output14 0xB 634A42C2394888A 0xD 34A4391489DB1D4
Output15 0x8 E03CA0BB984E16B 0x5 5CC7B5271F792FE
Output16 0x5 5F6ACB2C66FDED7 0x5 E03CA0BB984E16B
Output17 0x5 CBEC3EC2C5C03DE 0x3 BF46F487533BDF9
Output18 0x7 C46F2BD428796EA 0xB CBEC3EC2C5C03DE
Output19 0x0 39A4A7C2A084281 0xA 6A3E2D52E24926E
Output20 0x8 1A17AD6D8142476 0xB 39A4A7C2A084281
Output21 0x1 9223A7740BACC82 0x3 5A8AF9350D8645A
Output22 0x8 8DCB82922B55F8E 0x3 9223A7740BACC82
Output23 0x2 FB541FDDC8E5A86 0xE DB016F19FE29D2A
Output24 0xC 61E8DBF6577A704 0xC FB541FDDC8E5A86
Output25 0x8 2733F8F3CD82976 0x8 05A2C8C4F2E88FC
Output26 0x5 5A5329E50934786 0x6 2733F8F3CD82976
Output27 0xA 342EF7724025501 0x2 EBD1F90D6ADC342
Output28 0x1 A8635E212C32A2D 0xD 342EF7724025501
Output29 0x3 F87E671A19911B2 0x1 2B4DEB4213B90C8
Output30 0x7 C8BAC3F018FD61D 0x5 F87E671A19911B2

Now python randomgen + numpy

from randomgen import PCG64
g = PCG64(31415,(310562926860152171542224643668532814649>>1),variant="dxsm",mode="legacy"); g2 = PCG64(31415,(310562926860152171526474394400031705733>>1),variant="dxsm",mode="legacy");
for i in range(30): print("0x" + format(g.random_raw(), "016X") + " 0x" + format(g2.random_raw(), "016X"))
Multiplier 15750249268501108917 15750249268501108917
Increment 310562926860152171526474394400031705733>>1 310562926860152171542224643668532814649>>1
Seed 31415 31415
Output1 0xEE846300E83A27DC 0x7256D1D22864FB31
Output2 0x62B36CDF71DD1743 0x1C2239BDA2F793C6
Output3 0x1C2239BDA2F793C6 0xA25FB4606E9F9955
Output4 0x2C21FF9FA58D879B 0xA90BC860E5BC5E65
Output5 0xA90BC860E5BC5E65 0xD9A217425A7117AF
Output6 0x9B802E5C51C811D5 0x31A30A053121F011
Output7 0x31A30A053121F011 0xDF91893796BE498C
Output8 0xB6F6FCF3CC23CF14 0x27FFF0C3DFDAE842
Output9 0x27FFF0C3DFDAE842 0x5653E5F3D78910B6
Output10 0x043FFA9868AB2B22 0xC77C03211C0201CE
Output11 0xC77C03211C0201CE 0x1A2DC56C688DBDE8
Output12 0x1F3C5871B8AA1618 0x0C12CBDC34DFABB0
Output13 0x0C12CBDC34DFABB0 0xDFD2D074C5127C15
Output14 0xC7FEC3AED543EDE7 0x127120DC1B75A20E
Output15 0x127120DC1B75A20E 0xAA580CD3C44B571D
Output16 0x800C8F5DBF2D5C4B 0xD0D231D10F85BB11
Output17 0xD0D231D10F85BB11 0xB433A15370448948
Output18 0xB678DA426A582A58 0x4E5DC360D27E4857
Output19 0x4E5DC360D27E4857 0x52B4BF17EE5FB665
Output20 0x91A4127B348A8BAB 0xAD7A55AF7A1E62B2
Output21 0xAD7A55AF7A1E62B2 0xAAE73D4A877AEB9E
Output22 0xE6F6BA90233BFF2A 0x537BA3170D10A98F
Output23 0x537BA3170D10A98F 0x71778DBD0A1F1858
Output24 0x22D07B79FDA625E8 0xE03DFB31EDF18C5A
Output25 0xE03DFB31EDF18C5A 0x610809577FF8326B
Output26 0x82BA957BB887F8A1 0x5C430AB7C74B61D9
Output27 0x5C430AB7C74B61D9 0x2C63932EC021D830
Output28 0x0FD2B2B66AF18DD0 0x779C96BF20FAAC65
Output29 0x779C96BF20FAAC65 0x93C6F9F9FA2026BD
Output30 0xFCB0833058A7183F 0x5397DA646711A886
g = PCG64(31415,(328721727944725361000316204835460899641>>1),variant="dxsm",mode="legacy"); g2 = PCG64(31415,(310562926860152171527627315904638552709>>1),variant="dxsm",mode="legacy");
for i in range(30): print("0x" + format(g.random_raw(), "016X") + " 0x" + format(g2.random_raw(), "016X"))
Multiplier 15750249268501108917 15750249268501108917
Increment 328721727944725361000316204835460899641>>1 310562926860152171527627315904638552709>>1
Seed 31415 31415
Output1 0x0 70E695F12FAC872 0x3 847B59A89C311E2
Output2 0x6 C673BB08B64385F 0xB 565D0C976C04FB6
Output3 0x8 D353EED117D7BC2 0xF C673BB08B64385F
Output4 0x9 EA14B4ED2206371 0x0 06B1E786214329E
Output5 0xB FE0FB0D66377C3C 0xE EA14B4ED2206371
Output6 0x8 F48FEB9E21CF90C 0xB C9512DDD19B53D4
Output7 0x9 FE6F7ED18A90B3E 0xC F48FEB9E21CF90C
Output8 0x8 554B31E0632BF9F 0x2 82E54E6060CFF52
Output9 0xB D2444268D86C2E0 0xF 554B31E0632BF9F
Output10 0x6 A77908E23FC25DB 0xE BE330DEB6411006
Output11 0x7 F5401944A241692 0xB A77908E23FC25DB
Output12 0x3 78E7CFCA99985EC 0x6 1E2C802A1208A6E
Output13 0xE B3C49C3FC1B80BD 0xF 78E7CFCA99985EC
Output14 0x0 CE124E21F2F325C 0xA F5923BC63DFE91F
Output15 0xF 2DAAFA5A3C0D13C 0x4 CE124E21F2F325C
Output16 0x1 4B999F605D4C93B 0xD 00E469E490FE6A4
Output17 0x3 4CAE614E3FBF077 0xC 4B999F605D4C93B
Output18 0x3 BA4BA78F570240F 0xE 5AFD46A52CC2195
Output19 0xD 935DBA5BB7508AA 0x6 84B89E8D66E1609
Output20 0xE B3D1F1A9262C709 0x7 8C2397180142176
Output21 0xB 7AEF7982643BADC 0xB B3D1F1A9262C709
Output22 0x1 C258C8417669CD1 0x2 542033E81E98834
Output23 0x4 81A95DC0F79E1A6 0xC C258C8417669CD1
Output24 0x1 227C217600F6ECC 0xF 3E4B6B862CC276A
Output25 0x6 F0860C9B8E5E021 0xD 227C217600F6ECC
Output26 0xC 72BDAFE8A09D355 0x5 3D268E5FE4D6083
Output27 0x2 69759DA86213109 0x7 72BDAFE8A09D355
Output28 0x8 46C4CD6EC69A47D 0xB D9032549CBE8217
Output29 0xE 3FC3F0A98C99CF1 0x1 46C4CD6EC69A47D
Output30 0xC 7C7EC7A33D0292F 0x3 4D9727EA1FBECFB

An example of the upper part of the LCG in PCG64-DXSM being modified with identical seed in numpy/randomgen

g = PCG64(31415,(292528383318202715988945170563770901305>>1),variant="dxsm",mode="legacy"); g2 = PCG64(31415,(140421743439297021051919259480919575173>>1),variant="dxsm",mode="legacy");
for i in range(30): print("0x" + format(g.random_raw(), "016X") + " 0x" + format(g2.random_raw(), "016X"))
Multiplier 15750249268501108917 15750249268501108917
Increment 292528383318202715988945170563770901305>>1 140421743439297021051919259480919575173>>1
Seed 31415 31415
Output1 0x97F9738805252 FBF 0x3A48F167D2DF3 6B7
Output2 0x964B11CABAA4E 501 0x88DECF43C2B52 E8D
Output3 0xB6AC157505409 594 0x73D73CEDB2296 501
Output4 0xCEC2A042E362F 2A4 0xE36B6340C1CDC FC3
Output5 0xD0A2096F73ED9 41A 0x28A909ED3D7C7 2A4
Output6 0x7127C22885CC5 4A3 0x2F3751079E58B CB4
Output7 0x90A13F2459C9B 6CE 0x5EB33E911942C F94
Output8 0x496FE1695C009 C55 0xE05801B1AA7F2 AF6
Output9 0x80126568D4AA5 FDF 0x8204715BB013E 169
Output10 0x85BC9BD98E5DE FA4 0x53456ECD3C705 21D
Output11 0xA7380BD92D338 178 0x901391E7BC83E 69A
Output12 0x3E324902274F1 0C2 0xC1698DE771445 A88
Output13 0xA5584406CB702 00A 0x54EC4E180FAC9 0C2
Output14 0x736CBD359BEE5 8B5 0xF43324BCA3F47 26E
Output15 0x7F72C7D6A6C6B 778 0x48C873AED7047 0E2
Output16 0x7CAD149EE868F 197 0x6F3A19369FE36 048
Output17 0x30B0B3F452E1F 475 0x31B0038A96D97 197
Output18 0x36514492F155F FD8 0x2586DFDD55D21 2EF
Output19 0x0CE19FA5F24DA C42 0x88058C6DBD897 FD8
Output20 0xE5C9DECAE4A12 C0D 0x5040EC24F3386 95E
Output21 0xE43F8183F92C1 450 0x064D651C1466D 7C3
Output22 0x4E84BB152C42A D6C 0x36D952B70BFA0 870
Output23 0xA4CD58BA22302 BB5 0x3C74E5804A742 D6C
Output24 0x360F2AB984C9C 065 0x33E5305BEC81D 936
Output25 0x50B420A99712E 267 0x0E3141A4260BF B79
Output26 0xD5F0C6071AE99 AEB 0xA9AEDA617F487 C03
Output27 0x8354DDF7179B0 638 0x4027BB1478F1E C11
Output28 0xCE21CF50C0F0E D3F 0xAA188AB00050F 05E
Output29 0xD08A6974E1276 809 0x21370D1D12353 313
Output30 0x2D33EC6185398 A0C 0x5F439A9086378 B2C
Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 18:08

Breaking pcg-dxsm (no implementation)

All PCG implementations seem to allow more or less easy computation of the seed, but a new PCG implementation has appeared in 2019 or so.
PCG-DXSM, the source code tells us “This is a new, more powerful output permutation” and “This latter aspect means that the scrambling applied to the high bits depends on the low bits, and makes it (to my eye) impractical to back out the permutation without having the low-order bits.”
Ok. lets try
Heres for reference, what this is about:
multiplier is a odd 64bit constant, SHR1 is half the state bits, SHR2 is quarter the state bits and SHR3 is 3 eights of the state so 64, 32, 48 for the 128/64 bit variant and 32, 16, 24 for the 64/32bit one. inc is forced to be odd too.

    state = state * multiplier + inc;

    l1 = state | 1;
    h1 = state >> SHR1;
    h2 = h1 ^ (h1 >> SHR2);
    h3 = h2 * multiplier;
    h4 = h3 ^ (h3 >> SHR3);
    return h4 * l1;

l1 is forced to be odd as we can see, so what do we know if pcg_dxsm returns 0 ? When is X * (2Y+1) = 0 ? Yes when X is 0 so h4 has to be 0 (yes i jump over some details of rings) and what can h3 be ? only 0, and h2 ? again only 0 and h1 ? yes its 0.
So when pcg_dxsm() return 0 you just learned half its internal state.
This can be extended a bit, for example when it returns a odd number h4 must be odd, when it returns a number that contains a factor of 2x then so must h4.

Can we compute the seed and increment from this ? I did not try, but i think so.
Just consider this: we decide how many bits we want to look for in the data and we brute force the rest. (there quite possibly could be an easier way, thats just the obvious one)
if we want 16bits to be from the data we would look for about 100k output data values, look for the values that are multiplies of 65536, these points would have h4 a multiple of 65536 and thus the low 17bits of h4 would be known. If we now brutefoce the 32-17 that is 15 low bits of the state and increment, these together fully determine, first all the low 15bits of all states and together with that the cases where we know the low 17 bits of h4 we now can invert the multiply and find the rest of h4. For each known h4 its trivial to find the upper half of the state. And the rest is identical to finding the state and increment of a truncated LCG and after the upper bits are found the lower can be trivially found from the h4*l1.
As i have not implemented this it is very possible that i have missed something or made some mistake.

Btw i think PCG and other PRNGs are really great little math exercises. They are also good PRNGs as long as one understands their limitations.

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 02:44

2024-01-16

simple and good PRNGs

As ive looked at many PRNGs in the last few days, heres a quick list/note about which seem good.
To be on this list they must be

  • Fast
  • Fail 0 general statistical tests
  • 64bit output
  • minimum period of at least ~264

I tried a bit to create a list of diverse types of generators. If i hear about any of these generators failing some tests, i will remove them (unless i forget), i also may add more in case i see others in the future. The list is in alphabetical order. None of the generators is secure, and i expect that the state can be found from the output with little effort for every fast PRNG. Some of these generators allow efficient seeking. For others I do not know how to efficiently seek.
Also see https://pracrand.sourceforge.net/RNG_engines.txt, https://prng.di.unimi.it/,

Name state period seekable Class
MWC256 256bit ~2255 yes multiplicative congruential generator with prime modulus
SFC64 256bit ≥264 ? chaotic RNG + counter
splitmix64 64bit 264 yes counter + mixer
xoroshiro128** 128bit 2128-1 yes LFSR + mixer
XSM64 256bit 2128 yes LCG + mixer

This omits basic LCGs, LFGs, LFSRs as well as simple variants of these because they fail tests. PRNGs which are similar to the listed ones but worse in one way and equivalent in all others are omitted too. Cryptographic PRNGs are omitted because they are slow.

SFC64 and XSM64 are testet here The others in the table ive tested with practrand to 32TB:
mwc256_test,
splitmix64_test,
xoroshiro128starstar_test
also several of them have been tested here,

Filed under: Pseudo random number generators — Michael @ 22:09

2024-01-12

SFC64

As i was looking for the best PRNG ;) i found SFC64, its fast, seems to pass all tests, and it was fun to find a way to compute the seed from a few output values.
But the more one looks the more one finds, there are many other great PRNGs. I would stay away from hyped up generators, like the mersene twister was in the distant past.

Now about SFC64, heres a simple implementation. It can also compute values backward.

And for fun, heres an implementation that will turn 10+ values back into the 3 seeds and counter used to create them.
The way this works is that first we guess the 16-19 MSB of the a or b register. Next we assume that the corresponding MSB of the counter are 0. Because its 64bits, thats reasonable and pretty difficult to be otherwise. From this and a = b ^ (b >> 11) and tmp = a +b + counter++; we can estimate teh MSBs of the future and past a and b registers (they slowly become less exact due to the counter but thats insignificant over the distance we need them)
For the 8 rotation states we then invert b = c + (c << 3); for the MSBs we have (which has 9 solutions and iterating over them gives us 3 extra bits knowledge of c). The 8 rotation states are iterated over in 2 forward steps of 3 followed by a backward step of 5. This way we minimize the distance we move away from the start point which limits effects of the unknown LSBs and also limits the needed amount of data. In addition the 3,3,-5 pattern splits the previously determined part of the c register so that the lower part of the MSB becomes shifted into the MSB. This allows us to check the lower part of our data from the c register together with the MSB of the b register at that point and the known output data. This is crucial to limit the otherwise exponential increase of computations with each part of c that is iterated over. Towards the end of the c register, we then compute the a,b and counter from it and the known output and if it matches, backstep to counter =1 to output the seed at the start. It would also be possible to use a heuristic in guessing the initial a/b values, so as to not search the 17-19 bit space but it seems the bad ones get rejected naturally quickly so this lead to little gain and its already quite fast.
I do realize after writing above that this description is probably hard to understand. I should expand this and draw diagramms, but ATM i have no time so please read the source :)

Example: of finding the seeds after 1000 rounds

./sfc64 0x123456785 0x978653 0xCCAADDEE0F 1000 30 16 
0x8A9912C022EA7402
0x26971AE36A1CEACA
0x97B4CF575274B9F4
0xC9212C12EEE58E49
0x9B45979745BB2E20
0x857B5EEB83507D6D
0x243E719FF5356874
0xE6EB4E22DDDEDCA6
0xAF0DC4AF0234A763
0x9946B8BD21C71BE9
0x14D42990A607161B
0x66A19FF0771F930C
0xE3E580CA96D08FD3
0xDAED73C40D44E397
0x90A0E96731283123
0x183075440F78DA33
0xE46ACCC1AECD9C6C
0x1CA82D6053C47933
0xC707C633E34377CA
0xF60FC6DB95DE2631
0x0B2342552B1B4912
0xC32F27E766B148FD
0x055FDF9DC44345CF
0xADFD892D38628512
0x5AA6881F0C6E3ADE
0x72FD99BE0899B07C
0x6F31D2773AB970F6
0xC07850CC6EFD7868
0x5EC7E6C020A407CA
0xF048FCEDEA87BE0A


time ./sfc64-breach  0x8A9912C022EA7402 0x26971AE36A1CEACA 0x97B4CF575274B9F4 0xC9212C12EEE58E49 0x9B45979745BB2E20 0x857B5EEB83507D6D  0x243E719FF5356874 0xE6EB4E22DDDEDCA6 0xAF0DC4AF0234A763 0x9946B8BD21C71BE9 0x14D42990A607161B 0x66A19FF0771F930C 0xE3E580CA96D08FD3 0xDAED73C40D44E397 0x90A0E96731283123  0x183075440F78DA33 0xE46ACCC1AECD9C6C 0x1CA82D6053C47933 0xC707C633E34377CA 0xF60FC6DB95DE2631 0x0B2342552B1B4912 0xC32F27E766B148FD 0x055FDF9DC44345CF 0xADFD892D38628512 0x5AA6881F0C6E3ADE 0x72FD99BE0899B07C 0x6F31D2773AB970F6 0xC07850CC6EFD7868 0x5EC7E6C020A407CA 0xF048FCEDEA87BE0A
attempt 0
Found (a=0x0000000123456785 b=0x0000000000978653 c=0x000000CCAADDEE0F counter=0x0000000000000001 original_counter=1002
step 5121675

real	0m0.083s
user	0m0.071s
sys	0m0.012s

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 02:09

2024-01-08

Finding the increment of PCG/LCG/truncated LCGs and some (for me) surprising consequences about their structure.

For plain LCG the problem is trivial and an implementation shown in my last post. PCG falls in 2 categories, the PCG-XSL-RR which is based on an LCG, this was solved in the Paper by Bouillaguet, Martinez and Sauvage.
The remaining PCG generators are different and simpler, they are based on an truncated LCG. The problem of finding an increment and seed is marginally harder as it is for the corresponding truncated LCG. It can be done using Lattice reduction (even some half bastardized version seems to work). The problem and the surprise for me really is that the solution in general is not unique. I guess it was a surprise after reading in the original PCG paper how additional random streams can be created by changing the increment. This is really not true. Changing the increment does not really produce “different” random data, it just offsets it.

Let me first show an example with a plain LCG (using the multiplier 6364136223846793005 from PCG with 64bit state)

Seed 314159 314160
Increment 1442695040888963407 13525302890751722019
Output0 1758213515780721042 1758213515780721043
Output1 8369501526408240633 8369501526408240634
Output2 14286244372924288276 14286244372924288277
Output3 12908628925188209107 12908628925188209108
Output4 17643264774698309734 17643264774698309735
Output5 10761879962653451581 10761879962653451582
Output6 11615400545074122760 11615400545074122761
Output7 5243018629741398711 5243018629741398712
Output8 1055069346400300154 1055069346400300155
Output9 14628082432468752577 14628082432468752578

This is not a odd exception, no, for every choosen LCG and choosen offset (here it is +1) theres a seed and increment that will produce a offset sequence of random numbers.
You can use lcg-breach to find them by just asking it to find parameters for a modified sequence.
Now thats LCGs, What about truncated LCGs ? its worse because they discard the low bits. You get sequences that can be exactly equal for MANY values even with different increment and seed. And that is what the problem is in determining the increment for truncated LCGs and the corresponding PCGs. For PCG-XSL-RR the low LCG bits are XORed in so there remains a effect on the output and its possible to determine the increment but for 128/64 and 64/32 PCG-XSH-RR the low bits of the state do not affect the output as long as their effect stays limited to these low 27bit (for 64/32 bit) in future states they cannot be computed from the output. Or in other words the output from 2 PCG-XSH-RR with differing increments can be identical for a long time.

On top of that, there is also a set of seed,increment pairs that produce constant output.
consider this:

LCG         : out[i+1] = out[i] * m + a mod n
what we want: C        = C      * m + a mod n

just solve it:

C - C * m = a mod n

choose the constant C you want and this will tell you the increment a that will keep the generator at that constant when its started on it.

and how do we now create a offseted LCG ?
simply add it with the constant above

out[i+1]     =  out[i]      * m + A     mod n
           C =           C  * m +     a mod n
out[i+1] + C = (out[i] + C) * m + A + a mod n

And yes LCGs of the same multipler and modulo seem to form an abelian group with addition.

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 18:06

2024-01-06

Breaking LCGs

Just for fun, heres a program to generate random numbers using a Linear Congruential Generator from a specified multiplier, increment, seed and number of bits.
And a second one that will tell you the multiplier, increment, seed for a given 3 output numbers from above.

Have fun encrypting important messages


./lcg 64 6364136223846793005 1442695040888963407 314159 5
1758213515780721042
8369501526408240633
14286244372924288276
12908628925188209107
17643264774698309734

$ ./lcg-breach 64 1758213515780721042 8369501526408240633 14286244372924288276
Seed: 314159 0x000000000004CB2F
Multiplier: 6364136223846793005 0x5851F42D4C957F2D
Increment: 1442695040888963407 0x14057B7EF767814F

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 01:00

2024-01-04

Breaking the PCG-XSH-RR PRNG

How did this start ? Someone found a bug or 2 in the libavutil/eval.c random(), so i proposed a patch replacing it by Marsaglias 64-bit KISS PRNG. Nicklas Haas then ask me why iam not using PCG yesterday. And i didnt know really what PCG is :) so i looked found the paper found a wegpage that was down the exact day i tried to look at it (yeah its up again now). And a wikipedia page taking about “It has been shown that it is practically possible (with a large computation) to recover the seed of the pseudo-random generator given 512 consecutive output bytes.”. That triggered me a bit ;)

The paper about PCG is 58 Pages so reading that is not an option. But we skim it and look at the pretty pictures, theres one that marks several PCG variants as insecure. Luckily the one listed at wikipedia (PCG-XSH-RR is not marked as insecure) In fact there are 2 PCG-XSH-RR, one with 64bit state and 32bit output and one with 128bit state and 64bit output. So i tried them. Theres also a paper about breaking PCG-XSL-RR which differs by some shifts it seems from XSH. That paper talks about also recovering the increment and that for that it needs “512 bytes of challenge input;in the worst case, the process takes 20 000 CPU hours” ugh, i dont have that. Luckily we know the increment as its not a variable in the implementation and with that, the paper tells us “breaking XSL-RR needs 23 core minutes”. But we took XSH-RR because it was on wikipedia :) and the original paper also recommends it

6.3.1 32-bit Output, 64-bit State: PCG-XSH-RR
Here the design goal is to be a good all-purpose random number generator. The intent
is to balance speed with statistical performance and reasonable security, charting a
middle-of-the-road path. (It’s the generator that I recommend for most users.)”

So we are not totally wasting our time but we can have fun exercising breaking a “not insecure” PRNG with “reasonable security”.

A few hours later, am i lucky ? am i smart ? or is the PCG-XSH-RR a insecure generator, i dont know, I think i maybe picked the easier one but still. These following need 3-4 output values from the PCG-XSH-RR to find and verify its state, from which all future and past output can be computed.
For the 64/32bit variant we compute all tables at runtime, for the 128/64 variant we hardcode the 152 entry table needed as it needs a few minutes to be computed. But the table doesnt depend on the seed nor on the increment, just a new multiplier would need a new table.
The actual execution for both including loading the program, generating the random numbers finding the seed again and printing it all together takes less than 100ms
Code is here


time ./pcg-xsh-rr-64-32-breach $((0x1234578ACDEF11))
INV 0xC097EF87329E28A5 0x1 gcd=0x1
Seed = 0x1234578ACDEF11
PCG output 0 = 0x90DA0D79
PCG output 1 = 0xCF0EAEE6
PCG output 2 = 0x423D36BE
PCG output 3 = 0xAE152565
PCG output 4 = 0x2C48B4D4
PCG output 5 = 0x9EE6CC50
PCG output 6 = 0x903FC56A
PCG output 7 = 0xC85B3AED
PCG output 8 = 0x87B29579
PCG output 9 = 0xA486F671
Table 0 0x00000000000000005851F42D4C957F2D 0x0000000000000001
Table 1 0x000000000000000008F5DC87E5C07D87 0x0000000000000003
Table 2 0x00000000000000000148A921ACEF6819 0x000000000000001D
Table 3 0x00000000000000000006C363D4CB5B28 0x00000000000000C8
Table 4 0x00000000000000000002BCFA0DFD0A8F 0x000000000000262B
Table 5 0x00000000000000000001738A552BC485 0x00000000000071B9
Table 6 0x000000000000000000002A1A9C5A7E7B 0x000000000000BD47
Table 7 0x0000000000000000000007652A02ADCE 0x00000000000635C6
Table 8 0x0000000000000000000002445FB59459 0x000000000024855D
Table 9 0x0000000000000000000001AC54D3A396 0x00000000008BDFAE
Table 10 0x00000000000000000000011449F1B2D3 0x0000000000F339FF
Table 11 0x00000000000000000000007C3F0FC210 0x00000000015A9450
Table 12 0x000000000000000000000060733D935D 0x00000000031C82F1
Table 13 0x000000000000000000000044A76B64AA 0x0000000004DE7192
Table 14 0x000000000000000000000028DB9935F7 0x0000000006A06033
Breaching spin:.........................................................................................................................................................................................................................................................................................................................
Candidate 0x48EAFC5649EE9492 for 90DA0D79 CF0EAEE6 423D36BE
found Seed = 0x1234578ACDEF11

real 0m0.074s
user 0m0.074s
sys 0m0.000s


And the PCG-XSH-RR-128/64 one:

time ./pcg-xsh-rr-128-64-breach $((0x1234578ACDEF11))
INV 0x98ABC8B0716EAC8D 0x1 gcd=0x1
Seed = 0x1234578ACDEF11
PCG output 0 = 0xE49F66304A892198
PCG output 1 = 0x58C80AA8C2F10A26
PCG output 2 = 0xD2B380859B808368
PCG output 3 = 0x60715FDB8EE68956
PCG output 4 = 0xB0D0AAF3073F2EDA
PCG output 5 = 0xC5876DE56C99EDB3
PCG output 6 = 0x68371CCDC061F3D
PCG output 7 = 0x7118EC9199028ED5
PCG output 8 = 0xAC0DAAF55D9A009C
PCG output 9 = 0x709AD75080EDE32E

Breaching spin:.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Candidate 0x5928FB686BFAEF01 for E49F66304A892198 58C80AA8C2F10A26 D2B380859B808368...............................................................................................................................................................................................................................................................................................................................................................................................................................................
Breaching spin:...................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
Candidate 0xF0BA68A1B616E6B1 for 58C80AA8C2F10A26 D2B380859B808368 60715FDB8EE68956
found Seed = 0x1234578ACDEF11

real 0m0.089s
user 0m0.069s
sys 0m0.020s

Maybe its a great PRNG, i dont know i did not look at that aspect, but its not secure if i can break it after a few hours work :)

Filed under: Cryptanalysis,Pseudo random number generators — Michael @ 21:58

2023-05-18

Hardware password managers, accelerometers and random data

I was looking for a way to securely manage my passwords, anything storing my passwords with a 3rd party or all accessible to my computer fail IMHO. Also it had to be practical, something limited to 5 passwords is not. Be convenient, something where one would have to search for a password in a list by hand fails that. And it needed to have a screen so one knew what one approved. A simple one button USB stick does not qualify either as one does ultimately not know what one actually approved. The only device i could find was the mooltipass. So i bought one. It is a cute little thing storing all passwords both securely and also conveniently. And being a nerdy geeky person, i of course had to play with and analyze it. The first thing that sprang out to me was its random number generator

Can we exploit it ?

So i looked at its random number generator until i found something. It uses the 2 LSB of each of the 3 axis of an LIS2HH12 accelerometer. This generator was tested with the DIEHARDER battery of tests by its creator before me. I was at first not able to run the DIEHARDER tests because they needed more random data than the device could generate quickly. I tried various tests, simple things like simply trying to compress the data showed no anomalies. The first anomaly i found, i believe was through looking at the frequency spectrum in audacity. To my eye it looked like there was a bias between high and low frequencies. The next step was checking the correlation of various bits. And indeed when one looked at 2 bits and the previous 2 bits from the same channel They where equal 3% more often than they should be. I guess i could have stoped here, but i didn’t :) So i looked at the distribution of matching bits, (3% of these shouldn’t be there and we dont know which of 100 are the 3 bad ones). These 3 could be randomly distributed of course. But by now i had enough data to run some of DIEHARDER and while most tests passed, some of its tests failed for me. I have to say though little things in the command line of DIEHARDER can lead to unexpected results.

The original findings on the 3% anomaly

I simply counted the number of times each 2bit matches the previous
2bits of the same channel and this occurs about 3% more often than it should.

With 1 mb of data:
Channel 0 [1.026829 : 1.980986 : 0.992185]
Channel 1 [1.031329 : 1.978283 : 0.990388]
Channel 2 [1.039171 : 1.974176 : 0.986653]
Average   [1.032443 : 1.977815 : 0.989742]
All 3     [0.968019 1.010415 1.052310 1.111489]

/dev/random
Channel 0 [1.000765 : 1.998530 : 1.000705]
Channel 1 [0.997429 : 2.001644 : 1.000927]
Channel 2 [0.997357 : 2.003375 : 0.999268]
Average   [0.998517 : 2.001183 : 1.000300]
All 3     [1.001617 0.999233 0.997958 0.995425

This allows relatively reliably distingishing the mooltipass random numbers from /dev/random with 10kb of data

Periodic anomalies

Looking again at the randomdata. The 3% extra repeats within a channel occur 32 samples apart (that is 24 bytes in the stream or 192 bits). These locations sometimes shift around but preferably occur at indexes 31,63 and 95 of the 96 sample block. When such a run of anomalies breaks from teh same index, the new index tends to be the next in the set {95,63,31}. With these patterns it is possible to reliably distinguish as little as 100-200 bytes from random data. That said, the randomness of this is still plenty for a password and the average human would be way worse generating random data. Care though may need to be applied if this is used for other purposes than passwords. For example DSA signatures are notorious for being sensitive to biases in the used random number generator. I reported the anomalies in the RNG in January of 2023. It was fixed on Apr 18th 2023 with 49359dfc52cdfe743000ac512092085328d5f15b. Software to detect the specific pattern reliably as well as 2 small test samples is availble at randomtests

The original findings on perodic anomalies

Heres how this compares to /dev/random
dd if=/dev/random of=/dev/stdout count=1 bs=1000 status=none | ./mooltitestwalker
mooltiness:     0.89 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=10000 status=none | ./mooltitestwalker
mooltiness:     0.25 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=100000 status=none | ./mooltitestwalker
mooltiness:     0.04 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=1000000 status=none | ./mooltitestwalker
mooltiness:     2.99 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=/dev/random of=/dev/stdout count=1 bs=10000000 status=none | ./mooltitestwalker
mooltiness:     0.08 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

and now the random data from mooltipass

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=1000 status=none | ./mooltitestwalker
mooltiness:     3.16 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=10000 status=none | ./mooltitestwalker
mooltiness:     4.21 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=100000 status=none | ./mooltitestwalker
mooltiness:    12.91 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

dd if=~/mooltirandom.bin-copy of=/dev/stdout count=1 bs=1000000 status=none | ./mooltitestwalker
mooltiness:    37.46 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7

dd if=~/mooltirandom.bin-copy5 of=/dev/stdout count=1 bs=10000000 status=none | ./mooltitestwalker
mooltiness:   115.95 expected: < 1 in 68% cases, < 2 in 95%, < 3 in 99.7%

As you can see with just 1000 bytes we are already more than
3 standard deviations away from random data. And after that it very quickly
becomes something that a random number generator would not generate in the lifetime of
the universe not even if you fill the whole universe with random number generators should
you see this sort of statistics once

Heres the test results for 100,200,300,400,500 bytes with mooltitestcycler

for i in `seq 100 100 500` ; do dd if=~/mooltirandom.bin-copy of=/dev/stdout bs=$i count=1 status=none | ./mooltitestcycler ; done
moolticycles:     3.00, this or larger is expected 0.27 % of the time in random data.
moolticycles:     4.58, this or larger is expected 0.00046 % of the time in random data.
moolticycles:     6.00, this or larger is expected 2E-07 % of the time in random data.
moolticycles:     6.93, this or larger is expected 4.3E-10 % of the time in random data.
moolticycles:     7.75, this or larger is expected 9.5E-13 % of the time in random data.

Same but with another testsample to make sure this is not a one off

for i in `seq 100 100 500` ; do dd if=~/mooltirandom2.bin of=/dev/stdout bs=$i count=1 status=none | ./mooltitestcycler ; done
moolticycles:     3.00, this or larger is expected 0.27 % of the time in random data.
moolticycles:     4.58, this or larger is expected 0.00046 % of the time in random data.
moolticycles:     6.00, this or larger is expected 2E-07 % of the time in random data.
moolticycles:     6.93, this or larger is expected 4.3E-10 % of the time in random data.
moolticycles:     7.75, this or larger is expected 9.5E-13 % of the time in random data.

In fact i notice now the results are exactly the same, interresting
cmp  ~/mooltirandom2.bin ~/mooltirandom.bin-copy
/home/michael/mooltirandom2.bin /home/michael/mooltirandom.bin-copy differ: byte 1, line 1


heres the control test with /dev/random
for i in `seq 100 100 500` ; do dd if=/dev/random  of=/dev/stdout bs=$i count=1 status=none | ./mooltitestcycler ; done
moolticycles:     0.00, this or larger is expected 1E+02 % of the time in random data.
moolticycles:     0.31, this or larger is expected 76 % of the time in random data.
moolticycles:     0.58, this or larger is expected 56 % of the time in random data.
moolticycles:     0.44, this or larger is expected 66 % of the time in random data.
moolticycles:     0.77, this or larger is expected 44 % of the time in random data.

and a bigger sample to show the runs:

dd if=~/mooltirandom.bin-copy5 of=/dev/stdout bs=2000000 count=1 status=none | ./mooltitestcycler
Run 1731 at phase 31
Run 7496 at phase 95
Run 3323 at phase 63
Run  196 at phase 31
Run  195 at phase 95
Run 3540 at phase 63
Run 2459 at phase 31
Run  777 at phase 95
Run  775 at phase 63
Run  195 at phase 31
Run  583 at phase 95
Run  778 at phase 63
Run 1585 at phase 31
Run 1126 at phase 95
Run  971 at phase 63
Run 5047 at phase 31
Run 7141 at phase 95
Run 1741 at phase 63
Run 4643 at phase 31
Run  197 at phase 95
Run  577 at phase 63
Run  197 at phase 31
Run 2907 at phase 95
Run 4040 at phase 63
Run 1572 at phase 31
Run  196 at phase 95
Run 2903 at phase 63
Run 1355 at phase 31
Run  195 at phase 95
Run  196 at phase 63
Run  584 at phase 31
Run  143 at phase 95
Run   51 at phase 63
Run  195 at phase 31
Run 3099 at phase 95
Run  196 at phase 63
Run 1164 at phase 31
Run 2445 at phase 95
Run 2589 at phase 63
Run 4553 at phase 31
Run 7145 at phase 95
moolticycles:   499.67, this or larger is expected 0 % of the time in random data.

Shaken not Stirred

A 2nd issue i found and reported on 7th feb 2023 is that when the device is violently shaken, the accelerometer saturates and clips at 2g, this makes the 2 LSB of the affected channel(s) be 00 or 11 more often than expected in a random stream of data. So please dont use the mooltipass while doing high G maneuvers in a fighter jet or any other activity that subjects it to high g forces. Also with some effort one can shake a 3 axis accelerometer so that all 3 axis clip at the same time. There also may be a delay between the generation and use of random data. Ideally when the full 16bit sample clips it should be discarded and not used. While discarding one kind of sample that was equally frequent, introduces a bias, the clipping cases are not equally frequent. They either do not occur at all in a still environment or occur disproportionally frequent.

Random now?

Except these, i found no further flaws in the random data. Personally i would recommend to use some sort of hash or other cryptography to mix up the accelerometer bits. Heating or cooling (in my freezer with long USB cable) the device did not introduce measurable bias and also feeding ~60 mega byte stream of data into the PRNG NIST-Statistical-Test-Suite did not show any anomalies after the fix. Of course one can only find flaws in such data streams, never proof the absence of flaws. Also it must be said that i could not find any reference by the manufacturer of the chip to randomness of the LSBs. So while AFAIK many devices use accelerometer data as source of random data, there seems no guarantee that a future revision of that chip doesn't produce less random bits. What i can say, is thus just about the device i looked at and for that, the random data is much better than what a person would generate by "randomly" pressing keys. People are very biased in what characters they use in passwords even when they try to be random. Also passwords are generally too short for this anomaly to allow distinguishing 1 password from another with a unbiased RNG. A password that one believed contained n*192 bits of entropy really only contained n*190 before the fix. All this assumes the device is not violently shaken while used.

Audacity spectra

Took me a bit to find and restore the original spectrum i saw. While doing that i also noticed that using signed 8bit results in a spectrum biased the other way. The original ones are unsigned 8bit.

mooltipass pre fix spectrum /dev/random control spectrum mooltipass post fix spectrum

Audacity spectra with more data and in signed 8bit ints

recreated mooltipass pre fix spectrum /dev/random spectrum mooltipass post fix spectrum

Dieharder testsuite

Ill write a separate article about dieharder later, the thing is finicky and this is already quite complex

Updated this article multiple times (last on 2023-05-24) to include more details, pictures and minor corrections
Full disclosure, i was offered a reward for finding and reporting the bug.

Filed under: Cryptanalysis,Hardware,Off Topic,Security — Michael @ 22:52

2023-02-15

EU Schuko plug

Today i learned a new feature of the EU Schuko plug.
I moved some stuff to a different apartment, plugged it in and slowly build the feeling something wasn’t quite right. Some electronics behaved a bit odd, somehow grounded things didn’t feel quite grounded.
So i connected a DMM between the grounds of 2 plugs

and

and between we have

ehm, 126 Volts
On closer inspection we can see the ground lead on the wall outlet has been painted over. Sure thats a rare exception, lets look at the other lead

sure its not the 2nd plug too that our green wire eventually was connected to

sure its not all plugs on that wall

whatever, cleaning that off and retesting


Problem solved, lets pretend we didnt notice this EU population control feature ;)
Having the ground leads exposed so that morons can paint over them and then anything pluged in simply has no ground connection. Bad design, simply bad design.

Filed under: Uncategorized — Michael @ 00:00
Next Page »

Powered by WordPress