# [Python-ideas] Should our default random number generator be secure?

Tim Peters tim.peters at gmail.com
Sat Sep 12 05:23:42 CEST 2015

```[Marc-Andre]
...
>> While implementing this, I found that there's a bit more trickery
>> involved due to the fact that the MT RNG in Python writes the
>> 624 words internal state in batches - once every 624 times
>> the .getrandbits() function is called.
>>
>> So you may need up to 624*2 - 1 output values to determine a
>> correct array of internal state values.

[Tim]
> Don't be too sure about that.  From an information-theoretic view,
> "it's obvious" that 624 32-bit outputs is enough - indeed, that's 31
> more bits than the internal state actually has. You don't need to
> reproduce Python's current internal MT state exactly, you only need to
> create _a_ MT state that will produce the same values forever after.
> Specifically, the index of the "current" vector element is an artifact
> of the implementation, and doesn't need to be reproduced.  You're free
> to set that index to anything you like in _your_ MT state - the real
> goal is to get the same results.

Concrete proof of concept.  First code to reconstruct state from 624
consecutive 32-bit outputs:

def invert(transform, output, n=100):
guess = output
for i in range(n):
newguess = transform(guess)
if newguess == output:
return guess
guess = newguess
raise ValueError("%r not invertible in %s tries" %
(output, n))

t1 = lambda y: y ^ (y >> 11)
t2 = lambda y: y ^ ((y << 7) & 0x9d2c5680)
t3 = lambda y: y ^ ((y << 15) & 0xefc60000)
t4 = lambda y: y ^ (y >> 18)

def invert_mt(y):
y = invert(t4, y)
y = invert(t3, y)
y = invert(t2, y)
y = invert(t1, y)
return y

def guess_state(vec):
assert len(vec) == 624
return (3,
tuple(map(invert_mt, vec)) + (624,),
None)

Now we can try it:

import random
for i in range(129):
random.random()

That loop was just to move MT into "the middle" of its internal
vector.  Now grab values:

vec = [random.getrandbits(32) for i in range(624)]

Note that the `guess_state()` function above _always_ sets the index
to 624.  When it becomes obvious _why_ it does so, all mysteries will
vanish ;-)

Now create a distinct generator and force its state to the deduced state:

newrand = random.Random()
newrand.setstate(guess_state(vec))

And some quick sanity checks:

for i in range(1000000):
assert random.random() == newrand.random()
for i in range(1000000):
assert random.getrandbits(32) == newrand.getrandbits(32)

The internal states are _not_ byte-for-byte identical.  But they don't
need to be.  The artificial `index` bookkeeping variable allows
hundreds of distinct spellings of _semantically_ identical states.
```