[Python-ideas] PEP 504: Using the system RNG by default

Fri Sep 25 06:02:30 CEST 2015

[Tim]
> ...
> "Password generators" should be the least of our worries.  Best I can
> tell, the PHP paper's highly technical MT attack against those has
> scant chance of working in Python except when random.choice(x) is
> known to have len(x) a power of 2.  Then it's a very powerful attack.

Ha!  That's actually its worse case, although everyone missed that.

I wrote a solver, and bumped into this while testing it.  The rub is
this line in _randbelow():

            k = n.bit_length()  # don't use (n-1) here because n can be 1

If n == 2**i, k is i+1 then, and ._randbelow() goes on to throw away
half of all 32-bit MT outputs.  Everyone before assumed it wouldn't
throw any away.

The best case for this kind of solver is when .choice(x) has len(x)
one less than a power of 2, say 2**i - 1.  Then k = i, and
._randbelow() throws away 1 of each of 2**i MT outputs (on average).

For small i (say, len(x) == 63), every time I tried then the solver
(which can only record bits from MT outputs it _knows_ were produced)
found itself stuck with inconsistent equations.

If len(x) = 2**20 - 1, _then_ it has a great chance of succeeding.
There's about a chance in a million then that a single .choice() call
will consume 2 32-bit MT outputs,

It takes around 1,250 consecutive observations (of .choice() results)
to deduce the starting state then, assuming .choice() never skips an
MT output.  The chance that no output was in fact skipped is about:

>>> (1 - 1./2**20) ** 1250
0.9988086167972104

So that attack is very likely to succeed.

So, until the "secrets" module is released, and you're too dense to
use os.urandom(), don't pick passwords from a million-character
alphabet ;-)