[Python-ideas] Should our default random number generator be secure?

Tue Sep 15 20:05:44 CEST 2015

...

[Nathaniel Smith <njs at pobox.com>]
>>> Here you go:
>>>   https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf

[Tim,
 on appendix D]
>> ...
>> My question is, even if /dev/urandom is available, they're _not_
>> content to use that.alone.  They continue to mix it up with all other
>> kinds of silly stuff.  So why do they trust urandom less than
>> OpenSLL's gimmick?

[Nathaniel]
> Who knows why they wrote the code in that exact way. /dev/urandom is fine.

Presumably _they_ know, yes?  When they're presenting a "gold
standard" all-purpose token generator, it would be nice if they had
taken care to make every step clear to other crypto wonks.  Otherwise
the rest of us are left wondering if _anyone_ really knows what
they're talking about ;-)

[on the MT state solver]
>> ...
>> It remained unclear to me _exactly_ what "in the presence of non
>> consecutive outputs" is supposed to mean.  In the only examples, they
>> knew exactly how many times MT was called.  "Non consecutive" in all
>> those contexts appeared to mean "but we couldn't observe_any_ output
>> bits in some cases - the ones we could know something about were
>> sometimes non-consecutive".  So in the MT output sequence, they had no
>> knowledge of _some_ of the outputs, but they nevertheless knew exactly
>> _which_ of the outputs they were wholly ignorant about.
>>
>> That's no problem for the equation solver.  They just skip adding any
>> equations for the affected bits, keep collecting more outputs and
>> potentially "wrap around", probably leading to an overdetermined
>> system in the end.
>>
>> But Python doesn't work the way PHP does here.  As explained in
>> another message, in Python you can have _no idea_ how many MT outputs
>> are consumed by a single .choice() call.  In the PHP equivalent, you
>> always consume exactly one MT output.  PHP's method suffers
>> statistical bias, but under the covers Python uses an accept/reject
>> method to avoid that.  Any number of MT outputs may be (invisibly!)
>> consumed before "accept" is reached, although typically only one or
>> two.  You can deduce some of the leading MT output bits from the
>> .choice() result, but _only_ for the single MT output .choice()
>> reveals anything about.  About the other MT outputs it may consume,
>> you can't even know that some _were_ skipped over, let alone how many.

> This led me to look at the implementation of Python's choice(), and
> it's interesting; I hadn't realized that it was using such an
> inefficient method.

Speed is irrelevant here, but do note that .choice() isn't restricted
to picking from less than 2**32 possibilities.

>>> random.choice(range(2**62))
2693408174642551707

Special-casing is done at Python speed, and adding a Python-level
branch to ask "is it less than 2**32?" is typically more expensive
than calling MT again.

> (To make a random selection between, say, 36
> items, it rounds up to 64 = 2**6, draws a 32-bit sample from MT,
> discards 26 of the bits (!) to get a number between 0-63, and then
> repeats until this number happens to fall in the 0-35 range, so it
> rejects with probability ~0.45. A more efficient algorithm is the one
> that it uses if getrandbits is not available, where it uses all 32
> bits and only rejects with probability (2**32 % 36) / (2**32) =
> ~1e-9.) I guess this does add a bit of obfuscation.

And note that the branch you like better _also_ needs another
Python-speed test to raise an exception if the range is "too big".
The branch actually used doesn't need that.

> OTOH the amount of obfuscation is very sensitive to the size of the
> password alphabet. If I use uppercase + lowercase + digits, that gives
> me 62 options, so I only reject with probability 1/32, and I can
> expect that any given 40-character session key will contain zero skips
> with probability ~0.28, and that reveals 240 bits of seed.

To be quite clear, nothing can ever be learned about "the seed".  All
that can be observed is bits from the outputs, and the goal is to
deduce "the state".  There is an invertible permutation mapping a
state word to an output word, but from partial knowledge of N < 32
bits from a single output word you typically can't deduce N bits of
the state word from which the output was derived (for example, if you
only know the first bit of an output, you can't deduce from that alone
the value of any bit in the corresponding state word).  That's why
they need such a hairy framework to begin with (in the example, you
_can_ add equations tying the known value of the first output bit to
linear functions of all bits of the state related to the first output
bit).

About the "240 bits", they need about 80 times more than that to
deduce the state.  0.28 ** 80 is even less than a tenth ;-)

> I don't have time right now to go look up the MT equations to see how
> easy it is to make use of such partial information,

It's robust against only knowing a subset of an output's bits
(including none).  It's not robust against not knowing _which_ output
you're staring at.

They label the state bits with variables x_0 through x_19936, and
generate equations relating specific state bits to the output bits
they can deduce.  If they don't know how many times MT was invoked
between outputs they know were generated, they can't know _which_
state-bit variables to plug into their equations.  Is this output
related to (e.g.) at least x_0 through x_31, or is it x_32 through
x_63?  x_64 through x_95?

Take an absurd extreme to illustrate an obvious futility in general:
suppose .choice() remembered the size of the last range it was asked
to pick from.  Then if the next call to .choice() is for the
same-sized range, call MT 2**19937-2 times ignoring the outputs, and
call it once more.  It will get the same result then.  "The solver"
will deduce exactly the same output bits every time, and will never
learn more than that.  Eventually, if they're doing enough sanity
checking, the best their solver could do is notice that the derived
equations (regardless of how they construct them) are inconsistent.
The worst it could do is "deduce" a state that's pure fantasy.  They
can't know that they are in fact seeing an output derived from exactly
the same state every time.  Unless they read the source code for
.choice() and see it's playing this trick.  in that case, they would
never add to their collection of equations after the first output was
dealt with.  _Then_ the equations would faithfully reflect the truth:
that they learned a tiny bit at first, but never learn more than just
that.

>There but there certainly are lots of real-world weaponized exploits that begin with
> something like "first, gather 10**8 session keys...". I certainly
> wouldn't trust it.

And I'm not asking you to.  I wouldn't either.  I'm expressing
skepticism about that the solver in this paper is a slam-dunk proof
that all existing idiomatic Python password generators are about to
cause the world to end ;-)

> Also, if I use the base64 or hex alphabets, then the probability of
> rejection is 0, and I can deterministically read off bits from the
> underlying MT state.

I' readily agree that if .choice(x) is used whenever len(x) is a power
of two, then their solver applies directly to such cases.  It's in all
and only such cases they can know exactly how many MT outputs were
consumed, and so know also which specific state-bit variables to use.

> (Alternatively, if someone in the future makes
> the obvious optimization

As above, if what you like better were obviously faster in reality, it
would have been written that way already ;-)  Honest, this looks like
Raymond Hettinger's code, and he typically obsesses over speed.

> to choice(), then it will basically stop rejecting in practice, and again
> it becomes trivial to read off all the bits from the underlying MT state.)

Not trivial.  You still need this hairy solver framework, and best I
can tell its code wasn't made available.  Note that the URL given in
the paper gives a 404 error now.  It isn't trivial to code it either.

> The point of "secure by default" is that you don't have to spend all
> these paragraphs doing the math to try and guess whether some RNG
> usage might maybe be secure; it just is secure.

No argument there!  I'm just questioning how worried people "should
be" over what actual Python code actually does now.  I confess I
remain free of outright panic ;-)

>> Best I can tell, that makes a huge difference to whether their solver
>> is even applicable to cracking idiomatic "password generators" in
>> Python.  You can't know which variables correspond to the bits you can
>> deduce.  You could split the solver into multiple instances to cover
>> all feasible possibilities (for how many MT outputs may have been
>> invisibly consumed), but the number of solver instances needed then
>> grows exponentially with the number of outputs you do see something
>> about.  In the worst case (31 bits are truncated), they need over
>> 19000 outputs to deduce the state.  Even a wildly optimistic "well,
>> let's guess no more than 1 MT output was invisibly rejected each time"
>> leads to over 2**19000 solver clones then.

> Your "wildly optimistic" estimate is wildly conservative under
> realistic conditions.

Eh.

> How confident are you that the rest of your
> analysis is totally free of similar errors? Would you willing to bet,
> say, the public revelation of every website you've visited in the last
> 5 years on it?

I couldn't care less if that were revealed.  In fact, I'd enjoy the
trip down memory lane ;-)

> ...
> Grant money is a drop in the bucket of security research funding these
> days. Criminals and governments have very deep pockets, and it's well
> documented that there are quite a few people with PhDs who make their
> living by coming up with exploits and then auctioning them on the
> black market.

Excellent points!  Snideness doesn't always pay off for me ;-)

> BTW, it looks like that PHP paper was an undergraduate project. You
> don't need a PhD to solve linear equations :-).

So give 'em their doctorates!  I've seen doctoral theses a hundred
times less substantial ;-)

> ...
> No, SystemRandom.choice is certainly fine. But people clearly don't
> use it, so it's fine-ness doesn't matter that much in practice...

It's just waiting for a real exploit.  People writing security papers
love to "name & shame".  "Gothca!  Gotcha!"  Once people see there
_is_ "a real problem" (if there is), they'll scramble to avoid being
the target of the next name-&-shame campaign.  Before then, they're
much too busy trying to erase all traces of every website they've
visited in the last 5 years ;-)