Re: [Python-ideas] PEP 504: Using the system RNG by default

16 Sep 2015

On Wed, Sep 16, 2015 at 8:47 AM, Tim Peters wrote:
...
[Guido]
...
There's still way too much chatter, and a lot that seems just rhetoric.
This is not the Republican primaries.
Which is a shame, since the chatter here is of much higher quality
than in the actual primaries ;-)
...
Yes lots of companies got hacked. What's the evidence that a language's
default RNG was involved?
Nobody cares whether there's evidence of actual harm.  Just that there
_might_ be, and even if none identifiable now, then maybe in the
future.
There is evidence of actual harm from RNGs doing poor _seeding_ by
default, but Python already fixed that (I know, you already know that
;-) ).
And this paper, from a few years ago, studying RNG vulnerabilities in
PHP apps, is really good:
https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_...
An interesting thing is that several of the apps already had a history
of trying to fix RNG-related security holes (largely due to PHP's poor
default seeding), but remained easily cracked.
The primary recommendation there wasn't to make PHP's various PRNGs
"crypto by magic", but for core PHP to supply "a standard" crypto RNG
for people to use instead.  As above, some of the app developers
already knew darned well they had a history of RNG-related holes, but
simply had no standard way to address it, and didn't have the _major_
expertise needed to roll their own.
...
IIUC the best practice for password encryption (to make cracking using a
large word list harder) is something called bcrypt; maybe next year
something else will become popular, but the default RNG seems an unlikely
candidate. I know that in the past the randomness of certain protocols was
compromised because the seeding used a timestamp that an attacker could
influence or guess. But random.py seeds MT from os.urandom(2500). So what's
the class of vulnerabilities where the default RNG is implicated?
1. Users doing their own poor seeding.
2. A hypothetical MT state-deducer (seemingly needing to be
   considerably more sophisticated than the already mondo
   sophisticated one in the paper above) to be of general use
   against Python.
3. "Prove there can't be any in the future.  Ha!  You can't." ;-)
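To make item 1 concrete, here is a minimal sketch, assuming a hypothetical
application that seeds its generator from the clock. The names victim, token
and recovered are illustrative only and come from no real app:

```python
import random
import time

# Poor seeding (hypothetical app): the seed is just the current second.
now = int(time.time())
victim = random.Random(now)
token = victim.getrandbits(64)   # e.g. a "secret" reset token

# An attacker who knows roughly when the token was made tries nearby seeds.
recovered = None
for guess in range(now - 5, now + 6):
    if random.Random(guess).getrandbits(64) == token:
        recovered = guess
        break

# By contrast, random.Random() with no argument seeds itself from OS
# entropy when available, so there is no small seed space to search.
default_rng = random.Random()
```

Eleven guesses are enough here because the whole seed space is a few seconds
of wall-clock time.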
...
Tim's proposal is simple: create a new module, e.g. saferandom, with the
same API as random (less seed/state). That's it. Then it's a simple import
change away to do the right thing, and we have years to seed StackOverflow
with better information before that code even hits the road. (But a
backport to Python 2.7 could be on PyPI tomorrow!)
Which would obviously be fine by me:  make the distinction obvious at
import time, make "the safe way" dead easy and convenient to use, give
it a new name engineered to nudge newbies away from the "unsafe" (by
contrast) `random`, and a new name easily discoverable by web search.
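A rough sketch of what such a module could look like, assuming it simply
re-exports bound methods of one random.SystemRandom instance. The file name
saferandom.py and the exact set of exported names are assumptions, not
anything decided in this thread:

```python
# Contents of a hypothetical saferandom.py
import random as _random

# One shared instance backed by os.urandom().
_inst = _random.SystemRandom()

# Same call signatures as the module-level functions in `random`.
random = _inst.random
uniform = _inst.uniform
randint = _inst.randint
randrange = _inst.randrange
choice = _inst.choice
shuffle = _inst.shuffle
sample = _inst.sample
getrandbits = _inst.getrandbits

# Deliberately no seed(), getstate() or setstate(): on SystemRandom,
# seed() does nothing and the state methods raise NotImplementedError.
```

With that in place, the "simple import change" would be replacing
"import random" with "import saferandom as random" in code that never
seeds or saves state.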
There's something else here:  some of these messages gave pointers to
web pages where "security wonks" conceded that specific uses of
SystemRandom were fine, but they couldn't recommend it anyway because
it's too hard to explain what is or isn't "safe".  "Therefore" users
should only use urandom() directly.  Which is insane, if for no other
reason than that users would then invent their own algorithms to
convert urandom() results into floats and ints, etc.  Then they'll
screw up _that_ part.
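One way that part commonly goes wrong is modulo bias. A minimal sketch,
assuming a user who wants a fair six-sided roll and reduces a single
urandom() byte modulo 6 (this scenario is illustrative, not taken from the
pages mentioned above):

```python
import random
from collections import Counter

# Enumerate every possible byte value instead of sampling, so the skew
# of "byte % 6" is exact rather than statistical.
counts = Counter(byte % 6 for byte in range(256))

# 256 = 6 * 42 + 4, so results 0-3 each occur 43 times, but 4 and 5
# only 42 times: the naive conversion is slightly unfair.

# SystemRandom draws from the same OS source and does the conversion
# to a bounded integer for you.
roll = random.SystemRandom().randrange(6)
```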
But if "saferandom" were its own module, then over time it could
implement its own "security wonk certified" higher level (than raw
bytes) methods.  I suspect it would never need to change anything from
what the SystemRandom class does, but I'm not a security wonk, so I
know nothing.  Regardless, _whatever_ changes certified wonks deemed
necessary in the future could be confined to the new module, where
incompatibilities would only annoy apps using that module.  Ditto
whatever doc changes were needed.  Also gone would be the inherent
confusion from needing to draw distinctions between "safe" and
"unsafe" in a single module's docs (which any by-magic scheme would
only make worse).
However, supplying a powerful and dead-simple-to-use new module would
indeed do nothing to help old code entirely by magic.  That's a
non-goal to me, but appears to be the _only_ deal-breaker goal for the
advocates.
Which is why none of us is the BDFL ;-)
So if you or someone else (Chris?) wrote that up in PEP form I'd accept it.

I'd even accept adding a warning on calling seed() (but not setstate()).
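Purely as an illustration of that last idea, here is a sketch assuming the
warning would fire only for an explicit seed value. The class name
WarnOnSeedRandom, the message text and the choice of RuntimeWarning are all
hypothetical:

```python
import random
import warnings


class WarnOnSeedRandom(random.Random):
    """Hypothetical Random subclass that warns on explicit seeding."""

    def seed(self, a=None, version=2):
        # seed(None) means "seed from OS entropy", so stay quiet then.
        if a is not None:
            warnings.warn(
                "explicitly seeded generators are predictable; "
                "do not use them for security purposes",
                RuntimeWarning,
                stacklevel=2,
            )
        super().seed(a, version)


rng = WarnOnSeedRandom()   # default seeding: no warning
saved = rng.getstate()
rng.setstate(saved)        # restoring state: no warning either
```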

-- 
--Guido van Rossum (python.org/~guido)