On Wed, Sep 16, 2015 at 8:47 AM, Tim Peters
[Guido]
There's still way too much chatter, and a lot that seems just rhetoric. This is not the Republican primaries.
Which is a shame, since the chatter here is of much higher quality than in the actual primaries ;-)
Yes lots of companies got hacked. What's the evidence that a language's default RNG was involved?
Nobody cares whether there's evidence of actual harm. Just that there _might_ be, and even if none identifiable now, then maybe in the future.
There is evidence of actual harm from RNGs doing poor _seeding_ by default, but Python already fixed that (I know, you already know that ;-) ).
And this paper, from a few years ago, studying RNG vulnerabilities in PHP apps, is really good:
https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_...
An interesting thing is that several of the apps already had a history of trying to fix security-related holes related to RNG (largely due to PHP's poor default seeding), but remained easily cracked.
The primary recommendation there wasn't to make PHP's various PRNGs "crypto by magic", but for core PHP to supply "a standard" crypto RNG for people to use instead. As above, some of the app developers already knew darned well they had a history of RNG-related holes, but simply had no standard way to address it, and didn't have the _major_ expertise needed to roll their own.
IIUC the best practice for password encryption (to make cracking using a large word list harder) is something called bcrypt; maybe next year something else will become popular, but the default RNG seems an unlikely candidate. I know that in the past the randomness of certain protocols was compromised because the seeding used a timestamp that an attacker could influence or guess. But random.py seeds MT from os.urandom(2500). So what's the class of vulnerabilities where the default RNG is implicated?
1. Users doing their own poor seeding.
2. A hypothetical MT state-deducer (seemingly needing to be considerably more sophisticated than the already mondo sophisticated one in the paper above) to be of general use against Python.
3. "Prove there can't be any in the future. Ha! You can't." ;-)
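To make item 1 concrete, here is a minimal sketch of why user-supplied seeding is the weak spot. The `make_token` helper, the timestamp seed, and the one-hour search window are all made up for illustration; none of it comes from a real app or from the paper above.

```python
import random

def make_token(seed):
    # Hypothetical token generator: MT seeded from a caller-supplied value.
    rng = random.Random(seed)
    return "%032x" % rng.getrandbits(128)

# Assumption: the app seeded with a whole-second timestamp (arbitrary value).
secret_seed = 1442389620
token = make_token(secret_seed)

def recover_seed(observed, start, stop):
    # An attacker who knows the rough time window tries every second in it.
    for guess in range(start, stop):
        if make_token(guess) == observed:
            return guess
    return None

# Search one hour either side of the (assumed known) approximate time.
recovered = recover_seed(token, secret_seed - 3600, secret_seed + 3600)
print(recovered == secret_seed)
```

Nothing here attacks MT itself; the search space is small only because the seed was guessable.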
Tim's proposal is simple: create a new module, e.g. saferandom, with the same API as random (less seed/state). That's it. Then it's a simple import change away to do the right thing, and we have years to seed StackOverflow with better information before that code even hits the road. (But a backport to Python 2.7 could be on PyPI tomorrow!)
Which would obviously be fine by me: make the distinction obvious at import time, make "the safe way" dead easy and convenient to use, give it a new name engineered to nudge newbies away from the "unsafe" (by contrast) `random`, and a new name easily discoverable by web search.
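A rough sketch of what such a module could look like, assuming it does nothing more than re-export the methods of one `random.SystemRandom` instance. The module layout and the list of exported names are guesses for illustration, not an agreed design.

```python
# saferandom.py (hypothetical): same function names as `random`,
# minus seed(), getstate() and setstate().
import random as _random

# One shared instance backed by os.urandom(); assumed sufficient as-is.
_inst = _random.SystemRandom()

random = _inst.random
uniform = _inst.uniform
randrange = _inst.randrange
randint = _inst.randint
choice = _inst.choice
shuffle = _inst.shuffle
sample = _inst.sample
getrandbits = _inst.getrandbits
```

With a file like that on the path, switching existing code would be a one-line change, e.g. `import saferandom as random`, provided the code never calls seed() or the state functions.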
There's something else here: some of these messages gave pointers to web pages where "security wonks" conceded that specific uses of SystemRandom were fine, but they couldn't recommend it anyway because it's too hard to explain what is or isn't "safe". "Therefore" users should only use urandom() directly. Which is insane, if for no other reason than that users would then invent their own algorithms to convert urandom() results into floats and ints, etc. Then they'll screw up _that_ part.
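One example of how the home-grown conversion step goes wrong is the classic "byte modulo n" trick. The sketch below is illustrative only: it enumerates all 256 byte values rather than sampling, so the bias is exact, and it assumes a six-way choice as the use case.

```python
import random
from collections import Counter

# Hand-rolled approach: take one urandom() byte and reduce it mod 6.
# Counting over every possible byte value shows the skew directly.
counts = Counter(b % 6 for b in range(256))

# 256 is not a multiple of 6 (256 = 42*6 + 4), so outcomes 0..3 are each
# hit by 43 byte values while 4 and 5 are hit by only 42.
print(sorted(counts.items()))

# Same entropy source through the existing higher-level API instead.
sysrand = random.SystemRandom()
roll = sysrand.randrange(6)
```

In Python 3, `randrange` draws bits and rejects out-of-range values, so it avoids this particular bias without the caller having to think about it.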
But if "saferandom" were its own module, then over time it could implement its own "security wonk certified" higher level (than raw bytes) methods. I suspect it would never need to change anything from what the SystemRandom class does, but I'm not a security wonk, so I know nothing. Regardless, _whatever_ changes certified wonks deemed necessary in the future could be confined to the new module, where incompatibilities would only annoy apps using that module. Ditto whatever doc changes were needed. Also gone would be the inherent confusion from needing to draw distinctions between "safe" and "unsafe" in a single module's docs (which any by-magic scheme would only make worse).
However, supplying a powerful and dead-simple-to-use new module would indeed do nothing to help old code entirely by magic. That's a non-goal to me, but appears to be the _only_ deal-breaker goal for the advocates.
Which is why none of us is the BDFL ;-)
So if you or someone else (Chris?) wrote that up in PEP form I'd accept it. I'd even accept adding a warning on calling seed() (but not setstate()).
--
--Guido van Rossum (python.org/~guido)