[Python-ideas] PEP 504: Using the system RNG by default

Nick Coghlan ncoghlan at gmail.com
Wed Sep 16 17:54:24 CEST 2015


On 17 September 2015 at 00:26, Guido van Rossum <guido at python.org> wrote:
> There's still way too much chatter, and a lot that seems just rhetoric. This
> is not the republican primaries.

There was still a fair bit of useful feedback in there, so I pushed a
new version of the PEP that addresses it:

* the submodule idea is gone
* the module level API still delegates to random._inst at call time
rather than import time
* random._inst is a SystemRandom() instance by default
* there's a new ensure_repeatable() API to switch it back to random.Random()
* seed(), getstate() and setstate() all implicitly call ensure_repeatable()
* those implicit calls issue a warning recommending calling
ensure_repeatable() explicitly
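
As a rough illustration of the points above (a sketch only, not the
PEP's actual implementation), the delegation could look something like
the following. The names _inst and ensure_repeatable() come from the
list above; the _warn_implicit() helper, the warning text and the
choice of randrange() as the example delegating function are
assumptions made up for this sketch.

```python
import random
import warnings

# Shared instance behind the module level API: system RNG by default.
_inst = random.SystemRandom()


def ensure_repeatable():
    """Switch the shared instance to the deterministic random.Random()."""
    global _inst
    # SystemRandom subclasses Random, so test for SystemRandom specifically.
    if isinstance(_inst, random.SystemRandom):
        _inst = random.Random()


def _warn_implicit(name):
    # Hypothetical helper: warn only when the switch happens implicitly.
    if isinstance(_inst, random.SystemRandom):
        warnings.warn(
            f"{name}() implicitly called ensure_repeatable(); "
            "call ensure_repeatable() explicitly instead",
            stacklevel=3,
        )
        ensure_repeatable()


def seed(a=None):
    _warn_implicit("seed")
    _inst.seed(a)


def getstate():
    _warn_implicit("getstate")
    return _inst.getstate()


def setstate(state):
    _warn_implicit("setstate")
    _inst.setstate(state)


def randrange(*args):
    # _inst is looked up at call time, not bound at import time, so this
    # follows any later call to ensure_repeatable().
    return _inst.randrange(*args)
```

With this shape, code that never touches seed(), getstate(),
setstate() or ensure_repeatable() keeps drawing from the system RNG,
while code that calls ensure_repeatable() first gets the seedable
generator without any warning.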

The key user experience difference from the status quo is that this
allows the "not suitable for security purposes" warning to be moved to
a section specifically covering ensure_repeatable(), seed(),
getstate() and setstate() rather than automatically applying to the
entire random module.

The reason it becomes reasonable to move the warning is that the
change narrows the failure mode from "any use of the module API for
security sensitive purposes is a problem" to "any use of the module
API for security sensitive purposes is a problem if the application
also calls random.ensure_repeatable()".

> Yes lots of companies got hacked. What's the evidence that a language's
> default RNG was involved? IIUC the best practice for password encryption (to
> make cracking using a large word list harder) is something called bcrypt;
> maybe next year something else will become popular, but the default RNG
> seems an unlikely candidate. I know that in the past the randomness of
> certain protocols was compromised because the seeding used a timestamp that
> an attacker could influence or guess. But random.py seeds MT from
> os.urandom(2500). So what's the class of vulnerabilities where the default
> RNG is implicated?

Reducing the search space for brute force attacks on things like:

* randomly generated default passwords
* password reset tokens
* session IDs

The PHP paper covered an attack on password reset tokens.
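
To make the concern concrete, here is a minimal sketch of why a
deterministic generator is a worry for tokens like these. It is not a
demonstrated attack on CPython: it only shows that anyone who knows
(or recovers) the generator's seed or state can regenerate the same
token, while SystemRandom has no such state to recover. The
make_token() helper, the alphabet and the token length are
hypothetical choices for this example.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits


def make_token(rng, length=32):
    """Build a token by drawing characters from the given generator."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))


# Deterministic generator: the same seed always yields the same token,
# so the token is only as secret as the generator's internal state.
victim_token = make_token(random.Random(42))
attacker_guess = make_token(random.Random(42))
assert victim_token == attacker_guess

# System RNG: backed by os.urandom(), with no seed to replay.
system_token = make_token(random.SystemRandom())
```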

Python's seeding is indeed much better, and Tim's mathematical skills
are infinitely better than mine so I'm never personally going to win a
war of equations with him. If you considered a conclusive proof of a
break specifically targeting *CPython's* PRNG essential before
considering changing the default behaviour (even given the almost
entirely backwards compatible approach I'm now proposing), I'd defer
the PEP with a note suggesting that practical attacks on security
tokens generated with CPython's PRNG may be a topic of potential
interest to the security community.

The PEP would then stay deferred until someone actually did the
research and demonstrated a practical attack.

> Tim's proposal is simple: create a new module, e.g. saferandom, with the
> same API as random (less seed/state). That's it. Then it's a simple import
> change away to do the right thing, and we have years to seed StackOverflow
> with better information before that code even hits the road. (But a backport
> to Python 2.7 could be on PyPI tomorrow!)

If folks are reaching for a third party library anyway, we'd be better
off pointing them at one of the higher level ones like passlib or
cryptography.

There's also the aspect that something I'd now like to achieve is to
eliminate the security warning that is one of the first things people
currently see when they open up the random module documentation:
https://docs.python.org/3/library/random.html

While I think that warning is valuable given the current default
behaviour, it's also inherently user hostile for beginners that
actually *do* read the docs, as it raises questions they don't know
how to answer: "The pseudo-random generators of this module should not
be used for security purposes. Use os.urandom() or SystemRandom if you
require a cryptographically secure pseudo-random number generator."

Switching the default means that the question to be asked is instead
"Do you need repeatability?", which is a *much* easier question, and we
only need to ask it in the documentation for ensure_repeatable() and
the related functions that call it implicitly.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

