[Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

Thu Jun 9 12:30:02 EDT 2016

On Thu, Jun 09, 2016 at 08:26:20AM -0400, Donald Stufft wrote:

> random.py
> ---------
> 
> In the abstract it doesn't hurt to seed MT with a CSPRNG, it just doesn't
> provide much (if any) benefit and in this case it is hurting us because of the
> cost on import (which will exist on other platforms as well no matter what we
> do here for Linux). There are a couple solutions to this problem:
> 
> * Use getrandom(GRND_NONBLOCK) for random.Random since it doesn't matter if we
>   get cryptographically secure random numbers or not.

+1 on this option (see below for rationale).

> * Switch it to use something other than a CSPRNG by default since it doesn't
>   need that.
[...]
> Between these options, I have a slight preference for switching it to use a non
> CSPRNG, but I really don't care that much which of these options we pick. Using
> random.Random is not secure and none of the above options meaningfully change
> the security posture of something that accidentally uses it.

I don't think that is quite right, although it will depend on your 
definition of "meaningful".

PEP 506 says:

    Demonstrated attacks against MT are typically against PHP 
    applications. It is believed that PHP's version of MT is a 
    significantly softer target than Python's version, due to
    a poor seeding technique [17] . 

https://www.python.org/dev/peps/pep-0506/#id17

Specifically, PHP seeds the MT with the time, while we seed it with the 
output of a CSPRNG. Now, we all agree that MT is completely the wrong 
thing to use for secrets, good seeding or not, but *bad* seeding could 
make it a PHP-level soft target.
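
To make the "soft target" concern concrete, here is a minimal sketch of why 
a time-based seed is weak. It assumes an attacker who has seen a few raw 
MT outputs and knows roughly when the process started. The helper names 
(outputs_for_seed, recover_time_seed) and the one-hour window are 
hypothetical and purely for illustration; this is not a description of any 
real attack tool.

```python
import random
import time


def outputs_for_seed(seed, count=4):
    """Return the first `count` 32-bit outputs of an MT seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


def recover_time_seed(observed, approx_time, window=3600):
    """Try every whole-second seed within +/- `window` seconds of approx_time."""
    for candidate in range(approx_time - window, approx_time + window + 1):
        if outputs_for_seed(candidate, len(observed)) == observed:
            return candidate
    return None


# Hypothetical victim: seeded from the wall clock about 20 minutes ago.
secret_seed = int(time.time()) - 1234
observed = outputs_for_seed(secret_seed)

# The attacker only needs to search a few thousand candidates.
recovered = recover_time_seed(observed, int(time.time()))
```

With a seed drawn from a CSPRNG the same search would have to cover the 
whole seed space, which is the difference the PEP is pointing at.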

The point of PEP 506 is to move people away from using random.Random for 
their secrets, but we should expect that whatever we do, there will be 
some late adopters who are slow to get the message and continue to use 
it. I would not like us to weaken the seeding technique to the point 
that those folks become an attractive target.

I think that using getrandom(GRND_NONBLOCK) will be okay, provided that 
when the entropy pool is not yet initialised and the call fails rather 
than blocks, whatever we fall back to is still better (hopefully 
significantly better) than seeding with the time.
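
As a rough sketch of what I have in mind, assuming a Python that exposes 
os.getrandom and os.GRND_NONBLOCK (Linux only) and a hypothetical helper 
called nonblocking_seed. The fallback mix below is only a placeholder for 
"something better than the bare time", not a proposal for what the 
interpreter should actually do.

```python
import hashlib
import os
import random
import time

SEED_BYTES = 32


def nonblocking_seed():
    """Return SEED_BYTES of seed material without ever blocking."""
    getrandom = getattr(os, "getrandom", None)  # missing on non-Linux platforms
    if getrandom is not None:
        try:
            data = getrandom(SEED_BYTES, os.GRND_NONBLOCK)
            if len(data) == SEED_BYTES:
                return data
        except OSError:
            # Includes BlockingIOError, raised when the kernel pool is
            # not yet initialised and GRND_NONBLOCK was requested.
            pass
    # Placeholder fallback (an assumption, not CPython's behaviour):
    # hash together several values that vary between processes.
    mix = "{}:{}:{}:{}".format(
        time.time(), time.perf_counter(), os.getpid(), id(object())
    )
    return hashlib.sha256(mix.encode("ascii")).digest()


# random.Random accepts bytes as a seed.
rng = random.Random(nonblocking_seed())
```

The point is only that the common case gets real kernel randomness and the 
rare early-boot case degrades instead of hanging.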

My reasoning is that the sort of applications that could be targets of 
attacks against MT are unlikely to be started up early in the boot 
process, so they're almost always going to get good crypto seeds. On the 
rare occasion that they don't, well, there's only so far that I'm 
prepared to stand up for developers' right to be ignorant of security 
concerns in 2016, and that's where I draw the line.

> SipHash and the Interpreter Startup
> -----------------------------------
[...]
> In the end, both of these choices make me happy and unhappy in different ways
> but I would lean towards adding a CLI flag for the special case and letting the
> systemd script that caused this problem invoke their Python with that flag. I
> think this because:
> 
> * It leaves the interpreter so that it is secure by default, but provides the
>   relevant knobs to turn off this default in cases where a user doesn't need
>   or want it.
> * It solves the problem in a cross platform way, that doesn't rely on the
>   nuances of the CSPRNG interface on one particular supported platform.

Makes sense to me.

+1

> os.urandom
> ----------
[...]
> With that in mind, I think that we should, to the best of our ability given the
> platform we're on, ensure that os.urandom does not return bytes that the OS
> does not think are cryptographically secure.

Just to be clear, you're talking about having it block rather than raise 
an exception, right?

If so, that makes sense to me. That's already the behaviour on all major 
platforms except Linux, so you're just bringing Linux into line with the 
others. Those who want the non-blocking behaviour on Linux can just read 
from /dev/urandom.
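
For anyone who does want that, a minimal sketch of reading the device 
directly. The function name read_dev_urandom is hypothetical, and it 
assumes a Unix-like system where /dev/urandom exists and, as on current 
Linux kernels, returns data without waiting for the pool to be initialised.

```python
import os


def read_dev_urandom(nbytes, path="/dev/urandom"):
    """Read exactly nbytes from the device, bypassing os.urandom."""
    chunks = []
    remaining = nbytes
    # buffering=0 avoids pulling more bytes from the kernel than requested.
    with open(path, "rb", buffering=0) as dev:
        while remaining > 0:
            chunk = dev.read(remaining)
            if not chunk:
                raise OSError("short read from " + path)
            chunks.append(chunk)
            remaining -= len(chunk)
    return b"".join(chunks)
```

Callers who take this route are explicitly accepting that the bytes may not 
be cryptographically strong early in boot.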

+1

-- 
Steve