Re: [Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

11 Jun 2016

...
On Jun 11, 2016, at 3:40 PM, Guido van Rossum <guido@python.org> wrote:
On Sat, Jun 11, 2016 at 11:30 AM, Donald Stufft <donald@stufft.io> wrote:
...
On Jun 11, 2016, at 1:39 PM, Guido van Rossum <guido@python.org> wrote:
Is the feature detection desire about being able to write code that runs on older Python versions or for platforms that just don't have getrandom()?
My assumption was that nobody would actually use these flags except the secrets module and people writing code that generates long-lived secrets -- and the latter category should be checking platform and versions anyway since they need the whole stack to be secure (if I understand Ted Ts'o's email right).
My assumption is also that the flags should be hints (perhaps only relevant on Linux) -- platforms that can't perform the action desired (because their system's API doesn't support it) would just do their default action, assuming the system API does the best it can.
The problem is that someone writing software that calls os.urandom(block=True) or os.urandom(exception=True) and gets some bytes back doesn’t know which of these happened: it got cryptographically secure random because Python called getrandom(); it got cryptographically secure random because Python read /dev/urandom, either on a platform that defines that as always returning secure random or on Linux with the urandom pool already initialized; or it got random bytes that are not cryptographically secure because Python fell back to reading /dev/urandom on Linux before the pool was initialized.
The “silently does the wrong thing, even though I explicitly asked for it to do something different” behavior is something that I would consider to be a footgun, and footguns in security-sensitive code make me really worried.
Yeah, but we've already established that there's a lot more upset, rhetoric and worry than warranted by the situation.
Have we? There are real, documented security failures in the wild because of /dev/urandom’s behavior. This isn’t just a theoretical problem; it has actually had consequences in real life, and those same consequences could just as easily have happened to Python. (In one of the cases that most recently comes to mind it was a C program, but that’s not really relevant, because the same problem would have happened if they had written it in Python using os.urandom in 3.4, but not in 3.5.0 or 3.5.1.)
...
Outside of the security side of things, if someone goes “OK, I need some random bytes and I need to make sure it doesn’t block”, then doing ``os.urandom(block=False, exception=False)`` isn’t going to make sure that it doesn’t block except on Linux.
To people who "just want some random bytes" we should recommend the random module.
In other words, it’s basically impossible to ensure you get the behavior you want with these flags, which I feel will make everyone unhappy (both the people who want to ensure non-blocking and the people who want to ensure cryptographically secure bytes). These flags are an attractive nuisance: they look like they do the right thing, but silently don’t.
OK, it looks like the flags just won't make you happy, and I'm happy to give up on them. By default the status quo will win, and that means neither these flags nor os.getrandom(). (But of course you can roll your own using ctypes. :-)
Meanwhile, if we have os.urandom, which reads from /dev/urandom, and os.getrandom(), which reads from blocking random, then we make it easier to ensure you get the behavior you want by using the function that best suits your needs:
* If you just want the best the OS has to offer, os.getrandom falling back to os.urandom.
Actually the proposal for that was the secrets module. And the secrets module would be the only user of os.urandom(blocking=True).
I’m fine if this lives in the secrets module; Steven asked for it to be an os function so that secrets.py could continue to be pure Python.
...
* If you want to ensure you get cryptographically secure bytes, os.getrandom, falling back to os.urandom on non Linux platforms and erroring on Linux.
"Erroring" doesn't sound like it satisfies the "ensure" part of the requirement. And I don't see the advantage of os.getrandom() over the secrets module. (Either way you have to fall back on os.urandom() to suppport Python 3.5 and before.)
Erroring does satisfy the ensure part, because if it’s not possible to get cryptographically secure bytes then the only option is to error if you want to be ensured of cryptographically secure bytes.

It’s a bit like if you did open("somefile.txt"): it’s reasonable to say that we should ensure that open("somefile.txt") actually opens ./somefile.txt and doesn’t randomly open a different file if ./somefile.txt doesn’t exist. If it can’t open ./somefile.txt, it should error. If I *need* cryptographically secure random bytes, and I’m on a platform that doesn’t provide those, then erroring is oftentimes the correct behavior. This is such an important thing that OS X will flat out kernel panic and refuse to boot if it can’t ensure that it can give people cryptographically secure random bytes.

It’s a fairly simple decision tree. I go “hey, give me cryptographically secure random bytes, and only cryptographically secure random bytes”. If it cannot give them to me because the APIs of the system cannot guarantee they are cryptographically secure, then there are only two options: either A) it is explicit about its inability to do this and raises an error, or B) it does something completely different from what I asked it to do and pretends that it’s what I wanted.
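Option A above can be sketched roughly as follows. This is only an illustration, assuming os.getrandom() in the form it later took in Python 3.6 (os.getrandom(size, flags) together with os.GRND_NONBLOCK). The names InsecureRandomError and strict_random_bytes are hypothetical, made up for this sketch, and are not part of any proposal in this thread.

```python
import os


class InsecureRandomError(RuntimeError):
    """Hypothetical error: secure random bytes cannot be guaranteed right now."""


def strict_random_bytes(n):
    """Return n cryptographically secure bytes, or raise. Never degrade silently."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Assumption: platforms without getrandom() define their urandom
        # source as always secure, so falling back to it is acceptable here.
        return os.urandom(n)
    buf = b""
    try:
        # getrandom() may return fewer bytes than requested, so loop.
        while len(buf) < n:
            # GRND_NONBLOCK makes the call fail with BlockingIOError instead
            # of blocking while the kernel pool is still uninitialized.
            buf += getrandom(n - len(buf), os.GRND_NONBLOCK)
    except OSError as exc:
        # Covers BlockingIOError (pool not ready) and a kernel that lacks
        # the syscall. Error out rather than read /dev/urandom.
        raise InsecureRandomError("cannot guarantee secure random bytes") from exc
    return buf
```

A caller that needs a long-lived secret would call strict_random_bytes(32) and treat InsecureRandomError as fatal, which is the "explicit about its inability" branch rather than the "pretends" branch.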
...
* If you want to *ensure* that there’s no blocking, then os.urandom on Linux (or os.urandom wrapped with timeout code anywhere else, as that’s the only way to ensure not blocking cross platform).
That's fine with me.
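One way "os.urandom wrapped with timeout code" might look is sketched below. The helper name urandom_with_timeout is hypothetical, and the approach (a daemon thread plus a queue) is just one assumption about how such a wrapper could be written; it bounds how long the caller waits but cannot cancel a read that is already blocked inside the OS.

```python
import os
import queue
import threading


def urandom_with_timeout(n, timeout):
    """Return os.urandom(n), or raise TimeoutError after `timeout` seconds."""
    result = queue.Queue(maxsize=1)

    def worker():
        # Hand back either the bytes or the exception raised by os.urandom.
        try:
            result.put((True, os.urandom(n)))
        except Exception as exc:
            result.put((False, exc))

    # A daemon thread will not keep the interpreter alive if the read
    # stays blocked after we give up waiting on it.
    threading.Thread(target=worker, daemon=True).start()
    try:
        ok, value = result.get(timeout=timeout)
    except queue.Empty:
        raise TimeoutError("os.urandom did not return in time") from None
    if not ok:
        raise value
    return value
```

For example, urandom_with_timeout(16, 0.5) either returns 16 bytes or raises TimeoutError after about half a second, regardless of platform.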
* If you just don’t care, YOLO it up with either os.urandom or os.getrandom or random.random.
Now you're just taking the mickey.
No, I’m not. random.Random is such a use case: it wants to seed with bytes as secure as it can get its hands on, but it doesn’t care if it falls back to insecure bytes when it’s not possible to get secure bytes. This code even falls back to using time as a seed if all else fails.
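The shape of that "don't care" fallback can be sketched like this. It is an illustration of the pattern described above, not CPython's actual seeding code; the helper name best_effort_seed is hypothetical.

```python
import os
import random
import time


def best_effort_seed(nbytes=32):
    """Return an integer seed from the OS if possible, else from the clock."""
    try:
        # Preferred source: whatever the OS offers through os.urandom.
        return int.from_bytes(os.urandom(nbytes), "big")
    except NotImplementedError:
        # os.urandom raises NotImplementedError when no randomness source
        # is found. Fall back to the time, which is NOT secure.
        return int(time.time() * 1_000_000)


# A generator for simulations and similar non-security uses.
rng = random.Random(best_effort_seed())
```

This is acceptable only because nothing produced by rng is meant to be a secret; the same fallback would be the wrong choice for key generation.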
...
...
I think the problem with making os.urandom() go back to always reading /dev/urandom is that we've come to rely on it on all platforms, so we've passed that station.
Sorry, to be more specific I meant the 3.4 behavior, which was open("/dev/urandom").read() on *nix and CryptGenRandom on Windows.
I am all for keeping it that way. The secrets module doesn't have to use any of these, it can use an undocumented extension module for all I care. Or it can use os.urandom() and trust Ted Ts'o.
-- 
--Guido van Rossum (python.org/~guido)
—
Donald Stufft