On 16 June 2016 at 18:03, Nick Coghlan <ncoghlan@gmail.com> wrote:
On 16 June 2016 at 09:39, Paul Moore <p.f.moore@gmail.com> wrote:
I'm willing to accept the view of the security experts that there's a problem here. But without a clear explanation of the problem, how can a non-specialist like myself have an opinion? (And I hope the security POV isn't "you don't need an opinion, just do as we say").
If you're not writing Linux (and presumably *BSD) scripts and applications that run during system initialisation or on embedded ARM hardware with no good sources of randomness, then there's zero chance of any change made in relation to this affecting you (Windows and Mac OS X are completely immune, since they don't allow Python scripts to run early enough in the boot sequence for there to ever be a problem).
Understood. I could quite happily ignore this thread for all the impact it will have on me. However, I've seen enough of these debates (and witnessed the frustration of the security advocates) that I want to try to understand the issues better - as much as anything so that I don't end up adding uninformed opposition to these threads (in my day job, unfortunately, security is generally the excuse for all sorts of counter-productive rules, and never offers any practical benefits that I am aware of, so I'm predisposed to rejecting arguments based on security - that background isn't accurate in this environment and I'm actively trying to counter it).
The only question at hand is what CPython should do in the case where the operating system *does* let Python scripts run before the system random number generator is ready, and the application calls a security sensitive API that relies on that RNG:
- throw BlockingIOError (so the script developer knows they have a potential problem to fix)
- block (so the script developer has a system hang to debug)
- return low quality random data (so the script developer doesn't even know they have a potential problem)
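As an illustration only, the three candidate behaviours can be sketched with a toy model. Everything here is hypothetical: `FakeKernelRNG` and `read_random` are made-up names, not CPython internals, and the "weak" branch returns a fixed placeholder pattern purely to stand in for output from an unseeded generator.

```python
import os


class FakeKernelRNG:
    """Toy stand-in for a kernel RNG that may not be seeded yet (hypothetical)."""

    def __init__(self, ready_after_polls):
        # Number of polls that report "not ready" before the pool is usable.
        self.polls_left = ready_after_polls

    def is_ready(self):
        if self.polls_left > 0:
            self.polls_left -= 1
            return False
        return True


def read_random(rng, n, policy):
    """Return n bytes under one of the three policies listed above."""
    if policy == "raise":
        if not rng.is_ready():
            # Option 1: fail loudly so the caller knows there is a problem.
            raise BlockingIOError("system RNG not yet initialised")
    elif policy == "block":
        # Option 2: wait until the pool is ready (a real wait could be long).
        while not rng.is_ready():
            pass
    elif policy == "weak":
        if not rng.is_ready():
            # Option 3: silently hand back predictable data.
            return b"\x00" * n
    else:
        raise ValueError("unknown policy: %r" % (policy,))
    return os.urandom(n)
```

With the "weak" policy the caller receives bytes either way and has no signal that anything went wrong, which is the concern raised about the third option.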
The last option is the status quo, and has a remarkable number of vocal defenders.
Understood. It seems to me that there are two arguments here - backward compatibility (which is always a pressure, but sometimes applied too vigorously and not always consistently) and "we've always done it that way" (aka "people will have to consider what happens when they run under 3.4 anyway, so how will changing help?"). Judging backward compatibility is always a matter of trade-offs, hence my interest in the actual benefits.
The second option is what we changed the behaviour to in 3.5 as a side effect of switching to a syscall to save a file descriptor (and *also* inadvertently made a gating requirement for CPython starting at all, without which I'd be very surprised if anyone actually noticed the potentially blocking behaviour in os.urandom itself)
OK, so (given that the issue of CPython starting at all was an accidental, and now corrected, side effect) why is this so bad? Maybe not in a minor release, but at least for 3.6? How come this has caused such a fuss? I genuinely don't understand why people see blocking as such an issue (and as far as I can tell, Ted Tso seems to agree). The one case where this had an impact was a quickly fixed bug - so as far as I can tell, the risk of problems caused by blocking is purely hypothetical.
The first option is the one I'm currently writing a PEP for, since it makes the longstanding advice to use os.urandom() as the low level random data API for security sensitive operations unequivocally correct (as it will either do the right thing, or throw an exception which the developer can handle as appropriate for their particular application)
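A minimal sketch of what "handle as appropriate" might look like for a script that can afford to wait, assuming the proposed behaviour where the read raises BlockingIOError while the system RNG is uninitialised. The helper name `urandom_with_retry` and its `read`, `attempts` and `delay` parameters are hypothetical, chosen for illustration.

```python
import os
import time


def urandom_with_retry(n, read=os.urandom, attempts=5, delay=0.1):
    """Call read(n), retrying a few times if it raises BlockingIOError.

    Assumes the proposed behaviour: `read` raises BlockingIOError while
    the system RNG is not yet initialised. `read` is injectable so the
    retry logic can be exercised without a real early-boot environment.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return read(n)
        except BlockingIOError:
            if attempt == attempts - 1:
                # Out of retries: re-raise and let the caller decide.
                raise
            time.sleep(delay)
```

An application that must not wait could instead catch the exception once and exit with an explicit error message; either way the decision is made by the developer rather than hidden from them.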
In my code, I typically prefer Python to make detailed decisions for me (e.g. requests follows redirects by default, it doesn't expect me to do so manually). Now certainly this is a low-level interface so the rules are different, but I don't see why blocking by default isn't "unequivocally correct" in the same way that it is on other platforms, rather than raising an exception and requiring the developer to do the wait manually. (What else would they do - fall back to insecure data? I thought the point here was that that's the wrong thing to do?)
Having a blocking default with a non-blocking version seems just as arguable, and has the advantage that naive users (I don't even know if we're allowing for naive users here) won't get an unexpected exception and handle it badly because they don't know what to do (a sadly common practice in my experience).
OK. Guido has pronounced, you're writing a PEP. None of this debate is really constructive any more. But I still don't understand the trade-offs, which frustrates me. Surely security isn't so hard that it can't be explained in a way that an interested layman like myself can follow? :-(
Paul