[Python-Dev] BDFL ruling request: should we block forever waiting for high-quality random bits?

Nick Coghlan ncoghlan at gmail.com
Thu Jun 9 13:53:59 EDT 2016


On 9 June 2016 at 04:25, Larry Hastings <larry at hastings.org> wrote:
> A user reports that when starting CPython soon after startup on a fresh
> virtual machine, the process would hang for a long time.  Someone on the
> issue reported observed delays of over 90 seconds.  Later we found out: it
> wasn't 90 seconds before CPython became usable; these 90-second delays were
> before systemd timed out and simply killed the process.  It's not clear what
> the upper bound on the delay might be.
>
> The issue author had already identified the cause: CPython was blocking on
> getrandom() in order to initialize hash randomization.  On this fresh
> virtual machine the entropy pool started out uninitialized.  And since the
> only thing running on the machine was CPython, and since CPython was blocked
> on initialization, the entropy pool was initializing very, very slowly.

Further analysis (mentioned later in the original Python-3.5-on-Linux
bug report) suggested that this wasn't actually a generic "waiting for
the entropy pool to initialise" problem. Instead, the problem appeared
to be specifically that the Python script was being invoked *before
the Linux kernel had initialised the entropy pool* and the boot
process was waiting for that script to run before continuing on with
other tasks (like initialising the entropy pool). That meant
os.urandom() had nothing to do with it (since the affected script
wasn't generating random numbers), and the entire problem was that we
were blocking trying to initialise CPython's internal hashing.

Born from Victor's proposal to add a "wait for entropy?" flag to
os.urandom [1], the simplest proposal for a long term fix [2] posted
so far has been to:

1. make os.urandom raise BlockingIOError if kernel entropy is not available
2. don't rely on os.urandom for internal hash initialisation
3. don't rely on os.urandom for MT seeding in the random module
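
As a rough illustration of point 3, here is a minimal sketch of seeding a
Mersenne Twister instance without depending on the kernel entropy pool
being ready. It assumes os.urandom() raises BlockingIOError as proposed in
point 1; make_seeded_rng and its entropy_source parameter are hypothetical
names used only for this example, and the time/PID fallback is an
assumption about what a weaker seed could look like, not a description of
what CPython does.

```python
import os
import random
import time


def make_seeded_rng(entropy_source=os.urandom):
    """Return a random.Random seeded from the kernel if possible.

    Hypothetical helper: assumes entropy_source raises BlockingIOError
    when the kernel entropy pool is not yet initialised.
    """
    try:
        seed = int.from_bytes(entropy_source(32), "big")
    except BlockingIOError:
        # Weak, non-cryptographic fallback: fine for simulations,
        # never for security-sensitive use.
        seed = time.time_ns() ^ (os.getpid() << 32)
    return random.Random(seed)
```

The point of the design is that the random module's default generator is
not meant for security purposes anyway, so a weak seed at early boot is
preferable to blocking.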

Linux is currently the only OS we know of where the BlockingIOError
would be a possible result, and the only known scenarios where it
could be raised are Linux init system scripts and some embedded
systems where the kernel doesn't have any good sources of entropy. In
both those cases, the lack of entropy is potentially a real problem,
and an exception lets the software author make an informed decision to
either wait for entropy (e.g. by polling os.urandom() until it
succeeds, or selecting on /dev/random) or else read directly from
/dev/urandom (potentially getting bits that are not cryptographically
secure).
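
To make those two choices concrete, here is a minimal sketch of what
calling code might look like, assuming os.urandom() raises BlockingIOError
as proposed above. The function names and the read, poll_interval and
timeout parameters are hypothetical and exist only for this illustration.

```python
import os
import time


def urandom_when_ready(size, read=os.urandom, poll_interval=0.1, timeout=None):
    """Option 1: wait for entropy by polling until the read succeeds."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while True:
        try:
            return read(size)
        except BlockingIOError:
            # Give up and re-raise once the optional deadline has passed.
            if deadline is not None and time.monotonic() >= deadline:
                raise
            time.sleep(poll_interval)


def urandom_or_insecure(size, read=os.urandom):
    """Option 2: fall back to reading /dev/urandom directly (Linux).

    The fallback bytes may not be cryptographically secure if the
    kernel entropy pool has not been initialised yet.
    """
    try:
        return read(size)
    except BlockingIOError:
        with open("/dev/urandom", "rb") as f:
            return f.read(size)
```

Either way, the decision to wait or to accept weaker bits is made
explicitly by the script author rather than implicitly by the interpreter.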

The virtue of this approach is that it's entirely invisible for almost
all users, and the users that it does affect will start getting an
exception in Python 3.6+ rather than silently being handed
cryptographically non-secure random data.

Cheers,
Nick.

[1] http://bugs.python.org/issue27266
[2] http://bugs.python.org/issue27282


-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

