Security-SIG

June 25, 2016
Hi folks,
Working on an update to PEP 522, I realised while poking around in
sysconfig for the HAVE_GETRANDOM_SYSCALL flag that only checking for
whether or not the syscall had been available at buildtime would be
potentially problematic - it means that a Python built against a newer
Linux kernel (e.g. Ubuntu 16.04, Fedora 24) may do the wrong thing
when run on an older kernel that hasn't had the new syscall backported
(e.g. Ubuntu 14.04, RHEL 7.2, CentOS 7.1511). That's something that
can easily happen with containers, or any other case of bundling the
language runtime with the application executable.
The actual code behind os.urandom already deals with this case
correctly (see the ENOSYS reference in py_getrandom at [1]), but it
means there really is no way for pure Python code running against an
older kernel to tell whether a successful os.urandom() call was
because the system RNG was ready or because the kernel is old.
So regardless of whether we go with the blocking-by-default or
raise-BlockingIOError strategy, we should also define what we want the
interpreter to do in the ENOSYS case (for PEP 522, I wanted to warn
about it in the new secrets.wait_for_system_rng() function, but at
least for now I'm going to settle for letting the SipHash
initialisation warn about it, the same way it would for a lack of
entropy)
This is actually the best argument I've seen so far for exposing
os.getrandom() directly: unlike os.urandom(), we could allow a new
os.getrandom() API to raise NotImplementedError if the running kernel
doesn't provide the getrandom() syscall.
Having such an API available would then let
secrets.wait_for_system_rng() more reliably check whether or not the
system RNG was ready before falling back on a potentially blocking
read of /dev/random.
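A rough sketch of the run-time check discussed above, assuming the os.getrandom() function and os.GRND_NONBLOCK flag that later shipped in Python 3.6 on Linux. The helper name system_rng_status and its three return strings are invented for illustration only. The sketch treats an OSError carrying ENOSYS as "the running kernel lacks the syscall", which is the case a build-time flag alone cannot detect.

```python
import errno
import os


def system_rng_status():
    """Return 'ready', 'not-ready' or 'unknown' (hypothetical helper)."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        # Interpreter was built without getrandom() support.
        return "unknown"
    try:
        getrandom(1, os.GRND_NONBLOCK)
    except BlockingIOError:
        # EAGAIN: the syscall exists, but the kernel RNG is not initialised.
        return "not-ready"
    except OSError as exc:
        if exc.errno == errno.ENOSYS:
            # Built against a newer kernel, running on an older one.
            return "unknown"
        raise
    return "ready"


print(system_rng_status())
```

Only the "ready" and "not-ready" answers say anything about the RNG itself; "unknown" is exactly the situation where a successful os.urandom() call carries no information.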
Cheers,
Nick.
[1] https://hg.python.org/cpython/file/default/Python/random.c#l119
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
I'm not sure that anyone got my email, since I sent it about 1 hour
after Ethan announced the creation of the list, so I'm reposting it
;-)
--
Hi,
Warning: believe it or not, I only read the first ~50 messages of the
recent discussion about random on the Python bug tracker and then the
python-dev mailing list.
Warning 2: If this email thread gets 100 emails per day, as was the
case on the bug tracker and python-dev, I will have to ignore it
again. Sorry, but I don't have the bandwidth to read so many messages
:-(
Here is a concrete proposal trying to make Python 3.6 more secure on
Linux, without blocking Python at startup.
I suggest sticking to Linux first. Sorry, but I don't have the skills
to propose a concrete change for other platforms since I don't know
their exact behaviour well, and I'm not sure that they give access to
blocking *and* non-blocking urandom.
Victor
HTML version:
https://haypo-notes.readthedocs.io/pep_random.html
+++++++++++++++++++++++++++++++++++
Make os.urandom() blocking on Linux
+++++++++++++++++++++++++++++++++++
Headers::

    PEP: xxx
    Title: Make os.urandom() blocking on Linux
    Version: $Revision$
    Last-Modified: $Date$
    Author: Victor Stinner <victor.stinner(a)gmail.com>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 20-June-2016
    Python-Version: 3.6
Abstract
========
Modify ``os.urandom()`` to block on Linux 3.17 and newer until the OS
urandom is initialized.
Rationale
=========
Linux 3.17 adds a new ``getrandom()`` syscall which makes it possible to
block until the kernel has collected enough entropy. This avoids
generating weak cryptographic keys.
Python's ``os.urandom()`` uses ``getrandom()``, but falls back on reading
the non-blocking ``/dev/urandom`` if ``getrandom(GRND_NONBLOCK)`` fails
with ``EAGAIN``.
Security experts promote ``os.urandom()`` to generate cryptographic
keys, even instead of ``ssl.RAND_bytes()``.
Python 3.5.0 blocked at startup on virtual machines, waiting for the OS
urandom initialization, which was seen as a regression compared to
Python 3.4 by users.
This PEP proposes to modify ``os.urandom()`` to be more secure, while
also ensuring that Python will not block at startup.
Changes
=======
* Initialize hash secret from non-blocking OS urandom
* Initialize random._inst, a Random instance, with non-blocking OS
  urandom
* Modify os.urandom() to block until urandom is initialized on Linux
A new private function, ``_PyOS_URandom_Nonblocking()``, will be added. It
reads OS urandom in non-blocking mode. In practice, this means that it
falls back on reading ``/dev/urandom`` on Linux.
_PyRandom_Init() is modified to call _PyOS_URandom_Nonblocking().
Moreover, a new ``random_inst_seed`` will be added to the
``_Py_HashSecret_t`` structure (see above).
random._inst will be initialized with the ``random_inst_seed`` secret. A
flag will be used to ensure that this secret is only used once.
If a second instance of random.Random is created, blocking os.urandom()
will be used.
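The "use the startup secret once, then block" rule can be sketched in pure Python. This is only an illustration of the flag logic: the ``StartupSeed`` class and its ``get()`` method are hypothetical names, and the real change would live in C next to ``_Py_HashSecret_t``.

```python
import os


class StartupSeed:
    """Hands out a pre-generated seed once, then defers to os.urandom()."""

    def __init__(self, seed: bytes):
        self._seed = seed
        self._used = False  # the "flag" described in the text

    def get(self, nbytes: int) -> bytes:
        # First request: consume the seed gathered without blocking at startup.
        if not self._used and len(self._seed) >= nbytes:
            self._used = True
            return self._seed[:nbytes]
        # Any later request goes to the (potentially blocking) system RNG.
        return os.urandom(nbytes)


startup = StartupSeed(b"\x01" * 16)   # stand-in for random_inst_seed
first = startup.get(16)               # served from the startup seed
second = startup.get(16)              # served by os.urandom()
```

With this shape, ``import random`` never waits on the kernel, while a second ``random.Random()`` instance still gets fresh bytes from the system RNG.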
Alternative
===========
Never use blocking urandom in the random module
-----------------------------------------------
The random module can use ``random_inst_seed`` as a seed, but add other
sources of entropy like the process identifier (``os.getpid()``), the
current time (``time.time()``), memory addresses, etc.
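A minimal sketch of this alternative, assuming the extra inputs are simply hashed together with the startup seed. The function name ``mixed_seed`` and the choice of SHA-512 are illustrative assumptions, not part of the proposal, and the result is only suitable for the non-cryptographic ``random`` module.

```python
import hashlib
import os
import random
import time


def mixed_seed(startup_seed: bytes) -> int:
    """Combine a startup seed with weak, non-secret process-local values."""
    h = hashlib.sha512()
    h.update(startup_seed)
    h.update(os.getpid().to_bytes(8, "little"))      # process identifier
    h.update(repr(time.time()).encode("ascii"))      # current time
    h.update(id(h).to_bytes(8, "little"))            # a memory address in CPython
    return int.from_bytes(h.digest(), "big")


# Seed a Mersenne Twister instance without touching os.urandom().
rng = random.Random(mixed_seed(b"example-seed"))
print(rng.random())
```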
Reading 2500 bytes from os.urandom() to initialize the Mersenne Twister
RNG in random.Random is a deliberate choice to get access to the full
range of the RNG. This PEP is a compromise between "security" and
"feature". Python should not block at startup before the OS has collected
enough entropy. But in the regular use case (OS urandom initialized),
the random module should continue to use its existing code to initialize
the seed.
Python 3.5.0 was blocked on ``import random``, not on building a second
instance of ``random.Random``.
Annexes
=======
Why use os.urandom()?
---------------------
Since ``os.urandom()`` is implemented in the kernel, it avoids some of the
issues of user-space RNGs. For example, it is much harder to obtain its
state. It is usually built on a CSPRNG, so even if its state is obtained,
it is hard to compute previously generated numbers. The kernel has good
knowledge of entropy sources and regularly feeds the entropy pool.
Linux getrandom()
-----------------
On OpenBSD, FreeBSD and Mac OS X, reading /dev/urandom blocks until the
kernel has collected enough entropy. This is not the case on Linux.
Basically, if a design choice has to be made between usability and
security, usability is preferred on Linux, whereas security is preferred
on BSD.
The new ``getrandom()`` of Linux 3.17 allows users to choose security by
blocking until the kernel has collected enough entropy.
On virtual machines and some embedded devices, it can take longer than a
minute to collect enough entropy. In the worst case, the application
will block forever because the kernel really has no entropy source and
so cannot unblock ``getrandom()``.
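Because an unbounded wait can hang forever, an application may prefer to poll with a deadline. The following sketch assumes the ``os.getrandom()`` function and ``os.GRND_NONBLOCK`` flag available in Python 3.6+ on Linux. The helper name ``wait_until_rng_ready`` and its parameters are hypothetical.

```python
import errno
import os
import time


def wait_until_rng_ready(timeout: float, poll_interval: float = 0.1) -> bool:
    """Return True once the kernel RNG is ready, False if `timeout` expires."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        raise NotImplementedError("os.getrandom() is not available")
    deadline = time.monotonic() + timeout
    while True:
        try:
            getrandom(1, os.GRND_NONBLOCK)
            return True
        except BlockingIOError:
            # Entropy pool not initialised yet: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                raise NotImplementedError("kernel lacks getrandom()") from exc
            raise
```

A caller can then decide what to do when ``False`` comes back (abort, log, or fall back to a non-security use), instead of blocking indefinitely.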
Copyright
=========
This document has been placed in the public domain.

June 25, 2016
Hi folks,
Over the weekend, Nathaniel Smith and I put together a proposal to
allow security sensitive APIs (os.urandom, random.SystemRandom and the
new secrets module) to throw BlockingIOError if the operating system's
random number generator isn't ready.
We think this approach provides all the desired security guarantees,
while being relatively straightforward for affected system integrators
to diagnose and appropriately resolve if they're currently using these
APIs in a context where Linux is currently feeding them potentially
predictable random values.
Rendered: https://www.python.org/dev/peps/pep-0522/
GitHub: https://github.com/python/peps/blob/master/pep-0522.txt
The "Additional Background" section is mainly for the sake of folks
that haven't been following any of the previous discussions, but also
provides the reasoning for why we don't consider retaining consistency
with "man urandom" to be a useful design goal (any more than the
builtin open tries to retain consistency with "man open")
Cheers,
Nick.
=================
PEP: 522
Title: Allow BlockingIOError in security sensitive APIs on Linux
Version: $Revision$
Last-Modified: $Date$
Author: Nick Coghlan <ncoghlan(a)gmail.com>, Nathaniel J. Smith <njs(a)pobox.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16 June 2016
Python-Version: 3.6
Abstract
========
A number of APIs in the standard library that return random values nominally
suitable for use in security sensitive operations currently have an obscure
Linux-specific failure mode that allows them to return values that are not,
in fact, suitable for such operations.
This PEP proposes changing such failures in Python 3.6 from the current silent,
hard to detect, and hard to debug, errors to easily detected and debugged errors
by raising ``BlockingIOError`` with a suitable error message, allowing
developers the opportunity to unambiguously specify their preferred approach
for handling the situation.
The APIs affected by this change would be:
* ``os.urandom``
* ``random.SystemRandom``
* the new ``secrets`` module added by PEP 506
The new exception would potentially be encountered in the following situations:
* Python code calling these APIs during Linux system initialization
* Python code running on improperly initialized Linux systems (e.g. embedded
  hardware without adequate sources of entropy to seed the system random number
  generator, or Linux VMs that aren't configured to accept entropy from the
  VM host)
CPython interpreter initialization and ``random`` module initialization would
also be updated to gracefully fall back to alternative seeding options if the
system random number generator is not ready.
Proposal
========
Changing ``os.urandom()`` on Linux
----------------------------------
This PEP proposes that in Python 3.6+, ``os.urandom()`` be updated to call
the new Linux ``getrandom()`` syscall in non-blocking mode if available and
raise ``BlockingIOError: system random number generator is not ready`` if
the kernel reports that the call would block.
This behaviour will then
propagate through to higher level standard library APIs that depend on
``os.urandom`` (specifically ``random.SystemRandom`` and the new ``secrets``
module introduced by PEP 506).
In all cases, as soon as a call to one of these security sensitive APIs
succeeds, all future calls to these APIs in that process will succeed (once
the operating system random number generator is ready after system boot, it
remains ready).
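The proposed semantics can be approximated in pure Python. This is a sketch under assumptions, not the CPython implementation: it relies on ``os.getrandom()`` and ``os.GRND_NONBLOCK`` as shipped in Python 3.6+ on Linux, and the wrapper name ``urandom_or_raise`` is invented for illustration. Where the syscall is unavailable it keeps the existing ``os.urandom()`` behaviour, matching the statement that older Linux systems remain unchanged.

```python
import errno
import os


def urandom_or_raise(num_bytes: int) -> bytes:
    """Return random bytes, or raise BlockingIOError if the RNG is not ready."""
    getrandom = getattr(os, "getrandom", None)
    if getrandom is None:
        return os.urandom(num_bytes)  # no getrandom(): behaviour unchanged
    chunks = []
    remaining = num_bytes
    while remaining > 0:
        try:
            data = getrandom(remaining, os.GRND_NONBLOCK)
        except BlockingIOError:
            raise BlockingIOError(
                errno.EAGAIN, "system random number generator is not ready"
            ) from None
        except OSError as exc:
            if exc.errno == errno.ENOSYS:
                return os.urandom(num_bytes)  # older kernel without the syscall
            raise
        chunks.append(data)          # getrandom() may return a short read
        remaining -= len(data)
    return b"".join(chunks)


token = urandom_or_raise(16)
```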
Related changes
---------------
Currently, SipHash initialization and ``random`` module initialization
both gather random bytes using the same code that underlies
``os.urandom``. This PEP proposes to modify these so that in situations where
``os.urandom`` would raise a ``BlockingIOError``, they automatically
fall back on potentially more predictable sources of randomness (and in the
SipHash case, print a warning message to ``stderr`` indicating that that
particular Python process should not be used to process untrusted data).
To transparently accommodate a potential future where Linux adopts the same
"potentially blocking during system initialization" ``/dev/urandom`` behaviour
used by other \*nix systems, this fallback source of randomness will *not* be
the ``/dev/urandom`` device.
Limitations on scope
--------------------
No changes are proposed for Windows or Mac OS X systems, as neither of those
platforms provides any mechanism to run Python code before the operating
system random number generator has been initialized. Mac OS X goes so far as
to kernel panic and abort the boot process if it can't properly initialize the
random number generator (although Apple's restrictions on the supported
hardware platforms make that exceedingly unlikely in practice).
Similarly, no changes are proposed for other \*nix systems where
``os.urandom()`` will currently block waiting for the system random number
generator to be initialized, rather than returning values that are potentially
unsuitable for use in security sensitive applications.
While other \*nix systems that offer a non-blocking API for requesting random
numbers suitable for use in security sensitive applications could potentially
receive a similar update to the one proposed for Linux in this PEP, such
changes are out of scope for this particular proposal.
Python's behaviour on older Linux systems that do not offer the new
``getrandom()`` syscall will also remain unchanged.
Rationale
=========
Raising ``BlockingIOError`` in ``os.urandom()`` on Linux
--------------------------------------------------------
For several years now, the security community's guidance has been to use
``os.urandom()`` (or the ``random.SystemRandom()`` wrapper) when implementing
security sensitive operations in Python.
To help improve API discoverability and make it clearer that secrecy and
simulation are not the same problem (even though they both involve
random numbers), PEP 506 collected several of the one line recipes based
on the lower level ``os.urandom()`` API into a new ``secrets`` module.
However, this guidance has also come with a longstanding caveat: developers
writing security sensitive software at least for Linux, and potentially for
some other \*BSD systems, may need to wait until the operating system's
random number generator is ready before relying on it for security sensitive
operations. This generally only occurs if ``os.urandom()`` is read very
early in the system initialization process, or on systems with few sources of
available entropy (e.g. some kinds of virtualized or embedded systems), but
unfortunately the exact conditions that trigger this are difficult to predict,
and when it occurs then there is no direct way for userspace to tell it has
happened without querying operating system specific interfaces.
On \*BSD systems (if the particular \*BSD variant allows the problem to occur
at all), encountering this situation means ``os.urandom()`` will either block
waiting for the system random number generator to be ready (the associated
symptom would be for the affected script to pause unexpectedly on the first
call to ``os.urandom()``) or else will behave the same way as it does on Linux.
On Linux, in Python versions up to and including Python 3.4, and in
Python 3.5 maintenance versions following Python 3.5.2, there's no clear
indicator to developers that their software may not be working as expected
when run early in the Linux boot process, or on hardware without good
sources of entropy to seed the operating system's random number generator: due
to the behaviour of the underlying ``/dev/urandom`` device, ``os.urandom()``
on Linux returns a result either way, and it takes extensive statistical
analysis to show that a security vulnerability exists.
By contrast, if ``BlockingIOError`` is raised in those situations, then
developers using Python 3.6+ can easily choose their desired behaviour:
1. Loop until the call succeeds (security sensitive)
2. Switch to using the random module (non-security sensitive)
3. Switch to reading ``/dev/urandom`` directly (non-security sensitive)
Issuing a warning for potentially predictable internal hash initialization
--------------------------------------------------------------------------
The challenge for internal hash initialization is that it might be very
important to initialize SipHash with a reliably unpredictable random seed
(for processes that are exposed to potentially hostile input) or it might be
totally unimportant (for processes that never have to deal with untrusted data).
The Python runtime has no way to know which case a given invocation involves,
which means that if we allow SipHash initialization to block or error out,
then our intended security enhancement may break code that is already safe
and working fine, which is unacceptable -- especially since we are reasonably
confident that most Python invocations that might run during Linux system
initialization fall into this category (exposure to untrusted input tends to
involve network access, which typically isn't brought up until after the system
random number generator is initialized).
However, at the same time, since Python has no way to know whether any given
invocation needs to handle untrusted data, when the default SipHash
initialization fails this *might* indicate a genuine security problem, which
should not be allowed to pass silently.
Accordingly, if internal hash initialization needs to fall back to a potentially
predictable seed due to the system random number generator not being ready, it
will also emit a warning message on ``stderr`` to say that the system random
number generator is not available and that processing potentially hostile
untrusted data should be avoided.
Allowing potentially predictable ``random`` module initialization
-----------------------------------------------------------------
Other than for ``random.SystemRandom`` (which is a relatively thin
wrapper around ``os.urandom``), the ``random`` module has never made
any guarantees that the numbers it generates are suitable for use in
security sensitive operations, so the use of the system random number
generator to seed the default Mersenne Twister instance is mainly beneficial
as a harm mitigation measure for code that is using the ``random`` module
inappropriately.
Since a single call to ``os.urandom()`` is cheap once the system random
number generator has been initialized, it makes sense to retain that as the
default behaviour, but there's no need to issue a warning when falling back to
a potentially more predictable alternative when necessary (in such cases,
a warning will typically already have been issued as part of interpreter
startup, as the only way for the call when importing the random module to
fail without the implicit call during interpreter startup also failing is for
the latter to have been skipped by entirely disabling the hash randomization
mechanism).
Backwards Compatibility Impact Assessment
=========================================
Similar to PEP 476, this is a proposal to turn a previously silent security
failure into a noisy exception that requires the application developer to
make an explicit decision regarding the behaviour they desire.
As no changes are proposed for operating systems other than Linux,
``os.urandom()`` retains its existing behaviour as a nominally blocking API
that is non-blocking in practice due to the difficulty of scheduling Python
code to run before the operating system random number generator is ready. We
believe it may be possible to encounter problems akin to those described in
this PEP on at least some \*BSD variants, but nobody has explicitly
demonstrated that. On Mac OS X and Windows, it appears to be straight up
impossible to even try to run a Python interpreter that early in the boot
process.
On Linux, ``os.urandom()`` retains its status as a guaranteed non-blocking API.
However, the means of achieving that status changes in the specific case of
the operating system random number generator not being ready for use in security
sensitive operations: historically it would return potentially predictable
random data, with this PEP it would change to raise ``BlockingIOError``.
Developers of affected applications would then be required to make one of the
following changes to gain forward compatibility with Python 3.6, based on the
kind of application they're developing.
Unaffected Applications
-----------------------
The following kinds of applications would be entirely unaffected by the change,
regardless of whether or not they perform security sensitive operations:
- applications that don't support Linux
- applications that are only run on desktops or conventional servers
- applications that are only run after the system RNG is ready
Applications in this category simply won't encounter the new exception, so it
will be reasonable for developers to wait and see if they receive
Python 3.6 compatibility bugs related to the new runtime behaviour, rather than
attempting to pre-emptively determine whether or not they're affected.
Affected security sensitive applications
----------------------------------------
Security sensitive applications would need to either change their system
configuration so the application is only started after the operating system
random number generator is ready for security sensitive operations, or else
change their code to busy loop until the operating system is ready::

    import os

    def blocking_urandom(num_bytes):
        while True:
            try:
                return os.urandom(num_bytes)
            except BlockingIOError:
                pass
Affected non-security sensitive applications
--------------------------------------------
Non-security sensitive applications that don't want to assume access to
``/dev/urandom`` (or assume a non-blocking implementation of that device)
can be updated to use the ``random`` module as a fallback option::

    import os
    import random

    def pseudorandom_fallback(num_bytes):
        try:
            return os.urandom(num_bytes)
        except BlockingIOError:
            return random.getrandbits(num_bytes*8).to_bytes(num_bytes, "little")
Depending on the application, it may also be appropriate to skip accessing
``os.urandom`` at all, and instead rely solely on the ``random`` module.
Affected Linux specific non-security sensitive applications
-----------------------------------------------------------
Non-security sensitive applications that don't need to worry about cross
platform compatibility and are willing to assume that ``/dev/urandom`` on
Linux will always retain its current behaviour can be updated to access
``/dev/urandom`` directly::

    def dev_urandom(num_bytes):
        with open("/dev/urandom", "rb") as f:
            return f.read(num_bytes)
However, pursuing this option has the downside of contributing to ensuring
that the default behaviour of Linux at the operating system level can never
be changed.
Additional Background
=====================
Why propose this now?
---------------------
The main reason is because the Python 3.5.0 release switched to using the new
Linux ``getrandom()`` syscall when available in order to avoid consuming a
file descriptor [1]_, and this had the side effect of making the following
operations block waiting for the system random number generator to be ready:
* ``os.urandom`` (and APIs that depend on it)
* importing the ``random`` module
* initializing the randomized hash algorithm used by some builtin types
While the first of those behaviours is arguably desirable (and consistent with
``os.urandom``'s existing behaviour on other operating systems), the latter two
behaviours are unnecessary and undesirable, and the last one is now known to
cause a system level deadlock when attempting to run Python scripts during the
Linux init process with Python 3.5.0 or 3.5.1 [2]_, while the second one can
cause problems when using virtual machines without robust entropy sources
configured [3]_.
Since decoupling these behaviours in CPython will involve a number of
implementation changes more appropriate for a feature release than a maintenance
release, the relatively simple resolution applied in Python 3.5.2 was to revert
all three of them to a behaviour similar to that of previous Python versions:
if the new Linux syscall indicates it will block, then Python 3.5.2 will
implicitly fall back on reading ``/dev/urandom`` directly [4]_.
However, this bug report *also* resulted in a range of proposals to add *new*
APIs like ``os.getrandom()`` [5]_, ``os.urandom_block()`` [6]_,
``os.pseudorandom()`` and ``os.cryptorandom()`` [7]_, or adding new optional
parameters to ``os.urandom()`` itself [8]_, and then attempting to educate
users on when they should call those APIs instead of just using a plain
``os.urandom()`` call.
These proposals represent dramatic overreactions, as the question of reliably
obtaining random numbers suitable for security sensitive work on Linux is a
relatively obscure problem of interest mainly to operating system developers
and embedded systems programmers, that in no way justifies cluttering up the
Python standard library's cross-platform APIs with new Linux-specific concerns.
This is especially so with the ``secrets`` module already being added as the
"use this and don't worry about the low level details" option for developers
writing security sensitive software that for some reason can't rely on even
higher level domain specific APIs (like web frameworks) and also don't need to
worry about Python versions prior to Python 3.6.
That said, it's also the case that low cost ARM devices are becoming
increasingly prevalent, with a lot of them running Linux, and a lot of folks
writing Python applications that run on those devices. That creates an
opportunity to take an obscure security problem that currently requires a lot
of knowledge about Linux boot processes and provably unpredictable random
number generation to diagnose and resolve, and instead turn it into a
relatively mundane and easy-to-find-in-an-internet-search runtime exception.
The cross-platform behaviour of ``os.urandom()``
------------------------------------------------
On operating systems other than Linux, ``os.urandom()`` may already block
waiting for the operating system's random number generator to be ready. This
will happen at most once in the lifetime of the process, and the call is
subsequently guaranteed to be non-blocking.
Linux is unique in that, even when the operating system's random number
generator doesn't consider itself ready for use in security sensitive
operations, reading from the ``/dev/urandom`` device will return random values
based on the entropy it has available.
This behaviour is potentially problematic, so Linux 3.17 added a new
``getrandom()`` syscall that (amongst other benefits) allows callers to
either block waiting for the random number generator to be ready, or
else request an error return if the random number generator is not ready.
Notably, the new API does *not* support the old behaviour of returning
data that is not suitable for security sensitive use cases.
Versions of Python up to and including Python 3.4 access the
Linux ``/dev/urandom`` device directly.
Python 3.5.0 and 3.5.1 called ``getrandom()`` in blocking mode in order to
avoid the use of a file descriptor to access ``/dev/urandom``. While there
were no specific problems reported due to ``os.urandom()`` blocking in user
code, there *were* problems due to CPython implicitly invoking the blocking
behaviour during interpreter startup and when importing the ``random`` module.
Rather than trying to decouple SipHash initialization from the
``os.urandom()`` implementation, Python 3.5.2 switched to calling
``getrandom()`` in non-blocking mode, and falling back to reading from
``/dev/urandom`` if the syscall indicates it will block.
As a result of the above, ``os.urandom()`` in all Python versions up to and
including Python 3.5 propagates the behaviour of the underlying
``/dev/urandom`` device to Python code.
Problems with the behaviour of ``/dev/urandom`` on Linux
--------------------------------------------------------
The Python ``os`` module has largely co-evolved with Linux APIs, so having
``os`` module functions closely follow the behaviour of their Linux operating
system level counterparts when running on Linux is typically considered to be
a desirable feature.
However, ``/dev/urandom`` represents a case where the current behaviour is
acknowledged to be problematic, but fixing it unilaterally at the kernel level
has been shown to prevent some Linux distributions from booting (at least in
part due to components like Python currently using it for
non-security-sensitive purposes early in the system initialization process).
As an analogy, consider the following two functions::

    def generate_example_password():
        """Generates passwords solely for use in code examples"""
        return generate_unpredictable_password()

    def generate_actual_password():
        """Generates actual passwords for use in real applications"""
        return generate_unpredictable_password()
If you think of an operating system's random number generator as a method for
generating unpredictable, secret passwords, then you can think of Linux's
``/dev/urandom`` as being implemented like::

    # Oversimplified artist's conception of the kernel code
    # implementing /dev/urandom
    def generate_unpredictable_password():
        if system_rng_is_ready:
            return use_system_rng_to_generate_password()
        else:
            # we can't make an unpredictable password; silently return a
            # potentially predictable one instead:
            return "p4ssw0rd"
In this scenario, the author of ``generate_example_password`` is fine - even if
``"p4ssw0rd"`` shows up a bit more often than they expect, it's only used in
examples anyway. However, the author of ``generate_actual_password`` has a
problem - how do they prove that their calls to
``generate_unpredictable_password`` never follow the path that returns a
predictable answer?
In real life it's slightly more complicated than this, because there
might be some level of system entropy available -- so the fallback might
be more like ``return random.choice(["p4ssword", "passw0rd",
"p4ssw0rd"])`` or something even more variable and hence only statistically
predictable with better odds than the author of ``generate_actual_password``
was expecting. This doesn't really make things more provably secure, though;
mostly it just means that if you try to catch the problem in the obvious way --
``if returned_password == "p4ssw0rd": raise UhOh`` -- then it doesn't work,
because ``returned_password`` might instead be ``p4ssword`` or even
``pa55word``, or just an arbitrary 64 bit sequence selected from fewer than
2**64 possibilities. So this rough sketch does give the right general idea of
the consequences of the "more predictable than expected" fallback behaviour,
even though it's thoroughly unfair to the Linux kernel team's efforts to
mitigate the practical consequences of this problem without resorting to
breaking backwards compatibility.
This design is generally agreed to be a bad idea. As far as we can
tell, there are no use cases whatsoever in which this is the behavior
you actually want. It has led to the use of insecure ``ssh`` keys on
real systems, and many \*nix-like systems (including at least Mac OS
X, OpenBSD, and FreeBSD) have modified their ``/dev/urandom``
implementations so that they never return predictable outputs, either
by making reads block in this case, or by simply refusing to run any
userspace programs until the system RNG has been
initialized. Unfortunately, Linux has so far been unable to follow
suit, because it's been empirically determined that enabling the
blocking behavior causes some currently extant distributions to
fail to boot.
Instead, the new ``getrandom()`` syscall was introduced, making
it *possible* for userspace applications to access the system random number
generator safely, without introducing hard to debug deadlock problems into
the system initialization processes of existing Linux distros.
Consequences of ``getrandom()`` availability for Python
-------------------------------------------------------
Prior to the introduction of the ``getrandom()`` syscall, it simply wasn't
feasible to access the Linux system random number generator in a provably
safe way, so we were forced to settle for reading from ``/dev/urandom`` as the
best available option. However, with ``getrandom()`` insisting on raising an
error or blocking rather than returning predictable data, as well as having
other advantages, it is now the recommended method for accessing the kernel
RNG on Linux, with reading ``/dev/urandom`` directly relegated to "legacy"
status. This moves Linux into the same category as other operating systems
like Windows, which doesn't provide a ``/dev/urandom`` device at all: the
best available option for implementing ``os.urandom()`` is no longer simply
reading bytes from the ``/dev/urandom`` device.
This means that what used to be somebody else's problem (the Linux kernel
development team's) is now Python's problem -- given a way to detect that the
system RNG is not initialized, we have to choose how to handle this
situation whenever we try to use the system RNG.
It could simply block, as was somewhat inadvertently implemented in 3.5.0::

    # artist's impression of the CPython 3.5.0-3.5.1 behavior
    def generate_unpredictable_bytes_or_block(num_bytes):
        while not system_rng_is_ready:
            wait
        return unpredictable_bytes(num_bytes)

Or it could raise an error, as this PEP proposes (in *some* cases)::

    # artist's impression of the behavior proposed in this PEP
    def generate_unpredictable_bytes_or_raise(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            raise BlockingIOError

Or it could explicitly emulate the ``/dev/urandom`` fallback behavior,
as was implemented in 3.5.2rc1 and is expected to remain for the rest
of the 3.5.x cycle::

    # artist's impression of the CPython 3.5.2rc1+ behavior
    def generate_unpredictable_bytes_or_maybe_not(num_bytes):
        if system_rng_is_ready:
            return unpredictable_bytes(num_bytes)
        else:
            return (b"p4ssw0rd" * (num_bytes // 8 + 1))[:num_bytes]

(And the same caveats apply to this sketch as applied to the
``generate_unpredictable_password`` sketch of ``/dev/urandom`` above.)
There are five places where CPython and the standard library attempt to use the
operating system's random number generator, and thus five places where this
decision has to be made:
* initializing the SipHash used to protect ``str.__hash__`` and
friends against DoS attacks (called unconditionally at startup)
* initializing the ``random`` module (called when ``random`` is
imported)
* servicing user calls to the ``os.urandom`` public API
* the higher level ``random.SystemRandom`` public API
* the new ``secrets`` module public API added by PEP 506
Currently, these five places all use the same underlying code, and
thus make this decision in the same way.
This whole problem was first noticed because 3.5.0 switched that
underlying code to the ``generate_unpredictable_bytes_or_block`` behavior,
and it turns out that there are some rare cases where Linux boot
scripts attempted to run a Python program as part of system initialization, the
Python startup sequence blocked while trying to initialize SipHash,
and then this triggered a deadlock because the system stopped doing
anything -- including gathering new entropy -- until the Python script
was forcibly terminated by an external timer. This is particularly unfortunate
since the scripts in question never processed untrusted input, so there was no
need for SipHash to be initialized with provably unpredictable random data in
the first place. This motivated the change in 3.5.2rc1 to emulate the old
``/dev/urandom`` behavior in all cases (by calling ``getrandom()`` in
non-blocking mode, and then falling back to reading ``/dev/urandom``
if the syscall indicates that the ``/dev/urandom`` pool is not yet
fully initialized.)
A similar problem was found due to the ``random`` module calling
``os.urandom`` as a side-effect of import in order to seed the default
global ``random.Random()`` instance.
We have not received any specific complaints regarding direct calls to
``os.urandom()`` or ``random.SystemRandom()`` blocking with 3.5.0 or 3.5.1 -
only problem reports due to the implicit blocking on interpreter startup and
as a side-effect of importing the random module.
Accordingly, this PEP proposes providing consistent shared behaviour for the
latter three cases (ensuring that their behaviour is unequivocally suitable for
all security sensitive operations), while updating the first two cases to
account for that behavioural change.
This approach should mean that the vast majority of Python users never need to
even be aware that this change was made, while those few whom it affects will
receive an exception at runtime that they can look up online and find suitable
guidance on addressing.
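As an illustration of how calling code might react to such an exception, here
is a minimal sketch built on ``os.getrandom()`` with ``GRND_NONBLOCK``
(available in Python 3.6+ on Linux), which raises ``BlockingIOError`` while the
kernel RNG is not ready. The helper name ``random_bytes_with_deadline``, the
timeout values, and the fallback to ``os.urandom()`` on platforms without
``os.getrandom()`` are assumptions made for this example; they are not part of
the PEP.

```python
import os
import time


def random_bytes_with_deadline(num_bytes, timeout=5.0, poll_interval=0.1):
    """Return num_bytes of kernel randomness, waiting at most `timeout` seconds.

    Hypothetical helper: retries while the kernel RNG reports "not ready"
    and re-raises BlockingIOError once the deadline has passed.
    """
    if not hasattr(os, "getrandom"):
        # Assumption: without getrandom(), os.urandom() is the best option.
        return os.urandom(num_bytes)

    deadline = time.monotonic() + timeout
    data = bytearray()
    while len(data) < num_bytes:
        try:
            # GRND_NONBLOCK makes the call fail instead of blocking.
            # getrandom() may return fewer bytes than requested, hence the loop.
            data += os.getrandom(num_bytes - len(data), os.GRND_NONBLOCK)
        except BlockingIOError:
            if time.monotonic() >= deadline:
                raise  # let the caller decide what to do next
            time.sleep(poll_interval)
    return bytes(data)
```

A caller that prefers to fail fast can pass ``timeout=0`` and handle the
``BlockingIOError`` itself.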
References
==========
.. [1] os.urandom() should use Linux 3.17 getrandom() syscall
(http://bugs.python.org/issue22181)
.. [2] Python 3.5 running on Linux kernel 3.17+ can block at startup or on
importing the random module on getrandom()
(http://bugs.python.org/issue26839)
.. [3] "import random" blocks on entropy collection on Linux with low entropy
(http://bugs.python.org/issue25420)
.. [4] os.urandom() doesn't block on Linux anymore
(https://hg.python.org/cpython/rev/9de508dc4837)
.. [5] Proposal to add os.getrandom()
(http://bugs.python.org/issue26839#msg267803)
.. [6] Add os.urandom_block()
(http://bugs.python.org/issue27250)
.. [7] Add random.cryptorandom() and random.pseudorandom, deprecate os.urandom()
(http://bugs.python.org/issue27279)
.. [8] Always use getrandom() in os.random() on Linux and add
block=False parameter to os.urandom()
(http://bugs.python.org/issue27266)
For additional background details beyond those captured in this PEP, also see
Victor Stinner's summary at http://haypo-notes.readthedocs.io/pep_random.html
Copyright
=========
This document has been placed into the public domain.
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
Hi,
I completed my PEP. Here is a second version of my PEP. Changes:
* I added new sections:
- The bug
- Use Cases
- Fix system urandom
- Denial-of-service when reading random
* I added alternatives:
- Leave os.urandom() unchanged, add os.getrandom()
- Raise BlockingIOError in os.urandom()
- Add an optional block parameter to os.urandom()
I added 3 sections to try to describe the context of "the bug". For
example, I think that it's important to mention that all operating
systems load entropy from disk at boot.
For me, the last tricky question is the use case 2 (run a web server)
on a VM or embedded when system urandom is not initialized yet and
there is no entropy on disk yet (ex: first boot, or maybe second boot,
of a VM).
I read quickly that a VM connected to a network should be able to
quickly initialize the system urandom. So I'm not sure that use
case 2 (web server) is really an issue in practice.
Victor
HTML version:
https://haypo-notes.readthedocs.io/pep_random.html
++++++++++++++++++++++++++++++++++++++++
PEP: Make os.urandom() blocking on Linux
++++++++++++++++++++++++++++++++++++++++
Headers::
PEP: xxx
Title: Make os.urandom() blocking on Linux
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-June-2016
Python-Version: 3.6
Abstract
========
Modify ``os.urandom()`` to block on Linux 3.17 and newer until the OS
urandom is initialized.
The bug
=======
Python 3.5.0 was enhanced to use the new ``getrandom()`` syscall
introduced in Linux 3.17 and Solaris 11.3. The problem is that users
started to complain that Python 3.5 blocks at startup on Linux in
virtual machines and embedded devices: see issues `#25420
<http://bugs.python.org/issue25420>`_ and `#26839
<http://bugs.python.org/issue26839>`_.
On Linux, ``getrandom(0)`` blocks until the kernel has initialized urandom
with 128 bits of entropy. Issue #25420 describes a Linux build
platform blocking at ``import random``. Issue #26839 describes a
short Python script from systemd-cron, used to compute an MD5 hash and
called very early in the init process. The system initialization blocks
on this script, which in turn blocks on ``getrandom(0)`` while
initializing Python.
The Python initialization requires random bytes to implement a
counter-measure against the hash denial-of-service (hash DoS), see:
* `Issue #13703: Hash collision security issue
<http://bugs.python.org/issue13703>`_
* `PEP 456: Secure and interchangeable hash algorithm
<https://www.python.org/dev/peps/pep-0456/>`_
Importing the ``random`` module creates an instance of
``random.Random``: ``random._inst``. On Python 3.5, the ``random.Random``
constructor reads 2500 bytes from ``os.urandom()`` to seed a Mersenne
Twister RNG (random number generator).
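A rough Python-level equivalent of that seeding step is sketched below. The
real seeding happens in C inside the ``random`` module; the variable names here
are only illustrative. The ``os.urandom()`` call is the point where Python 3.5.0
could block while the system urandom was not yet initialized.

```python
import os
import random

# Read 2500 bytes from the OS RNG, as described above.
seed_bytes = os.urandom(2500)

# Seed a private Mersenne Twister instance from those bytes.
rng = random.Random(int.from_bytes(seed_bytes, "big"))

value = rng.random()  # a float in the range [0.0, 1.0)
```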
Other platforms may be affected by this bug, but in practice, only Linux
systems use Python scripts to initialize the system.
Use Cases
=========
The following use cases help to choose the right compromise
between security and practicability.
Use Case 1: init script
-----------------------
Use a Python 3 script to initialize the system, like systemd-cron. If
the script blocks, the system initialization is stuck too.
The issue #26839 is a good example of this use case.
Use Case 2: web server
----------------------
Run a Python 3 web server serving web pages using HTTP and HTTPS
protocols. The server is started as soon as possible.
The first targets of the hash DoS attack were web servers: it's important
that the hash secret cannot be easily guessed by an attacker.
If serving a web page needs a secret to create a cookie, create an
encryption key, ..., the secret must be created with good entropy:
again, it must be hard to guess the secret.
A web server requires security. If a choice must be made between
security and running the server with weak entropy, security is more
important. If there is no good entropy: the server must block or fail
with an error.
The question is whether it makes sense to start a web server on a host
before system urandom is initialized.
Issues #25420 and #26839 are restricted to the Python startup; they are
not about generating a secret before the system urandom is initialized.
Fix system urandom
==================
Load entropy from disk at boot
-------------------------------
Collecting entropy can take several minutes. To accelerate the system
initialization, operating systems store entropy on disk at shutdown, and
then reload entropy from disk at boot.
If a system collects enough entropy at least once, the system urandom
will be initialized quickly, as soon as the entropy is reloaded from
disk.
Virtual machines
----------------
Virtual machines don't have direct access to the hardware and so have
fewer sources of entropy than bare metal. A solution is to add a
`virtio-rng device
<https://fedoraproject.org/wiki/Features/Virtio_RNG>`_ to pass entropy
from the host to the virtual machine.
Embedded devices
----------------
A solution for embedded devices is to plug in a hardware RNG.
For example, the Raspberry Pi has a hardware RNG but it's not used by
default. See: `Hardware RNG on Raspberry Pi
<http://fios.sector16.net/hardware-rng-on-raspberry-pi/>`_.
Denial-of-service when reading random
=====================================
The ``/dev/random`` device should only be used for very specific use cases.
Reading from ``/dev/random`` on Linux is likely to block. Users don't
like it when an application blocks longer than 5 seconds to generate a
secret. Blocking is only expected in specific cases like explicitly
generating an encryption key.
When the system has no available entropy, choosing between blocking
until entropy is available or falling back on lower quality entropy is a
matter of compromise between security and practicability. The choice
depends on the use case.
On Linux, ``/dev/urandom`` is secure and should be used instead of
``/dev/random``:
* `Myths about /dev/urandom <http://www.2uo.de/myths-about-urandom/>`_
by Thomas Hühn: "Fact: /dev/urandom is the preferred source of
cryptographic randomness on UNIX-like systems"
Rationale
=========
On Linux, reading ``/dev/urandom`` can return "weak" entropy before
urandom is fully initialized, that is, before the kernel has collected
128 bits of entropy. Linux 3.17 adds a new ``getrandom()`` syscall which
makes it possible to block until urandom is initialized.
On Python 3.5.2, ``os.urandom()`` uses ``getrandom(GRND_NONBLOCK)``, but
falls back on reading the non-blocking ``/dev/urandom`` if
``getrandom(GRND_NONBLOCK)`` fails with ``EAGAIN``.
Security experts promote ``os.urandom()`` for generating cryptographic
keys. ``os.urandom()`` is also preferred over ``ssl.RAND_bytes()`` for
various reasons.
This PEP proposes to modify ``os.urandom()`` to use ``getrandom()`` in
blocking mode so that it does not return weak entropy, while also ensuring
that Python will not block at startup.
Changes
=======
All changes described in this section are specific to the Linux
platform.
* Initialize hash secret from non-blocking system urandom
* Initialize ``random._inst`` with non-blocking system urandom
* Modify os.urandom() to block (until system urandom is initialized)
A new ``_PyOS_URandom_Nonblocking()`` private function is added: it tries to
call ``getrandom(GRND_NONBLOCK)``, but falls back on reading
``/dev/urandom`` if it fails with ``EAGAIN``.
``_PyRandom_Init()`` is modified to call
``_PyOS_URandom_Nonblocking()``. Moreover, a new ``random_inst_seed``
field is added to the ``_Py_HashSecret_t`` structure.
``random._inst`` (an instance of ``random.Random``) is initialized with
the new ``random_inst_seed`` secret. A ("fuse") flag is used to ensure
that this secret is only used once.
If a second instance of random.Random is created, blocking
``os.urandom()`` is used.
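The one-shot ("fuse") behaviour can be sketched in Python as follows. The
class ``OneShotSeed`` and the function ``make_random`` are hypothetical names
used for illustration only; in the proposal, the seed lives in the C-level
``_Py_HashSecret_t`` structure.

```python
import os
import random


class OneShotSeed:
    """Hold a startup seed that can be taken exactly once (the "fuse")."""

    def __init__(self, seed):
        self._seed = seed
        self._used = False

    def take(self):
        """Return the seed on the first call, None on every later call."""
        if self._used:
            return None
        self._used = True
        seed, self._seed = self._seed, None  # do not keep the secret around
        return seed


def make_random(startup_seed):
    """Create a Random instance, using the startup seed only once."""
    seed = startup_seed.take()
    if seed is None:
        # Later instances: the proposal uses blocking os.urandom() here.
        seed = os.urandom(32)
    return random.Random(int.from_bytes(seed, "big"))
```

The first call to ``make_random()`` consumes the startup seed; every later call
reads fresh bytes from ``os.urandom()``.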
``os.urandom()`` (C function ``_PyOS_URandom()``) is modified to always
call ``getrandom(0)`` (blocking mode).
Alternative
===========
Never use blocking urandom in the random module
-----------------------------------------------
The random module can use ``random_inst_seed`` as a seed, but add other
sources of entropy like the process identifier (``os.getpid()``), the
current time (``time.time()``), memory addresses, etc.
Reading 2500 bytes from os.urandom() to initialize the Mersenne Twister
RNG in random.Random is a deliberate choice to get access to the full
range of the RNG. This PEP is a compromise between "security" and
"feature". Python should not block at startup before the OS has collected
enough entropy. But in the regular use case (system urandom
initialized), the random module should continue to use its existing code to
initialize the seed.
Python 3.5.0 was blocked on ``import random``, not on building a second
instance of ``random.Random``.
Leave os.urandom() unchanged, add os.getrandom()
------------------------------------------------
os.urandom() remains unchanged: never block, but it can return weak
entropy if system urandom is not initialized yet.
A new ``os.getrandom()`` function is added: thin wrapper to the
``getrandom()`` syscall.
Expected usage to write portable code::

    import os

    def my_random(n):
        # Prefer the blocking getrandom() syscall when it is exposed.
        if hasattr(os, 'getrandom'):
            return os.getrandom(n, 0)
        return os.urandom(n)

The problem with this change is that it expects users to understand
security well and to know each platform well. Python has the tradition of
hiding "implementation details". For example, ``os.urandom()`` is not a
thin wrapper to the ``/dev/urandom`` device: it uses
``CryptGenRandom()`` on Windows, it uses ``getentropy()`` on OpenBSD, it
tries ``getrandom()`` on Linux and Solaris or falls back on reading
``/dev/urandom``. Python already uses the best available system RNG
depending on the platform.
This PEP does not change the API, which has not changed since the creation
of Python:
* ``os.urandom()``, ``random.SystemRandom`` and ``secrets`` for security
* ``random`` module (except ``random.SystemRandom``) for all other usages
Raise BlockingIOError in os.urandom()
-------------------------------------
This idea was proposed as a compromise to let developers decide for themselves
how to handle the case:
* catch the exception and use another, weaker entropy source: read
``/dev/urandom`` on Linux, the Python ``random`` module (which is not
secure at all), time, process identifier, etc.
* don't catch the error, so the whole program fails with this fatal
exception
First of all, no user has complained yet that ``os.urandom()`` blocks. This
point is currently theoretical. The Python issues #25420 and #26839 were
restricted to the Python startup: users complained that Python was
blocked at startup.
Even if reading /dev/urandom blocks on OpenBSD, FreeBSD, Mac OS X, etc.
until urandom is initialized, no user has complained yet because Python is
not used in the process initializing the system and /dev/urandom is
quickly initialized. It looks like only Linux users hit the problem on
virtual machines or embedded devices, and only in some short Python
scripts used to initialize the system. Again, ``os.urandom()`` is
not used in such scripts (at least, not yet).
As with `Leave os.urandom() unchanged, add os.getrandom()`_, the problem is
that it makes the API more complex and so more error-prone.
Add an optional block parameter to os.urandom()
-----------------------------------------------
Add an optional block parameter to os.urandom(). The default value may
be ``True`` (block by default) or ``False`` (non-blocking).
The first technical issue is to implement ``os.urandom(block=False)`` on
all platforms. Only Linux 3.17 and newer has a well defined non-blocking
API.
See the `issue #27250: Add os.urandom_block()
<http://bugs.python.org/issue27250>`_.
As with `Raise BlockingIOError in os.urandom()`_, it doesn't seem worth it to
make the API more complex for a theoretical (or at least very rare) use
case.
As with `Leave os.urandom() unchanged, add os.getrandom()`_, the problem is
that it makes the API more complex and so more error-prone.
Annexes
=======
Operating system random functions
---------------------------------
``os.urandom()`` uses the following functions:
* OpenBSD: `getentropy()
<http://man.openbsd.org/OpenBSD-current/man2/getentropy.2>`_
(OpenBSD 5.6)
* Linux: `getrandom()
<http://man7.org/linux/man-pages/man2/getrandom.2.html>`_ (Linux 3.17)
-- see also `A system call for random numbers: getrandom()
<https://lwn.net/Articles/606141/>`_
* Solaris: `getentropy()
<https://docs.oracle.com/cd/E53394_01/html/E54765/getentropy-2.html#scrolltoc>`_,
`getrandom()
<https://docs.oracle.com/cd/E53394_01/html/E54765/getrandom-2.html>`_
(both need Solaris 11.3)
* Windows: `CryptGenRandom()
<https://msdn.microsoft.com/en-us/library/windows/desktop/aa379942%28v=vs.85…>`_
(Windows XP)
* UNIX, BSD: /dev/urandom, /dev/random
* OpenBSD: /dev/srandom
On Linux, commands to get the status of ``/dev/random`` (results are
numbers of bits)::

    $ cat /proc/sys/kernel/random/entropy_avail
    2850
    $ cat /proc/sys/kernel/random/poolsize
    4096

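The same values can be read from Python. The sketch below is a minimal
example; the helper name ``read_random_status`` is hypothetical, and the
function returns ``None`` on systems where these ``/proc`` files do not exist
(anything other than Linux).

```python
from pathlib import Path

# Directory exposing the kernel RNG status on Linux.
RANDOM_PROC_DIR = Path("/proc/sys/kernel/random")


def read_random_status(name):
    """Return the integer stored in /proc/sys/kernel/random/<name>.

    Returns None if the file is missing, unreadable, or not an integer.
    """
    try:
        return int((RANDOM_PROC_DIR / name).read_text().strip())
    except (OSError, ValueError):
        return None


entropy_avail = read_random_status("entropy_avail")
poolsize = read_random_status("poolsize")
```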
Why use os.urandom()?
---------------------
Since ``os.urandom()`` is implemented in the kernel, it doesn't have
some of the issues of user-space RNGs. For example, it is much harder to get
its state. It is usually built on a CSPRNG, so even if its state is obtained,
it is hard to compute previously generated numbers. The kernel has good
knowledge of entropy sources and regularly feeds the entropy pool.
Links
=====
* `Cryptographically secure pseudo-random number generator (CSPRNG)
<https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_…>`_
Copyright
=========
This document has been placed in the public domain.

June 23, 2016
Before I can possibly start thinking about what to do when the system's
CSPRNG is not yet initialized, I need to understand more about how it works.
Apparently there's a possible transition from the "not ready yet" ("bad")
state to "ready" ("good"), and all it takes is usually waiting for a second
or two. But is this a wait that only gets incurred once, somewhere early
after a boot, or is this something that can happen at any time?
--
--Guido van Rossum (python.org/~guido)

June 21, 2016
Nick expressed:
> The *actual bug* that triggered this latest firestorm of commentary
> (from experts and non-experts alike) had *nothing* to do with user
> code calling os.urandom, and instead was a combination of:
>
> - CPython startup requesting cryptographically secure randomness when
> it didn't need it
> - a systemd init script written in Python running before the kernel
> RNG was fully initialised
>
> That created a deadlock between CPython startup and the rest of the
> Linux init process, so the latter only continued when the systemd
> watchdog timed out and killed the offending script. As others have
> noted, this kind of deadlock scenario is generally impossible on other
> operating systems, as the operating system doesn't provide a way to
> run Python code before the random number generator is ready.
>
> The change Victor made in 3.5.2 to fall back to reading /dev/urandom
> directly if the getrandom() syscall returns EAGAIN (effectively
> reverting to the Python 3.4 behaviour) was the simplest possible fix
> for that problem (and an approach I thoroughly endorse, both for 3.5.2
> and for the life of the 3.5 series), but that doesn't make it the
> right answer for 3.6+.
>
> To repeat: the problem encountered was NOT due to user code calling
> os.urandom(), but rather due to the way CPython initialises its own
> internal hash algorithm at interpreter startup. However, due to the
> way CPython is currently implemented, fixing the regression in that
> not only changed the behaviour of CPython startup, it *also* changed
> the behaviour of every call to os.urandom() in Python 3.5.2+.
>
> For 3.6+, we can instead make it so that the only things that actually
> rely on cryptographic quality randomness being available are:
>
> - calling a secrets module API
> - calling a random.SystemRandom method
> - calling os.urandom directly
>
> These are all APIs that were either created specifically for use in
> security sensitive situations (secrets module), or have long been
> documented (both within our own documentation, and in third party
> documentation, books and Q&A sites) as being an appropriate choice for
> use in security sensitive situations (os.urandom and
> random.SystemRandom).
>
> However, we don't need to make those block waiting for randomness to
> be available - we can update them to raise BlockingIOError instead
> (which makes it trivial for people to decide for themselves how they
> want to handle that case).
>
> Along with that change, we can make it so that starting the
> interpreter will never block waiting for cryptographic randomness to
> be available (since it doesn't need it), and importing the random
> module won't block waiting for it either.
>
> To the best of our knowledge, on all operating systems other than
> Linux, encountering the new exception will still be impossible in
> practice, as there is no known opportunity to run Python code before
> the kernel random number generator is ready.
>
> On Linux, init scripts may still run before the kernel random number
> generator is ready, but will now throw an immediate BlockingIOError if
> they access an API that relies on cryptographic randomness being
> available, rather than potentially deadlocking the init process. Folks
> encountering that situation will then need to make an explicit
> decision:
>
> - loop until the exception is no longer thrown
> - switch to reading from /dev/urandom directly instead of calling os.urandom()
> - switch to using a cross-platform non-cryptographic API (probably the
> random module)
>
> Victor has some additional technical details written up at
> http://haypo-notes.readthedocs.io/pep_random.html and I'd be happy to
> formalise this proposed approach as a PEP (the current reference is
> http://bugs.python.org/issue27282 )
and Nathaniel added:
> I'd make two additional suggestions:
>
> - one person did chime in on the thread to say that they've used
> os.urandom for non-security-sensitive purposes, simply because it
> provided a convenient "give me a random byte-string" API that is
> missing from random. I think we should go ahead and add a .randbytes
> method to random.Random that simply returns a random bytestring using
> the regular RNG, to give these users a nice drop-in replacement for
> os.urandom.
>
> Rationale: I don't think the existence of these users should block
> making os.urandom appropriate for generating secrets, because (1) a
> glance at github shows that this is very unusual -- if you skim
> through this search you get page after page of functions with names
> like "generate_secret_key"
>
> https://github.com/search?l=python&p=2&q=urandom&ref=searchresults&type=Cod…
>
> and (2) for the minority of people who are using os.urandom for
> non-security-sensitive purposes, if they find os.urandom raising an
> error, then this is just a regular bug that they will notice
> immediately and fix, and anyway it's basically never going to happen.
> (As far as we can tell, this has never yet happened in the wild, even
> once.) OTOH if os.urandom is allowed to fail silently, then people who
> are using it to generate secrets will get silent catastrophic
> failures, plus those users can't assume it will never happen because
> they have to worry about active attackers trying to drive systems into
> unusual states. So I'd much rather ask the non-security-sensitive
> users to switch to using something in random, than force the
> cryptographic users to switch to using secrets. But it does seem like
> it would be good to give those non-security-sensitive users something
> to switch to.
>
> - It's not exactly true that the Python interpreter doesn't need
> cryptographic randomness to initialize SipHash -- it's more that
> *some* Python invocations need unguessable randomness (to first
> approximation: all those which are exposed to hostile input), and some
> don't. And since the Python interpreter has no idea which case it's
> in, and since it's unacceptable for it to break invocations that don't
> need unguessable hashes, then it has to err on the side of continuing
> without randomness. All that's fine.
>
> But, given that the interpreter doesn't know which state it's in,
> there's also the possibility that this invocation *will* be exposed to
> hostile input, and the 3.5.2+ behavior gives absolutely no warning
> that this is what's happening. So instead of letting this potential
> error pass silently, I propose that if SipHash fails to acquire real
> randomness at startup, then it should issue a warning. In practice,
> this will almost never happen. But in the rare cases it does, it at
> least gives the user a fighting chance to realize that their system is
> in a potentially dangerous state. And by using the warnings module, we
> automatically get quite a bit of flexibility. If some particular
> invocation (e.g. systemd-cron) has audited their code and decided that
> they don't care about this issue, they can make the message go away:
>
> PYTHONWARNINGS=ignore::NoEntropyAtStartupWarning
>
> OTOH if some particular invocation knows that they do process
> potentially hostile input early on (e.g. cloud-init, maybe?), then
> they can explicitly promote the warning to an error:
>
> PYTHONWARNINGS=error::NoEntropyAtStartupWarning
>
> (I guess the way to implement this would be for the SipHash
> initialization code -- which runs very early -- to set some flag, and
> then we expose that flag in sys._something, and later in the startup
> sequence check for it after the warnings module is functional.
> Exposing the flag at the Python level would also make it possible for
> code like cloud-init to do its own explicit check and respond
> appropriately.)
Victor, does your PEP differ from these proposals? (my apologies for my
lack of time at the moment).
--
~Ethan~
Hi,
Warning: believe it or not, I only read the first ~50 messages of the
recent discussion about random on the Python bug tracker and then the
python-dev mailing list.
Warning 2: If this email thread gets 100 emails per day as was the
case on the bug tracker and python-dev, I will have to ignore it
again. Sorry, but I don't have the bandwidth to read so many messages
:-(
Here is a concrete proposal trying to make Python 3.6 more secure on
Linux, without blocking Python at startup.
I suggest sticking to Linux first. Sorry, but I don't have the skills
to propose a concrete change for other platforms since I don't know
their exact behaviour well, and I'm not sure that they give access to
blocking *and* non-blocking urandom.
Victor
HTML version:
https://haypo-notes.readthedocs.io/pep_random.html
+++++++++++++++++++++++++++++++++++
Make os.urandom() blocking on Linux
+++++++++++++++++++++++++++++++++++
Headers::
PEP: xxx
Title: Make os.urandom() blocking on Linux
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-June-2016
Python-Version: 3.6
Abstract
========
Modify ``os.urandom()`` to block on Linux 3.17 and newer until the OS
urandom is initialized.
Rationale
=========
Linux 3.17 adds a new ``getrandom()`` syscall which makes it possible to block
until the kernel has collected enough entropy. This avoids generating weak
cryptographic keys.
Python's ``os.urandom()`` uses ``getrandom()``, but falls back on reading
the non-blocking ``/dev/urandom`` if ``getrandom(GRND_NONBLOCK)`` fails
with ``EAGAIN``.
Security experts promote ``os.urandom()`` for generating cryptographic
keys, even in preference to ``ssl.RAND_bytes()``.
Python 3.5.0 blocked at startup on virtual machines, waiting for the OS
urandom initialization, which was seen by users as a regression compared to
Python 3.4.
This PEP proposes to modify os.urandom() to make it more secure, while
also ensuring that Python will not block at startup.
Changes
=======
* Initialize hash secret from non-blocking OS urandom
* Initialize random._inst, a Random instance, with non-blocking OS
urandom
* Modify os.urandom() to block until urandom is initialized on Linux
A new _PyOS_URandom_Nonblocking() private method will be added: read OS
urandom in non-blocking mode. In practice, it means that it falls back
on reading /dev/urandom on Linux.
_PyRandom_Init() is modified to call _PyOS_URandom_Nonblocking().
Moreover, a new ``random_inst_seed`` will be added to the
``_Py_HashSecret_t`` structure (see above).
random._inst will be initialized with the ``random_inst_seed`` secret. A
flag will be used to ensure that this secret is only used once.
If a second instance of random.Random is created, blocking os.urandom()
will be used.
Alternative
===========
Never use blocking urandom in the random module
-----------------------------------------------
The random module can use ``random_inst_seed`` as a seed, but add other
sources of entropy like the process identifier (``os.getpid()``), the
current time (``time.time()``), memory addresses, etc.
Reading 2500 bytes from os.urandom() to initialize the Mersenne Twister
RNG in random.Random is a deliberate choice to get access to the full
range of the RNG. This PEP is a compromise between "security" and
"feature". Python should not block at startup before the OS has collected
enough entropy. But in the regular use case (OS urandom initialized),
the random module should continue to use its existing code to initialize
the seed.
Python 3.5.0 was blocked on ``import random``, not on building a second
instance of ``random.Random``.
Annexes
=======
Why use os.urandom()?
---------------------
Since ``os.urandom()`` is implemented in the kernel, it doesn't have
some of the issues of user-space RNGs. For example, it is much harder to get
its state. It is usually built on a CSPRNG, so even if its state is obtained,
it is hard to compute previously generated numbers. The kernel has good
knowledge of entropy sources and regularly feeds the entropy pool.
Linux getrandom()
-----------------
On OpenBSD, FreeBSD and Mac OS X, reading /dev/urandom blocks until the
kernel has collected enough entropy. It is not the case on Linux. Basically,
if a design choice has to be made between usability and security,
usability is preferred on Linux, whereas security is preferred on BSD.
The new ``getrandom()`` of Linux 3.17 allows users to choose security by
blocking until the kernel has collected enough entropy.
On virtual machines and some embedded devices, it can take longer than a
minute to collect enough entropy. In the worst case, the application
will block forever because the kernel really has no entropy source and
so cannot unblock ``getrandom()``.
Copyright
=========
This document has been placed in the public domain.