RFC: PEP: Make os.urandom() blocking on Linux
I'm not sure that anyone got my email since I sent it about an hour after Ethan announced the creation of the list, so I'm reposting it ;-)

--

Hi,

Warning: believe me or not, I only read the first ~50 messages of the recent discussion about random on the Python bug tracker and then the python-dev mailing list.

Warning 2: If this email thread gets 100 emails per day as was the case on the bug tracker and python-dev, I will have to ignore it again. Sorry, but I don't have the bandwidth to read so many messages :-(

Here is a concrete proposal trying to make Python 3.6 more secure on Linux, without blocking Python at startup. I suggest sticking to Linux first. Sorry, but I don't have the skills to propose a concrete change for other platforms since I don't know their exact behaviour well, and I'm not sure that they give access to blocking *and* non-blocking urandom.

Victor

HTML version: https://haypo-notes.readthedocs.io/pep_random.html

+++++++++++++++++++++++++++++++++++
Make os.urandom() blocking on Linux
+++++++++++++++++++++++++++++++++++

Headers::

    PEP: xxx
    Title: Make os.urandom() blocking on Linux
    Version: $Revision$
    Last-Modified: $Date$
    Author: Victor Stinner <victor.stinner@gmail.com>
    Status: Draft
    Type: Standards Track
    Content-Type: text/x-rst
    Created: 20-June-2016
    Python-Version: 3.6

Abstract
========

Modify ``os.urandom()`` to block on Linux 3.17 and newer until the OS urandom is initialized.

Rationale
=========

Linux 3.17 adds a new ``getrandom()`` syscall which makes it possible to block until the kernel has collected enough entropy. It avoids generating weak cryptographic keys. Python's os.urandom() uses ``getrandom()``, but falls back on reading the non-blocking ``/dev/urandom`` if ``getrandom(GRND_NONBLOCK)`` fails with ``EAGAIN``.

Security experts promote ``os.urandom()`` to generate cryptographic keys, even instead of ``ssl.RAND_bytes()``.

Python 3.5.0 blocked at startup on virtual machines, waiting for the OS urandom initialization, which was seen as a regression compared to Python 3.4 by users.

This PEP proposes to modify os.urandom() to make it more secure, but also to ensure that Python will not block at startup.

Changes
=======

* Initialize the hash secret from non-blocking OS urandom
* Initialize random._inst, a Random instance, with non-blocking OS urandom
* Modify os.urandom() to block until urandom is initialized on Linux

A new _PyOS_URandom_Nonblocking() private function will be added: it reads OS urandom in non-blocking mode. In practice, this means that it falls back on reading /dev/urandom on Linux. _PyRandom_Init() is modified to call _PyOS_URandom_Nonblocking().

Moreover, a new ``random_inst_seed`` field will be added to the ``_Py_HashSecret_t`` structure (see above). random._inst will be initialized with the ``random_inst_seed`` secret. A flag will be used to ensure that this secret is only used once. If a second instance of random.Random is created, blocking os.urandom() will be used.

Alternative
===========

Never use blocking urandom in the random module
-----------------------------------------------

The random module can use ``random_inst_seed`` as a seed, but add other sources of entropy like the process identifier (``os.getpid()``), the current time (``time.time()``), memory addresses, etc.

Reading 2500 bytes from os.urandom() to initialize the Mersenne Twister RNG in random.Random is a deliberate choice to get access to the full range of the RNG.

This PEP is a compromise between "security" and "features". Python should not block at startup before the OS has collected enough entropy.
But in the regular use case (OS urandom initialized), the random module should continue to use its current code to initialize the seed. Python 3.5.0 blocked on ``import random``, not on building a second instance of ``random.Random``.

Annexes
=======

Why use os.urandom()?
---------------------

Since ``os.urandom()`` is implemented in the kernel, it doesn't have some of the issues of a user-space RNG. For example, it is much harder to get at its state. It is usually built on a CSPRNG, so even if its state is obtained, it is hard to compute previously generated numbers. The kernel has good knowledge of entropy sources and feeds the entropy pool regularly.

Linux getrandom()
-----------------

On OpenBSD, FreeBSD and Mac OS X, reading /dev/urandom blocks until the kernel has collected enough entropy. That is not the case on Linux. Basically, when a design choice has to be made between usability and security, usability is preferred on Linux, whereas security is preferred on BSD. The new ``getrandom()`` syscall of Linux 3.17 allows users to choose security by blocking until the kernel has collected enough entropy.

On virtual machines and some embedded devices, it can take longer than a minute to collect enough entropy. In the worst case, the application will block forever because the kernel really has no entropy source and so cannot unblock ``getrandom()``.

Copyright
=========

This document has been placed in the public domain.
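For readers who want to see the blocking vs. non-blocking distinction the PEP relies on, here is a rough, illustrative sketch of calling the getrandom() syscall from Python via ctypes. It assumes Linux 3.17+ on x86-64 (the syscall number is architecture-specific), and the wrapper name is purely illustrative; it is not part of the proposed API.

    # Sketch only (not part of the PEP): the blocking vs. non-blocking
    # getrandom() behaviour the proposal is built around.
    import ctypes
    import os

    SYS_getrandom = 318     # x86-64 only; other architectures use different numbers
    GRND_NONBLOCK = 0x0001  # fail with EAGAIN instead of blocking

    _libc = ctypes.CDLL(None, use_errno=True)

    def getrandom_sketch(nbytes, flags=0):
        """Read nbytes from the kernel CSPRNG via the getrandom() syscall."""
        buf = ctypes.create_string_buffer(nbytes)
        got = _libc.syscall(SYS_getrandom, buf, nbytes, flags)
        if got < 0:
            err = ctypes.get_errno()
            raise OSError(err, os.strerror(err))
        return buf.raw[:got]

    # Blocking call: waits until the urandom pool is initialized (the
    # behaviour this PEP proposes for os.urandom() on Linux).
    # getrandom_sketch(16)
    #
    # Non-blocking call: raises OSError with errno EAGAIN if the pool is
    # not initialized yet (the behaviour of the proposed private
    # _PyOS_URandom_Nonblocking() fallback path).
    # getrandom_sketch(16, GRND_NONBLOCK)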
On Jun 21, 2016, at 04:10 PM, Victor Stinner wrote:
PEP: xxx
Title: Make os.urandom() blocking on Linux
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner@gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 20-June-2016
Python-Version: 3.6
[...]
Alternative ===========
I would like to ask for some changes to this proto-PEP. At a minimum, I think a proper treatment of the alternative where os.urandom() remains (on Linux at least) a thin wrapper around /dev/urandom. We would add os.getrandom() as the low-level interface to the new C lib function, and expose any higher level functionality in the secrets module if necessary. Then we would also add a strong admonition to the documentation explaining the trade-offs between os.urandom() and os.getrandom() and point people to the latter for strong crypto use cases.

Your proto-PEP uses this as a rationale:

    Security experts promote ``os.urandom()`` to generate cryptographic
    keys, even instead of ``ssl.RAND_bytes()``.

and that's been a commonly cited reason for why strengthening os.urandom() is preferable to adding a more direct mapping to the underlying function that provides that strengthened randomness. Even if the assertion is true -and respectfully, it isn't backed up by any actual citations in the proto-PEP- it doesn't make it right. It's also a bad precedent to follow IMHO. Where do we draw the line in changing existing APIs to match their use, or misuse as the case may be?

We can discuss whether your proposal or my[*] alternative is the right one for Python to follow, and I may lose that argument, but I think it's only proper and fair to represent this point of view in this proto-PEP. I do not think a separate competing PEP is appropriate.

I should also note that my proposed alternative would make the title incorrect, so I'd like to suggest something like: "Providing a cryptographically strong source of random bytes."

Cheers,
-Barry

[*] Although labeling it "my" gives me undue credit for points of view also held and suggested by others; it's just a handy way of referring to it.
On 21 Jun 2016, at 23:57, Barry Warsaw <barry@python.org> wrote:
At a minimum, I think a proper treatment of the alternative where os.urandom() remains (on Linux at least) a thin wrapper around /dev/urandom. We would add os.getrandom() as the low-level interface to the new C lib function, and expose any higher level functionality in the secrets module if necessary. Then we would also add a strong admonition to the documentation explaining the trade-offs between os.urandom() and os.getrandom() and point people to the latter for strong crypto use cases.
I'd like to explore this approach further.

In a model like this, os.getrandom() would basically need to have, in its documentation, a recipe for using it in a general-purpose, cross-OS manner. That recipe would be, at minimum, an admonition to use the secrets module.

However, if we're going to implement an entire function in order to say "Do not use this, use secrets instead", why are we bothering? Why add the API surface and a function that needs to be maintained? Why not just make the use of getrandom a private implementation detail of secrets?

Making getrandom() a private detail of secrets has the advantage of freeing us from some backward compatibility concerns, which as we've identified are a real problem here. Given that there's no understandable use case where someone would write anything but "try: os.getrandom(); except AttributeError: os.urandom", it doesn't seem sensible to give people the option to get this wrong.

The other way to approach this is to have os.getrandom() do the appropriate dance, but others have suggested that the os module is intended only to be thin wrappers around things that the OS provides (a confusing argument given that closerange() exists, but that's by the by).
Your proto-PEP uses this as a rationale:
Security experts promote ``os.urandom()`` to generate cryptographic keys, even instead of ``ssl.RAND_bytes()``.
and that's been a commonly cited reason for why strengthening os.urandom() is preferable to adding a more direct mapping to the underlying function that provides that strengthened randomness. Even if the assertion is true -and respectfully, it isn't backed up by any actual citations in the proto-PEP- it doesn't make it right. It's also a bad precedent to follow IMHO. Where do we draw the line in changing existing APIs to match their use, or misuse as the case may be?
Here are some relevant citations:

- https://stackoverflow.com/questions/10341112/whats-more-random-hashlib-or-ur...
- https://cryptography.io/en/latest/random-numbers/
- https://code.google.com/p/googleappengine/issues/detail?id=1055

However, I don't think I agree with your assertion that it's a bad precedent. I think the bad precedent is introducing new functions that do what the old functions should have done. Some examples of this:

- yaml.safe_load, introduced to replace yaml.load, which leads to documents like this: https://security.openstack.org/guidelines/dg_avoid-dangerous-input-parsing-l...
- PHP's mysql_real_escape_string, introduced to replace mysql_escape_string, which leads to misguided questions like this one: https://security.stackexchange.com/questions/8028/does-mysql-escape-string-h...

Each of these functions has been a never-ending supply of security vulnerabilities because they encourage users to fall into a pit of failure. Users who are not sufficiently defensive when approaching their code will reach for the most obvious tool in the box, and the Python cryptographic community has spent a long time making os.urandom() the most obvious tool in the box because no other tool was available.

The argument, then, is that we should make that tool better, rather than build a new tool and let the old one fester.
We can discuss whether your proposal or my[*] alternative is the right one for Python to follow, and I may lose that argument, but I think it's only proper and fair to represent this point of view in this proto-PEP. I do not think a separate competing PEP is appropriate.
I agree with this. The PEP should accurately represent competing views, even if it doesn’t agree with them. Cory
On Jun 22, 2016, at 11:13 AM, Cory Benfield wrote:
In a model like this, os.getrandom() would basically need to have, in its documentation, a recipe for using it in a general-purpose, cross-OS manner. That recipe would be, at minimum, an admonition to use the secrets module.
However, if we’re going to implement an entire function in order to say “Do not use this, use secrets instead”, why are we bothering? Why add the API surface and a function that needs to be maintained? Why not just make the use of getrandom a private implementation detail of secrets?
Because the os module has traditionally surfaced lower-level operating system functions, os.getrandom() would be an extension of this. That's also why I advocate simplifying os.urandom() so that it reverts more or less to exposing /dev/urandom to Python. With perhaps a few exceptions, os doesn't provide higher level APIs.

The point here is that, let's say you're an experienced Linux developer and you know you want to use getrandom(2) in Python. os.getrandom() is exactly that. It's completely analogous to why we provide, e.g. os.chroot() and such.

Now, let's say you just want some guaranteed high quality random bytes, and you don't really know or care what's being used. The lower level os functions are *not* the right APIs to use, but secrets is. That's why the documentation points people over there for better, higher-level APIs, and it's there that we have the freedom to change the underlying implementation as needed to deliver on the promised improved security.
Making getrandom() a private detail of secrets has the advantage of freeing us from some backward compatibility concerns, which as we’ve identified are a real problem here.
I agree.
Given that there’s no understandable use case where someone would write anything but "try: os.getrandom(); except AttributeError: os.urandom”, it doesn’t seem sensible to give people the option to get this wrong.
This doesn't follow though. Again, it's about providing low-level Python bindings to underlying operating system functions in os, and higher level APIs with more cross-platform guarantees in secrets.
The other way to approach this is to have os.getrandom() do the appropriate dance, but others have suggested that the os module is intended only to be thin wrappers around things that the OS provides (a confusing argument given that closerange() exists, but that’s by the by).
As I mentioned, there are exceptions (os.makedirs() is the other one that comes to mind), but I do think the rule for os should be -and has traditionally been- exactly as you say. OTOH, neither os.makedirs() nor os.closerange() is that far removed from its lower-level cousin, so that's a practicality-over-purity justification.
However, I don’t think I agree with your assertion that it’s a bad precedent. I think the bad precedent is introducing new functions that do what the old functions should have done.
I don't agree that any of this is what os.urandom() should have done. It's that people have used it for other purposes and changed what they think it should have done. Now we're redefining os.urandom() to fit that new purpose. That's the bad precedent IMHO.

Cheers,
-Barry
Barry, Cory, et al:

We all know there are two camps here:

- Those that want "secure by default" behavior, and
- Those that want "thin wrapper" behavior.

We have discussed the reasoning behind those two camps ad nauseam on Python Dev, with fairly disastrous results. I did not create this list so we could do it again.

At this point we have two PEPs going. Let's make sure that whichever PEP we take back to Py-Dev includes all the arguments and objections noted, and then let Guido or his delegate make the final call.

Please.

--
~Ethan~
On Jun 22, 2016, at 06:29 PM, Ethan Furman wrote:
We have discussed the reasoning behind those two camps ad nauseam on Python Dev, with fairly disastrous results. I did not create this list so we could do it again.
At this point we have two PEPs going. Let's make sure that whichever PEP we take back to Py-Dev includes all the arguments and objections noted, and then let Guido or his delegate make the final call.
Yes, agreed. Since this is a new list with a new proposed PEP, I want to be sure that my view is accurately represented. I won't continue to push it and don't plan on responding unless my position isn't accurately represented. Once Guido or his delegate makes the call, it's over. Cheers, -Barry
On 22 June 2016 at 17:35, Barry Warsaw <barry@python.org> wrote:
On Jun 22, 2016, at 11:13 AM, Cory Benfield wrote:
In a model like this, os.getrandom() would basically need to have, in its documentation, a recipe for using it in a general-purpose, cross-OS manner. That recipe would be, at minimum, an admonition to use the secrets module.
However, if we’re going to implement an entire function in order to say “Do not use this, use secrets instead”, why are we bothering? Why add the API surface and a function that needs to be maintained? Why not just make the use of getrandom a private implementation detail of secrets?
Because the os module has traditionally surfaced lower-level operating system functions, so os.getrandom() would be an extension of this. That's also why I advocate simplifying os.urandom() so that it reverts more or less to exposing /dev/urandom to Python. With perhaps a few exceptions, os doesn't provide higher level APIs.
The point here is that, let's say you're an experienced Linux developer and you know you want to use getrandom(2) in Python. os.getrandom() is exactly that. It's completely analogous to why we provide, e.g. os.chroot() and such.
My own objection (as spelled out in PEP 522) is only to leaving os.urandom() silently broken when we have the ability to improve on that - it's an "errors pass silently" and "guessing in the face of ambiguity" scenario that we previously couldn't sensibly do anything about, but now have additional options to better handle on behalf of our users.

As long as os.urandom() is fixed to fail cleanly rather than silently, I don't object to exposing os.getrandom() as well for the sake of folks writing Linux specific software that want direct access to the kernel's blocking behaviour rather than a busy loop.

I *do* object to any solution that proposes that all correct cross-platform code that needs reliably unpredictable random data necessarily end up looking like:

    try:
        my_random = os.getrandom
    except AttributeError:
        my_random = os.urandom

with the simpler and cleaner "my_random = os.urandom" continuing to risk silent security failures if the software is used in an unanticipated context. Instead, I'm after an outcome for os.urandom() akin to that in PEP 418, where time.time() now looks for several other preferred options before falling back to _time.time() as a last resort: https://www.python.org/dev/peps/pep-0418/#time-time

Even if we did add a blocking getrandom() though, I'd still advocate for secrets and random.SystemRandom to throw BlockingIOError by default - with system RNG initialisation being a "once and done" thing and os.getrandom() exposed, it becomes straightforward to add an application level "wait for the random number generator to be ready" check:

    try:
        wait_for_system_rng = os.getrandom
    except AttributeError:
        pass
    else:
        wait_for_system_rng(1)

The hard part is then knowing that you *need* to wait. If you're silently getting more-predictable-than-you-expected random data, you may never realise. If your system hangs, you might eventually figure it out, but only after a likely frustrating debugging effort. By contrast, if your application fails with "BlockingIOError: system random number generator not ready", then you can search for that on the internet, see the above snippet for "How to wait for the system random number generator to be ready on Linux" and stick that into your code.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jun 22, 2016, at 06:31 PM, Nick Coghlan wrote:
try:
    my_random = os.getrandom
except AttributeError:
    my_random = os.urandom
Once Python 3.6 is widely available, and/or secrets is backported and available on PyPI, why would you ever do that rather than just get the best source of randomness out of the secrets module? Cheers, -Barry
On Jun 23, 2016, at 8:48 AM, Barry Warsaw <barry@python.org> wrote:
On Jun 22, 2016, at 06:31 PM, Nick Coghlan wrote:
try:
    my_random = os.getrandom
except AttributeError:
    my_random = os.urandom
Once Python 3.6 is widely available, and/or secrets is backported and available on PyPI, why would you ever do that rather than just get the best source of randomness out of the secrets module?
Because projects are likely going to be supporting things other than 3.6 for a very long time. The "typical" support matrix for a project on PyPI currently looks roughly like 2.6, 2.7, and 3.3+. We're seeing some projects finally dropping 2.6 on PyPI, but it's still a major source of downloads, and 2.7 itself is still ~86% of downloads initiated by pip across all of PyPI.

There is the idea of a secrets module backport on PyPI, but without adding C code to it, it's going to do basically the same thing as that try ... except, and if the secrets backport requires C I think you won't get a very large uptick, since os.urandom exists already and the issues are subtle enough that I don't think most people are going to grok them immediately; they will just automatically avoid a C dependency where they don't immediately see the need for one.

Even if we pretend that 3.6+ only is something that's going to happen in anything approaching a short timeline, we're still going to be fighting against the tide of what the vast bulk of documentation out there states to do. So not only do we need to wait for pre-3.6 to die out, we also need to wait for the copious amounts of third party documentation out there telling people to just use os.urandom to die out.

And even in the future, once we get to a 3.6+ only world, os.urandom and the try .. except shim will still "work" for all anyone can tell (since the failure mode of os.urandom itself is practically silent in every way imaginable), so unless they already know about this issue and go out of their way to switch over to the secrets module, they're likely to continue using something in the os module for a long time.

IOW, I think secrets is great, but I think it mostly helps new code written targeting 3.6+ only, rather than being a solution for the vast bulk of software already out there, or software which doesn't yet exist but is going to support older things than 3.6.

— Donald Stufft
On 23 June 2016 at 06:54, Donald Stufft <donald@stufft.io> wrote:
On Jun 23, 2016, at 8:48 AM, Barry Warsaw <barry@python.org> wrote:
On Jun 22, 2016, at 06:31 PM, Nick Coghlan wrote:
try:
    my_random = os.getrandom
except AttributeError:
    my_random = os.urandom
Once Python 3.6 is widely available, and/or secrets is backported and available on PyPI, why would you ever do that rather than just get the best source of randomness out of the secrets module?
Because projects are likely going to be supporting things other than 3.6 for a very long time. The “typical” support matrix for a project on PyPI currently looks roughly like 2.6, 2.7, and 3.3+. We’re seeing some projects dropping 2.6 finally on PyPI but it’s still a major source of downloads and 2.7 itself is still ~86% of downloads initiated by pip across all of PyPI.
Right, the missing qualifier on my statement is that one of the key aspects I'm specifically interested in is the guidance we give to folks writing single-source compatible Python 2/3 code that *also* want to use the best available initialization option given the vagaries of build platform, deployment platform, and the precise versions of those.

Reasonable developer experience:

* just keep using os.urandom(), Python will transparently upgrade your code to the best non-blocking-in-practice system interface the OS has to offer
* if os.urandom() throws BlockingIOError, you may need to add application startup code to wait until the system random number generator is ready

Dubious developer experience:

* if os.getrandom() is available use that, otherwise use os.urandom()

Dubious developer experience:

* if the secrets module is available use that, otherwise use os.urandom()

Dubious developer experience:

* add a dependency on a third party library which implements one of the above dubious options

For folks that don't need to worry about compatibility with old versions, the guidance will be "just use the secrets module" regardless of what we do with os.urandom(), and that's fine.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 23 June 2016 at 10:31, Nick Coghlan <ncoghlan@gmail.com> wrote:
Reasonable developer experience:
* just keep using os.urandom(), Python will transparently upgrade your code to the best non-blocking-in-practice system interface the OS has to offer
* if os.urandom() throws BlockingIOError, you may need to add application startup code to wait until the system random number generator is ready
Thinking about this some more, I realised applications can implement the "wait for the system RNG" behaviour even without os.getrandom:

    # Busy loop, given PEP 522's BlockingIOError
    def wait_for_system_rng():
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                continue

    # An actual use case for reading /dev/random!
    def wait_for_system_rng():
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            return
        with block_on_system_rng:
            block_on_system_rng.read(1)

That second one has the added bonus of doing the right thing even on older Linux kernels that don't provide the new getrandom() syscall, creating the following virtuous feedback loop:

1. Start running an existing application/script on Python 3.6 and a Linux kernel with getrandom()
2. Start getting "BlockingIOError: system random number generator not ready"
3. Add the /dev/random snippet to wait for the system RNG
4. Your code now does the right thing even on older Pythons and Linux versions

Given that realisation, I'm back to thinking "we don't need it" when it comes to exposing os.getrandom() directly.

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jun 23, 2016, at 2:10 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That second one has the added bonus of doing the right thing even on older Linux kernels that don't provide the new getrandom() syscall, creating the following virtuous feedback loop:
The second one also is not a good idea to use in the general case since it will also block randomly throughout the application. It’s OK to use if you know you’re only going to access it once on boot, but you wouldn’t want it to be a common idiom that software itself does. If I recall, there was major downtime on healthcare.gov because they used /dev/random in production. — Donald Stufft
On 23 June 2016 at 11:13, Donald Stufft <donald@stufft.io> wrote:
On Jun 23, 2016, at 2:10 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That second one has the added bonus of doing the right thing even on older Linux kernels that don't provide the new getrandom() syscall, creating the following virtuous feedback loop:
The second one also is not a good idea to use in the general case since it will also block randomly throughout the application. It’s OK to use if you know you’re only going to access it once on boot, but you wouldn’t want it to be a common idiom that software itself does. If I recall, there was major downtime on healthcare.gov because they used /dev/random in production.
Right, the idiom I'd be recommending in PEP 522 is a "do this once in __main__ to categorically prevent BlockingIOError from os.urandom, random.SystemRandom and the secrets module" application level approach, while the guidance for libraries would be to just keep using os.urandom() and let affected application developers worry about whether to catch the BlockingIOError at the point of use, or block the application at startup to wait for the system RNG.

Although now I'm wondering whether it might be worth proposing a "secrets.wait_for_system_rng()" API as part of PEP 522, with the following implementation:

    def wait_for_system_rng():
        # Avoid the below busy loop if possible
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            pass
        else:
            with block_on_system_rng:
                block_on_system_rng.read(1)
        # Busy loop until the system RNG is ready
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                pass

Since this is an "at most once at application startup" kind of problem, I like the way that having a separate function for waiting helps to divide responsibilities between library API developers ("complain if you need the system RNG and it isn't ready") and application developers ("ensure the system RNG is ready before calling APIs that need it").

Regards,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 23 June 2016 at 11:38, Nick Coghlan <ncoghlan@gmail.com> wrote:
Although now I'm wondering whether it might be worth proposing a "secrets.wait_for_system_rng()" API as part of PEP 522, with the following implementation:
def wait_for_system_rng():
    # Avoid the below busy loop if possible
    try:
        block_on_system_rng = open("/dev/random", "rb")
    except FileNotFoundError:
        pass
    else:
        with block_on_system_rng:
            block_on_system_rng.read(1)
    # Busy loop until the system RNG is ready
    while True:
        try:
            os.urandom(1)
            break
        except BlockingIOError:
            pass
I realised even this more complex variant still has a subtle bug: due to the way /dev/random works, it can block inappropriately if Python is started after the system RNG has already been seeded. That means a completely correct implementation (assuming the rest of PEP 522 was in place) would look more like this:

    def wait_for_system_rng():
        # If the system RNG is already seeded, don't wait at all
        try:
            os.urandom(1)
            return
        except BlockingIOError:
            pass
        # Avoid the below busy loop if possible
        try:
            block_on_system_rng = open("/dev/random", "rb")
        except FileNotFoundError:
            pass
        else:
            with block_on_system_rng:
                block_on_system_rng.read(1)
        # Busy loop until the system RNG is ready
        while True:
            try:
                os.urandom(1)
                break
            except BlockingIOError:
                # Only check once per millisecond
                time.sleep(0.001)

So I'll update PEP 522 to include this as part of the proposal - it's trickier to get right than I thought, and it provides an additional hook to help explain that the system RNG is something that, once initialized, stays initialized, so waiting for it is best handled as an application level and system configuration concern rather than on each call to os.urandom().

It also enables a pretty neat ExecStartPre [1] trick in systemd unit files:

    ExecStartPre=/usr/bin/python3 -c "import secrets; secrets.wait_for_system_rng()"

to make an arbitrary service wait until the system RNG is ready before it runs.

Cheers,
Nick.

[1] https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecSt...

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Jun 23, 2016, at 09:54 AM, Donald Stufft wrote:
Because projects are likely going to be supporting things other than 3.6 for a very long time. The “typical” support matrix for a project on PyPI currently looks roughly like 2.6, 2.7, and 3.3+. We’re seeing some projects dropping 2.6 finally on PyPI but it’s still a major source of downloads and 2.7 itself is still ~86% of downloads initiated by pip across all of PyPI. There is the idea of a secrets module back port on PyPI, but without adding C code to that it’s going to basically just do the same thing as that try … except and if the secrets backport requires C I think you won’t get a very large uptick since os.urandom exists already and the issues are subtle enough that I don’t think most people are going to grok them immediately and will just automatically avoid a C dependency where they don’t immediately see the need for one.
Even if we pretend that 3.6+ only is something that’s going to happen in anything approaching a short timeline, we’re still going to be fighting against the tide for what the vast bulk of documentation out there states to do. So not only do we need to wait it out for pre 3.6 to die out, but we also need to wait it out for the copious amounts of third party documentation out there telling people to just use os.urandom dies.
And even in the future, once we get to a 3.6+ only world, os.urandom and the try .. except shim will still “work” for all anyone can tell (since the failure mode on os.urandom itself is practically silent in every way imaginable) so unless they already know about this issue and go out of their way to switch over to the secrets module, they’re likely to continue using something in the os module for a long time.
IOW, I think secrets is great, but I think it mostly helps new code written targeting 3.6+ only, rather than being a solution for the vast bulk of software already out there or which doesn’t yet exist but is going to support older things than 3.6.
The proposed os.urandom() change is only going into Python 3.6, so older Python users will still be "vulnerable" to the problem until they upgrade. And without a backported secrets module, they won't have any way to benefit from the entropy guarantees until they upgrade.

If secrets is backported and available in PyPI, then we can start immediately changing the os.urandom() meme to something more secure. Sure, it takes a long time to change minds, but I still think it's better to give users a blessed, near universally agreed upon, secure alternative immediately.

Cheers,
-Barry
On 24 June 2016 at 06:48, Barry Warsaw <barry@python.org> wrote:
If secrets is backported and available in PyPI, then we can start immediately changing the os.urandom() meme to something more secure. Sure it takes a long time to change minds, but I still think it's better to give users a blessed, near universally agreed upon, secure alternative immediately.
It's not that simple, as secrets relies on the os module to provide access to the getrandom() syscall (by way of an upgraded os.urandom). Nothing changes from a security perspective without that additional level of access to the underlying operating system capabilities.

You could potentially go the ctypes route in a PyPI module, but the performance would be abysmal, so nobody would use it. Going for a custom C extension doesn't really work either - you can't use manylinux1 for it (as the baseline glibc ABI is way too old to include getrandom), and nobody's going to want to introduce an install time compiler dependency just to address this relatively obscure concern.

Even if those problems could be resolved, it isn't really a problem where I'd advocate for a "standard library only" project to add an external dependency to address it - if they're going to do that, I'd instead advocate for them to stop reinventing the wheel, and instead reach for a third party library that solves their *actual* problem (like cryptography, passlib, or one of the web frameworks).

This is why I think it makes sense to focus the immediate discussion on "Given getrandom() as an operating system API, can we improve the semantics of Python's os.urandom()?". The wider discussion around "How do we educate Python developers on the difference between simulated uncertainty and sensitive secrets?" that motivated the introduction of the secrets module isn't really applicable.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2016-06-23 14:48 GMT+02:00 Barry Warsaw <barry@python.org>:
Once Python 3.6 is widely available, and/or secrets is backported and available on PyPI, why would you ever do that rather than just get the best source of randomness out of the secrets module?
Once we have modified Python 3.6 to handle "the bug" correctly and we consider that the implementation is tested enough, I suggest backporting it to Python 2.7 as well. Moreover, I would also suggest backporting the change to Python 3.5; I would be sad if Python 2 were more secure than the latest Python 3 release :-)

Victor
On Jun 24, 2016, at 12:11 AM, Victor Stinner wrote:
Once we have modified Python 3.6 to handle "the bug" correctly and we consider that the implementation is tested enough, I suggest backporting it to Python 2.7 as well. Moreover, I would also suggest backporting the change to Python 3.5; I would be sad if Python 2 were more secure than the latest Python 3 release :-)
This is the fundamental point of disagreement, and I think it points again to a deficiency in our process. Regardless of outcome of this specific case, I think we should try to tighten up our definitions and codify our policy in an informational PEP. What criteria do we use to classify an issue as a security bug requiring a fix, with backports, overriding any backward compatibility breaks? I think we've been largely ad-hoc about this question. One thing I think such an informational PEP must require is a rationale as to why the issue is being classified as a security bug, a backporting rationale and plan, and a "Backwards Compatibility Impact Assessment", which I'm very glad to see in PEP 522. Cheers, -Barry
2016-06-24 16:01 GMT+02:00 Barry Warsaw <barry@python.org>:
One thing I think such an informational PEP must require is a rationale as to why the issue is being classified as a security bug, a backporting rationale and plan, and a "Backwards Compatibility Impact Assessment", which I'm very glad to see in PEP 522.
Sorry, I didn't have time yet to think about Python 2.7 and Python 3.5. But it looks like my PEP (make os.urandom() blocking) and Nick's PEP 522 (os.urandom() can raise BlockingIOError) introduce a backward incompatible change: applications which worked well on Python 3.5 may block or fail with these changes.

I'm not sure that it's worth it to enhance Python 2.7 or 3.5. IMO the discussed changes make Python more secure, but they don't really fix a critical vulnerability. I don't think that it's a security vulnerability; I prefer to qualify it as an enhancement, security "hardening" if you prefer.

Victor
On 24 June 2016 at 07:01, Barry Warsaw <barry@python.org> wrote:
On Jun 24, 2016, at 12:11 AM, Victor Stinner wrote:
Once we have modified Python 3.6 to handle "the bug" correctly and we consider that the implementation is tested enough, I suggest backporting it to Python 2.7 as well. Moreover, I would also suggest backporting the change to Python 3.5; I would be sad if Python 2 were more secure than the latest Python 3 release :-)
This is the fundamental point of disagreement, and I think it points again to a deficiency in our process. Regardless of outcome of this specific case, I think we should try to tighten up our definitions and codify our policy in an informational PEP.
What criteria do we use to classify an issue as a security bug requiring a fix, with backports, overriding any backward compatibility breaks?
I think we've been largely ad-hoc about this question.
PEP 466 aimed to answer it: https://www.python.org/dev/peps/pep-0466/#why-these-particular-changes

The most significant sentence in that section is this one: "The key requirement for a feature to be considered for inclusion in this proposal was that it must have security implications beyond the specific application that is written in Python and the system that application is running on."

Earlier drafts of the PEP did aim to define that as a standard policy, but Guido nixed that idea, instead requesting that every such security related backport proposal receive its own dedicated PEP.

For PEP 466, the limitations of the Python 2.7 standard library were holding back the evolution of network security in general (e.g. by acting as a brake on the adoption of Server-Name-Indication and on servers forcing TLS-only secure connections). For PEP 476, the mismatch between how people assumed the standard library handled HTTPS connections and how it actually did handle them was causing real security vulnerabilities in networked applications.

For the PEPs currently under consideration, I don't think the situation is as critical as that - we're talking about a rare situation specific to secret generation on Linux with poorly configured entropy sources, not the core handling of SSL/TLS and HTTPS used by a large proportion of networked applications. That means I believe "Folks that genuinely care about secure secret generation should upgrade to Python 3.6 and a Linux kernel with getrandom() support" is an entirely reasonable position for us to take in this case.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
2016-06-23 2:35 GMT+02:00 Barry Warsaw <barry@python.org>:
Because the os module has traditionally surfaced lower-level operating system functions,
Well, that's not true. os.urandom() is a bad example since it has many implementations depending on the platform. https://haypo-notes.readthedocs.io/pep_random.html#leave-os-urandom-unchange... or https://haypo-notes.readthedocs.io/pep_random.html#operating-system-random-f...
That's also why I advocate simplifying os.urandom() so that it reverts more or less to exposing /dev/urandom to Python. With perhaps a few exceptions, os doesn't provide higher level APIs.
Hum, I modified os.urandom() to use getrandom() precisely to avoid using the private file descriptor and to not require the /dev/urandom device. Using a file descriptor has many issues; tell me if you need more details on these issues.

In Python 3.5.2, os.urandom() uses getrandom() on Linux, but only falls back on reading /dev/urandom if getrandom(GRND_NONBLOCK) fails with EAGAIN.

I'm not sure that I understand you. Do you want to stop using getrandom()? What about getrandom() on Solaris? And getentropy() on OpenBSD? (And Windows uses CryptGenRandom() ;-))
The point here is that, let's say you're an experienced Linux developer and you know you want to use getrandom(2) in Python. os.getrandom() is exactly that. It's completely analogous to why we provide, e.g. os.chroot() and such.
Even if we modify os.urandom() to make it blocking, adding os.getrandom() makes sense: getrandom() also allows reading /dev/random (not /dev/urandom) without using a FD, and getrandom(GRND_NONBLOCK) also gives access to the non-blocking mode.

Victor
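As an illustration of those modes: assuming os.getrandom() is added as proposed and the getrandom(2) flags are mirrored as os.GRND_RANDOM and os.GRND_NONBLOCK (the constant names here are an assumption, not settled API), the mapping would look roughly like this:

    # Sketch only: assumes a future os.getrandom() plus os.GRND_RANDOM
    # and os.GRND_NONBLOCK constants mirroring the getrandom(2) flags.
    import os

    # Default mode: reads the urandom source; blocks only until the
    # kernel entropy pool has been initialized once.
    key = os.getrandom(32)

    # GRND_RANDOM: reads the /dev/random pool instead, without needing
    # a file descriptor (on Linux it may block whenever that pool is low).
    strict = os.getrandom(32, os.GRND_RANDOM)

    # GRND_NONBLOCK: never block; fails with EAGAIN (BlockingIOError in
    # Python) if the requested pool is not ready yet.
    try:
        fast = os.getrandom(32, os.GRND_NONBLOCK)
    except BlockingIOError:
        fast = None  # system RNG not initialized yet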
2016-06-22 0:57 GMT+02:00 Barry Warsaw <barry@python.org>:
I would like to ask for some changes to this proto-PEP.
At a minimum, I think a proper treatment of the alternative where os.urandom() remains (on Linux at least) a thin wrapper around /dev/urandom. We would add os.getrandom() as the low-level interface to the new C lib function,
Ok, done in version 2 of my PEP
and expose any higher level functionality in the secrets module if necessary.
I didn't add this point to the PEP. Tell me if it should be added. Which kind of function do you imagine? I wrote an example of a helper function that uses os.getrandom() or falls back on os.urandom(): https://haypo-notes.readthedocs.io/pep_random.html#leave-os-urandom-unchange...

You may reply on my PEP v2 directly ;-)

Victor
participants (6):

- Barry Warsaw
- Cory Benfield
- Donald Stufft
- Ethan Furman
- Nick Coghlan
- Victor Stinner