[Python-checkins] peps: PEP 504: Using the System RNG by default

nick.coghlan python-checkins at python.org
Tue Sep 15 16:29:13 CEST 2015


https://hg.python.org/peps/rev/61d05f14aa37
changeset:   6064:61d05f14aa37
user:        Nick Coghlan <ncoghlan at gmail.com>
date:        Wed Sep 16 00:29:04 2015 +1000
summary:
  PEP 504: Using the System RNG by default

files:
  pep-0504.txt |  337 +++++++++++++++++++++++++++++++++++++++
  1 files changed, 337 insertions(+), 0 deletions(-)


diff --git a/pep-0504.txt b/pep-0504.txt
new file mode 100644
--- /dev/null
+++ b/pep-0504.txt
@@ -0,0 +1,337 @@
+PEP: 504
+Title: Using the System RNG by default
+Version: $Revision$
+Last-Modified: $Date$
+Author: Nick Coghlan <ncoghlan at gmail.com>
+Status: Draft
+Type: Standards Track
+Content-Type: text/x-rst
+Created: 15-Sep-2015
+Python-Version: 3.6
+Post-History: 15-Sep-2015
+
+Abstract
+========
+
+Python currently defaults to using the deterministic Mersenne Twister random
+number generator for the module level APIs in the ``random`` module, requiring
+users to know that when they're performing "security sensitive" work, they
+should instead switch to using the cryptographically secure ``os.urandom`` or
+``random.SystemRandom`` interfaces or a third party library like
+``cryptography``.
+
+Unfortunately, this approach has resulted in a situation where developers that
+aren't aware that they're doing security sensitive work use the default module
+level APIs, and thus expose their users to unnecessary risks.
+
+This isn't an acute problem, but it is a chronic one, and if documentation and
+developer education were going to solve it, they would have done so by now.
+
+In order to provide an eventually pervasive solution to the problem, this PEP
+proposes that Python switch to using the system random number generator by
+default in Python 3.6, and require developers to opt-in to using the
+deterministic random number generator.
+
+To minimise the compatibility break, calling any of the following module level
+functions will count as opting in to using the deterministic random number
+generator for all future calls to module level functions in the ``random``
+module in the same process:
+
+* ``random.seed``
+* ``random.getstate``
+* ``random.setstate``
+
+Proposal
+========
+
+Currently, it is never correct to use the module level functions in the
+``random`` module for security sensitive applications. This PEP proposes to
+change that admonition in Python 3.6+ to instead be that it is not correct to
+use the module level functions in the ``random`` module for security sensitive
+applications if ``random.seed``, ``random.getstate``, or ``random.setstate``
+are ever called in that process.
+
+This PEP further proposes to make it easier to explicitly opt in to using
+either the system random number generator or Python's deterministic PRNG by
+converting the random module to a package that exposes the same top-level API,
+and offering two new submodules:
+
+* ``random.system``
+* ``random.seedable``
+
+The ``random.system`` submodule would provide the following bound methods of a
+module global ``random.SystemRandom`` instance as module attributes:
+``betavariate``, ``choice``, ``expovariate``, ``gammavariate``, ``gauss``,
+``getrandbits``, ``lognormvariate``, ``normalvariate``, ``paretovariate``,
+``randint``, ``random``, ``randrange``, ``sample``, ``shuffle``,
+``triangular``, ``uniform``, ``vonmisesvariate``, ``weibullvariate``.
+
+The ``random.seedable`` submodule would provide the same operations, but as
+methods of a ``random.Random`` instance. In addition, it would provide the
+following additional methods which are only meaningful when using a
+deterministic random number generator: ``seed``, ``getstate``, ``setstate``.
+
+Rather than being bound methods of a ``random.Random`` instance as they are
+today, the module level callables in ``random`` itself would change to be
+functions that, by default, delegated to the ``random.SystemRandom`` instance
+in ``random.system``.
+
+Calling any one of ``random.seed``, ``random.getstate``, or ``random.setstate``
+would change the delegation to instead refer to the ``random.Random`` instance
+in ``random.seedable``.
+
+Warning on implicit opt-in
+--------------------------
+
+In Python 3.6, implicitly opting in to the use of the seedable PRNG will emit a
+deprecation warning. This warning will suggest explicitly opting in to either
+the system RNG or the seedable PRNG. Possible wording:
+
+    "DeprecationWarning: Implicitly switching to the seedable PRNG. Consider
+    importing from random.system or random.seedable as appropriate"
+
+Whatever precise wording is chosen should have an answer added to Stack
+Overflow as was done for the custom error message that was added for missing
+parentheses in a call to print [#print]_.
+
+In the first Python 3 release after Python 2.7 switches to security fix only
+mode, the deprecation warning will be upgraded to a RuntimeWarning so it is
+visible by default.
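
As a rough sketch of the initial behaviour, a module level ``seed`` could emit the warning as shown below. The function body and the ``_MESSAGE`` and ``_seedable`` names are assumptions made for illustration; only the message text comes from the possible wording above.

```python
import random as _stdlib_random
import warnings

# Hypothetical stand-in for the instance in the proposed random.seedable.
_seedable = _stdlib_random.Random()

_MESSAGE = (
    "Implicitly switching to the seedable PRNG. Consider importing "
    "from random.system or random.seedable as appropriate"
)


def seed(a=None):
    """Hypothetical module level seed() that warns on implicit opt-in."""
    # stacklevel=2 attributes the warning to the caller of seed().
    warnings.warn(_MESSAGE, DeprecationWarning, stacklevel=2)
    _seedable.seed(a)


# DeprecationWarning is hidden by default in many contexts, so record it
# explicitly to show that it is raised.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    seed(12345)

for warning in caught:
    print(warning.category.__name__, warning.message)
```

Switching the category to ``RuntimeWarning`` in a later release would make the same message visible without any filter configuration.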
+
+This PEP does *not* propose removing the ability to seed the default RNG used
+process wide. Seeding the process wide default is a worse idea than explicitly
+importing from the appropriate submodule (hence the eventually
+visible-by-default warning), but it's also a concern that can be more
+readily addressed on a project-by-project basis.
+
+Documentation changes
+---------------------
+
+The ``random`` module documentation would be updated to move the documentation
+of the ``seed``, ``getstate`` and ``setstate`` interfaces later in the module,
+along with the associated security warning.
+
+The docs would gain a discussion of the respective use cases for the seedable
+PRNG (games, modelling & simulation, software testing) and the system RNG
+(cryptography, security token generation).
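
The two use cases can already be illustrated with the existing ``random.Random`` and ``random.SystemRandom`` classes. The seed value, alphabet and token length below are arbitrary choices for the example.

```python
import random
import string

# Seedable PRNG: two generators with the same seed produce the same
# sequence, which is what simulations and tests rely on.
sim_a = random.Random(2015)
sim_b = random.Random(2015)
rolls_a = [sim_a.randint(1, 6) for _ in range(10)]
rolls_b = [sim_b.randint(1, 6) for _ in range(10)]
print(rolls_a == rolls_b)

# System RNG: draws from the operating system's entropy source, so it is
# the appropriate choice for security tokens.
system_rng = random.SystemRandom()
alphabet = string.ascii_lowercase + string.digits
token = "".join(system_rng.choice(alphabet) for _ in range(32))
print(token)
```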
+
+Rationale
+=========
+
+Writing secure software under deadline and budget pressures is a hard problem.
+This is reflected in ongoing problems with data breaches involving personally
+identifiable information [#breaches]_, as well as with failures to take
+security considerations into account when new systems, like motor vehicles
+[#uconnect]_, are connected to the internet. Compounding the issue is the fact
+that a lot of the programming advice readily available on the internet
+[#search]_ simply doesn't take the mathematical arcana of computer security
+into account, and the fact that defenders have to cover *all* of their
+potential vulnerabilities, as a single mistake can make it possible to subvert
+other defences [#bcrypt]_.
+
+One of the factors that contributes to making this last aspect particularly
+difficult is APIs where using them inappropriately creates a *silent* security
+failure - one where the only way to find out that what you're doing is
+incorrect is for someone reviewing your code to say "that's a potential
+security problem", or for a system you're responsible for to be compromised
+through such an oversight (and your intrusion detection and auditing mechanisms
+are good enough for you to be able to figure out after the event how the
+compromise took place).
+
+This kind of situation is a significant contributor to "security fatigue",
+where developers (often rightly [#owasptopten]_) feel that security engineers
+spend all their time saying "don't do that the easy way, it creates a
+security vulnerability".
+
+As the designers of one of the world's most popular languages [#ieeetopten]_,
+we can help reduce that problem by making the easy way the right way (or at
+least the "not wrong" way) in more circumstances, so developers and security
+engineers can spend more time worrying about mitigating actually interesting
+threats, and less time fighting with default language behaviours.
+
+Discussion
+==========
+
+Why "seedable" over "deterministic"?
+------------------------------------
+
+This is a case where the meaning of a word as specialist jargon conflicts with
+the typical meaning of the word, even though it's *technically* the same.
+
+From a technical perspective, a "deterministic RNG" means that given knowledge
+of the algorithm and the current state, you can reliably compute arbitrary
+future states.
+
+The problem is that "deterministic" on its own doesn't convey those qualifiers,
+so it's likely to instead be interpreted as "predictable" or "not random" by
+folks that aren't familiar with the technical meaning.
+
+The other problem with "deterministic" as a description for the traditional RNG
+is that it doesn't tell you what you can *do* with the traditional RNG that you
+can't do with the system one.
+
+"seedable" aims to address both those problems, as it doesn't have a misleading
+common meaning, and it's a word form that means "you can seed this", which then
+leads naturally into an exploration of what it means to "seed" a random number
+generator.
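
A short example with the current ``random`` module shows both points: knowing the state of a ``random.Random`` instance is enough to reproduce its future outputs, and the state operations are simply unavailable on ``random.SystemRandom``. The variable names are illustrative only.

```python
import random

rng = random.Random()
rng.random()  # advance the generator to some arbitrary point

# Capture the current state, then draw some values.
state = rng.getstate()
actual = [rng.random() for _ in range(5)]

# Anyone holding the state can compute the same "future" values.
clone = random.Random()
clone.setstate(state)
predicted = [clone.random() for _ in range(5)]
print(predicted == actual)

# The system RNG has no state to capture or restore.
try:
    random.SystemRandom().getstate()
    system_has_state = True
except NotImplementedError:
    system_has_state = False
print(system_has_state)
```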
+
+Only changing the default for Python 3.6+
+-----------------------------------------
+
+Some other recent security changes, such as upgrading the capabilities of the
+``ssl`` module and switching to properly verifying HTTPS certificates by
+default, have been considered critical enough to justify backporting the
+change to all currently supported versions of Python.
+
+The difference in this case is one of degree - the additional benefits from
+rolling out this particular change a couple of years earlier than will
+otherwise be the case aren't sufficient to justify the additional effort and
+stability risks involved in making such an intrusive change in a maintenance
+release.
+
+Keeping the module level functions
+----------------------------------
+
+In addition to general backwards compatibility considerations, Python is
+widely used for educational purposes, and we specifically don't want to
+invalidate the wide array of educational material that assumes the availability
+of the current ``random`` module API. Accordingly, this proposal ensures that
+most of the public API can continue to be used not only without modification,
+but without generating any new warnings.
+
+Implicitly opting in to the deterministic RNG
+---------------------------------------------
+
+Python is widely used for modelling and simulation purposes, and in many cases,
+these software models won't have a dedicated maintenance team tasked with
+ensuring they keep working on the latest versions of Python.
+
+Using first a DeprecationWarning, and then eventually a RuntimeWarning, to
+advise against implicitly switching to the deterministic PRNG preserves
+compatibility with this existing software, while still nudging future users
+who need a deterministic generator towards importing ``random.seedable``
+explicitly.
+
+Avoiding the introduction of a userspace CSPRNG
+-----------------------------------------------
+
+The original discussion of this proposal on python-ideas [#csprng]_ suggested
+introducing a cryptographically secure pseudo-random number generator and using
+that by default, rather than defaulting to the relatively slow system random
+number generator.
+
+The problem [#nocsprng]_ with this approach is that it introduces an additional
+point of failure in security sensitive situations, for the sake of applications
+where the random number generation may not even be on a critical performance
+path.
+
+What about the performance impact?
+----------------------------------
+
+Rather than introducing a userspace CSPRNG, this PEP instead proposes that we
+accept the performance regression in cases where:
+
+* an application is using the module level random API
+* cryptographic quality randomness isn't needed
+* the application doesn't already implicitly opt back in to the deterministic
+  PRNG by calling ``random.seed``, ``random.getstate``, or ``random.setstate``
+* the application isn't updated to explicitly import from ``random.seedable``
+  rather than ``random``
+
+Applications that need cryptographic quality randomness should be using the
+system random number generator regardless of speed considerations, while other
+applications where speed is a more important consideration are better off with
+the current PRNG implementation than they would be with a new CSPRNG.
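
The size of the regression depends on the platform, so it is best measured rather than assumed. The sketch below is one way to compare the two existing generators with ``timeit``; the ``time_calls`` helper and the call count are arbitrary choices for illustration, and no particular result is implied.

```python
import random
import timeit

seedable_rng = random.Random()
system_rng = random.SystemRandom()


def time_calls(rng, number=10000):
    """Return the seconds taken for `number` calls to rng.random()."""
    return timeit.timeit(rng.random, number=number)


seedable_seconds = time_calls(seedable_rng)
system_seconds = time_calls(system_rng)
print("seedable: {:.4f}s  system: {:.4f}s".format(
    seedable_seconds, system_seconds))
```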
+
+Isn't the deterministic PRNG "secure enough"?
+---------------------------------------------
+
+In a word, "No" - that's why there's a warning in the module documentation
+that says not to use it for security sensitive purposes. While we're not
+currently aware of any studies of Python's random number generator specifically,
+studies of PHP's random number generator [#php]_ have demonstrated the ability
+to use weaknesses in that subsystem to facilitate a practical attack on
+password recovery tokens in popular PHP web applications.
+
+Security fatigue in the Python ecosystem
+----------------------------------------
+
+Over the past few years, the computing industry as a whole has been
+making a concerted effort to upgrade the shared network infrastructure we all
+depend on to a "secure by default" stance. As one of the most widely used
+programming languages for network service development (including the OpenStack
+Infrastructure-as-a-Service platform) and for systems administration
+on Linux systems in general, a fair share of that burden has fallen on the
+Python ecosystem, which is understandably frustrating for Pythonistas using
+Python in other contexts where these issues aren't of as great a concern.
+
+This consideration is one of the primary factors driving the backwards
+compatibility improvements in this proposal relative to the initial draft
+concept posted to python-ideas [#draft]_.
+
+Acknowledgements
+================
+
+* Theo de Raadt, for making the suggestion to Guido van Rossum that we
+  seriously consider defaulting to a cryptographically secure random number
+  generator
+* Serhiy Storchaka, Terry Reedy, Petr Viktorin, and anyone else in the
+  python-ideas threads that suggested the approach of transparently switching
+  to the ``random.Random`` implementation when any of the functions that only
+  make sense for a deterministic RNG are called
+* Nathaniel Smith for providing the reference on practical attacks against
+  PHP's random number generator when used to generate password reset tokens
+* Donald Stufft for pursuing additional discussions with network security
+  experts that suggested the introduction of a userspace CSPRNG would mean
+  additional complexity for insufficient gain relative to just using the
+  system RNG directly
+
+References
+==========
+
+.. [#breaches] Visualization of data breaches involving more than 30k records (each)
+   (http://www.informationisbeautiful.net/visualizations/worlds-biggest-data-breaches-hacks/)
+
+.. [#uconnect] Remote UConnect hack for Jeep Cherokee
+   (http://www.wired.com/2015/07/hackers-remotely-kill-jeep-highway/)
+
+.. [#php] PRNG based attack against password reset tokens in PHP applications
+   (https://media.blackhat.com/bh-us-12/Briefings/Argyros/BH_US_12_Argyros_PRNG_WP.pdf)
+
+.. [#search] Search link for "python password generator"
+   (https://www.google.com.au/search?q=python+password+generator)
+
+.. [#csprng] python-ideas thread discussing using a userspace CSPRNG
+   (https://mail.python.org/pipermail/python-ideas/2015-September/035886.html)
+
+.. [#draft] Initial draft concept that eventually became this PEP
+   (https://mail.python.org/pipermail/python-ideas/2015-September/036095.html)
+
+.. [#nocsprng] Safely generating random numbers
+   (http://sockpuppet.org/blog/2014/02/25/safely-generate-random-numbers/)
+
+.. [#ieeetopten] IEEE Spectrum 2015 Top Ten Programming Languages
+   (http://spectrum.ieee.org/computing/software/the-2015-top-ten-programming-languages)
+
+.. [#owasptopten] OWASP Top Ten Web Security Issues for 2013
+   (https://www.owasp.org/index.php/OWASP_Top_Ten_Project#tab=OWASP_Top_10_for_2013)
+
+.. [#print] Stack Overflow answer for missing parentheses in call to print
+   (http://stackoverflow.com/questions/25445439/what-does-syntaxerror-missing-parentheses-in-call-to-print-mean-in-python/25445440#25445440)
+
+.. [#bcrypt] Bypassing bcrypt through an insecure data cache
+   (http://arstechnica.com/security/2015/09/once-seen-as-bulletproof-11-million-ashley-madison-passwords-already-cracked/)
+
+Copyright
+=========
+
+This document has been placed in the public domain.
+
+

+..
+   Local Variables:
+   mode: indented-text
+   indent-tabs-mode: nil
+   sentence-end-double-space: t
+   fill-column: 70
+   coding: utf-8
+   End:

-- 
Repository URL: https://hg.python.org/peps

