I've just submitted PEP 476, on enabling certificate validation by default for HTTPS clients in Python. Please have a look and let me know what you think.
PEP text follows.
PEP: 476
Title: Enabling certificate verification by default for stdlib http clients
Version: $Revision$
Last-Modified: $Date$
Author: Alex Gaynor firstname.lastname@example.org
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 28-August-2014
Currently when a standard library http client (the ``urllib`` and ``http`` modules) encounters an ``https://`` URL it will wrap the network HTTP traffic in a TLS stream, as is necessary to communicate with such a server. However, during the TLS handshake it will not actually check that the server has an X509 certificate that is signed by a CA in any trust root, nor will it verify that the Common Name (or Subject Alternative Name) on the presented certificate matches the requested host.
The failure to do these checks means that anyone with a privileged network position is able to trivially execute a man-in-the-middle attack against a Python application using either of these HTTP clients, and change traffic at will.
This PEP proposes to enable verification of X509 certificate signatures, as well as hostname verification for Python's HTTP clients by default, subject to opt-out on a per-call basis.
The "S" in "HTTPS" stands for secure. When Python's users type "HTTPS" they are expecting a secure connection, and Python should adhere to a reasonable standard of care in delivering this. Currently we are failing at this, and in doing so, APIs which appear simple are misleading users.
When asked, many Python users state that they were not aware that Python failed to perform these validations, and are shocked.
The popularity of ``requests`` (which enables these checks by default) demonstrates that these checks are not overly burdensome in any way, and the fact that it is widely recommended as a major security improvement over the standard library clients demonstrates that many expect a higher standard for "security by default" from their tools.
The failure of various applications to note Python's negligence in this matter is a source of *regular* CVE assignment [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_ [#]_.
.. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2010-4340 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-3533 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5822 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2012-5825 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-1909 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2037 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2073 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-2191 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-4111 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-6396 .. [#] https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2013-6444
Technical Details
=================
Python would use the system provided certificate database on all platforms. Failure to locate such a database would be an error, and users would need to explicitly specify a location to fix it.
This can be achieved by simply replacing the use of ``ssl._create_stdlib_context`` with ``ssl.create_default_context`` in ``http.client``.
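As a rough illustration of what that swap changes, the sketch below compares a context from ``ssl.create_default_context`` with a hand-built context that skips verification. The hand-built context is only an assumption standing in for the current non-verifying behaviour; it is not the code ``http.client`` actually uses, and the ``describe`` helper is hypothetical.

```python
import ssl

# Context with the proposed defaults: the certificate chain is validated
# against the trust store and the hostname is checked.
verified = ssl.create_default_context()

# Hand-built stand-in for a non-verifying context (an assumption made for
# this illustration). check_hostname must be turned off before verify_mode
# can be set to CERT_NONE.
unverified = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
unverified.check_hostname = False
unverified.verify_mode = ssl.CERT_NONE


def describe(ctx):
    """Return the two settings that matter for this PEP (hypothetical helper)."""
    return {
        "verify_mode": ctx.verify_mode.name,
        "check_hostname": ctx.check_hostname,
    }


print("verified:  ", describe(verified))
print("unverified:", describe(unverified))
```

Only the first context would reject a certificate that is self-signed, expired, or issued for a different host.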
Trust database
--------------
This PEP proposes using the system-provided certificate database. Previous discussions have suggested bundling Mozilla's certificate database and using that by default. This was decided against for several reasons:
* Using the platform trust database imposes a lower maintenance burden on the
  Python developers -- shipping our own trust database would require doing a
  release every time a certificate was revoked.
* Linux vendors, and other downstreams, would unbundle the Mozilla
  certificates, resulting in a more fragmented set of behaviors.
* Using the platform stores makes it easier to handle situations such as
  corporate internal CAs.
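To see which platform trust database a given installation would pick up, something like the following can be run. It is a minimal sketch that only inspects the local configuration; the values printed depend on the platform and on how OpenSSL was built.

```python
import ssl

# Locations OpenSSL consults for trusted CA certificates on this system.
paths = ssl.get_default_verify_paths()
print("CA file:     ", paths.cafile)
print("CA directory:", paths.capath)

# A default context loads the system trust store. Certificates found in a
# CA directory are loaded lazily, so the counts below may be lower than the
# number of certificates actually trusted.
ctx = ssl.create_default_context()
stats = ctx.cert_store_stats()
print("Loaded CA certificates:", stats["x509_ca"])
```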
Backwards compatibility
-----------------------
This change will have the appearance of causing some HTTPS connections to "break", because they will now raise an Exception during handshake.
This is misleading, however: in fact these connections are presently failing silently. An HTTPS URL indicates an expectation of confidentiality and authentication. The fact that Python does not actually verify that the user's request has been made under those guarantees is a bug. Further: "Errors should never pass silently."
Nevertheless, users who have a need to access servers with self-signed or incorrect certificates would be able to do so by providing a context with custom trust roots or which disables validation (documentation should strongly recommend the former where possible). Users will also be able to add necessary certificates to system trust stores in order to trust them globally.
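The two escape hatches might look roughly like this. The helper names, the CA bundle path parameter, and the ``internal.example`` host are hypothetical, and the sketch assumes a Python version in which ``urllib.request.urlopen`` and ``http.client.HTTPSConnection`` accept a ``context`` argument.

```python
import http.client
import ssl
import urllib.request


def context_with_custom_roots(ca_bundle_path):
    """Preferred option: keep verification on, but trust a specific CA bundle.

    ca_bundle_path is a hypothetical path to a PEM file holding the
    self-signed certificate or internal CA.
    """
    return ssl.create_default_context(cafile=ca_bundle_path)


def context_without_verification():
    """Last resort: disable both hostname checking and chain validation."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False  # must be disabled before verify_mode
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def fetch(url, context):
    """Opt out (or in) for a single call by passing the context explicitly."""
    with urllib.request.urlopen(url, context=context) as response:
        return response.read()


# The same context can be given to the lower-level client. No network
# traffic happens until a request is actually issued.
conn = http.client.HTTPSConnection(
    "internal.example", context=context_without_verification()
)
```

Because the context is passed per call, the rest of the application keeps the verifying default.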
Twisted's 14.0 release made this same change, and it has been met with almost no opposition.
Other protocols
===============
This PEP only proposes requiring this level of validation for HTTP clients, not for other protocols such as SMTP.
This is because while a high percentage of HTTPS servers have correct certificates, as a result of the validation performed by browsers, for other protocols self-signed or otherwise incorrect certificates are far more common. Note that for SMTP at least, this appears to be changing and should be reviewed for a potential similar PEP in the future:
* https://www.facebook.com/notes/protect-the-graph/the-current-state-of-smtp-starttls-deployment/1453015901605223
* https://www.facebook.com/notes/protect-the-graph/massive-growth-in-smtp-starttls-deployment/1491049534468526
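Although this PEP leaves SMTP defaults alone, a client that wants the same checks today can opt in by passing a verifying context itself. The following is a sketch only: ``open_verified_smtp`` is a hypothetical helper, and port 587 is an assumption about the server being contacted.

```python
import smtplib
import ssl


def open_verified_smtp(host, port=587):
    """Connect and upgrade to TLS with certificate and hostname checks.

    Raises an ssl.SSLError subclass if the server's certificate does not
    validate against the system trust store or does not match the host.
    """
    context = ssl.create_default_context()
    client = smtplib.SMTP(host, port)
    try:
        client.starttls(context=context)
        client.ehlo()  # re-identify over the now-encrypted channel
    except Exception:
        client.close()
        raise
    return client
```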
Python Versions
===============
This PEP proposes making these changes to the ``default`` (Python 3) branch. I strongly believe these changes also belong in Python 2, but doing them in a patch release isn't reasonable, and there is strong opposition to doing a 2.8 release.
This document has been placed into the public domain.
..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8