[Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

Nick Coghlan ncoghlan at gmail.com
Mon Jun 5 23:49:23 EDT 2017


On 6 June 2017 at 10:59, Nathaniel Smith <njs at pobox.com> wrote:
> On Jun 5, 2017 7:01 AM, "Nick Coghlan" <ncoghlan at gmail.com> wrote:
>
> On 5 June 2017 at 21:44, Donald Stufft <donald at stufft.io> wrote:
>> Is pip allowed to use the hypothetical _ensurepip_ssl outside of
>> ensurepip?
>
> Yes, so something like _tls_bootstrap would probably be a better name
> for the helper module (assuming we go down the "private API to
> bootstrap 3rd party SSL modules" path).
>
>
> It seems like there's a risk here that we end up with two divergent copies
> of ssl.py and _ssl.c inside the python 2 tree, and that this will make it
> harder to do future bug fixing and backports. Is this going to cause
> problems? Is there any way to mitigate them?

Aye, I spent some time thinking about potentially viable
implementation architectures, and realised that any "private API"
style solution pretty much *has* to be structured as a monkeypatching
API to be effective. Otherwise there is too much risk of divergence in
things like exception definitions that end up causing cryptic "Why is
my SSL exception handler not catching my SSL connection error?" type
bugs.
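
As a rough illustration of that failure mode (a sketch only: the two stand-in modules and the handler_catches helper are hypothetical and are not part of any proposed API), two separate copies of an ssl-like module each define their own SSLError, so a handler written against one copy never sees errors raised through the other. Rebinding the name in place, monkeypatching style, removes the mismatch:

```python
import types


def make_ssl_like_module(name):
    """Build a stand-in module that defines its own SSLError class."""
    module = types.ModuleType(name)
    module.SSLError = type("SSLError", (Exception,), {})
    return module


# Hypothetical stand-ins for the stdlib module and a divergent private copy.
stdlib_copy = make_ssl_like_module("ssl_stdlib_copy")
private_copy = make_ssl_like_module("ssl_private_copy")


def handler_catches(raising_module, catching_module):
    """Return True if an error raised via one copy is caught via the other."""
    try:
        try:
            raise raising_module.SSLError("handshake failed")
        except catching_module.SSLError:
            return True
    except Exception:
        # The handler above did not match, so the error escaped it.
        return False


# Two independent class definitions: the handler misses the error.
diverged = handler_catches(private_copy, stdlib_copy)

# Sharing one class object in place makes the handler work again.
private_copy.SSLError = stdlib_copy.SSLError
shared = handler_catches(private_copy, stdlib_copy)
```

Here diverged ends up False and shared ends up True, which is the distinction a monkeypatching API is meant to preserve.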

The gist of the approach would be that for libraries and
non-bootstrapped applications, their SSL/TLS feature detection code
would ultimately look something like this:

    try:
        # Don't do anything special if we don't need to
        from ssl import MemoryBIO, SSLObject
    except ImportError:
        # Otherwise fall back to using PyOpenSSL
        try:
            from OpenSSL.SSL import Connection
        except ImportError:
            raise ImportError("Failed to import asynchronous SSL/TLS "
                              "support: <details>")

It's currently more complex than that in practice (since PyOpenSSL
doesn't natively offer an emulation of the affected parts of the
standard library's ssl module API), but that's what it would
effectively boil down to at the lowest level of any compatibility
wrapper.

Conveniently, this is *already* what libraries and non-bootstrapped
modules have to do if they're looking to support older Python versions
without requiring PyOpenSSL on newer releases, so there wouldn't
actually be any changes on that front.

By contrast, for bootstrapped applications (including pip) the
oversimplified compatibility import summary would instead look more
like this:

    try:
        # Don't do anything special if we don't need to
        from ssl import MemoryBIO, SSLObject
    except ImportError:
        # See if we can do a runtime in-place upgrade of the standard library
        try:
            import _tls_bootstrap
        except ImportError:
            # Otherwise fall back to using PyOpenSSL
            try:
                from OpenSSL.SSL import Connection
            except ImportError:
                raise ImportError("Failed to bootstrap asynchronous "
                                  "SSL/TLS support: <details>")
        else:
            _tls_bootstrap.monkeypatch_ssl()

The reason this kind of approach is really attractive to
redistributors from a customer risk management perspective is that,
like gevent's monkeypatching of synchronous networking APIs, it's
*opt-in at runtime*, so the risk of our accidentally inflicting it on
a customer that doesn't want it and doesn't need it is almost exactly
zero. If none of their own code includes the "import _tls_bootstrap;
_tls_bootstrap.monkeypatch_ssl()" invocation and none of their
dependencies start enabling it as an implicit side effect of some
other operation, they'll never even know the enhancement is there.
Instead, the compatibility risks get concentrated in the applications
relying on the bootstrapping API, since the monkeypatching process is
a potential new source of bugs that don't exist in the more
conventional execution models.
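
To make the opt-in property concrete, here is a minimal sketch of an in-place patching helper. Everything in it is an assumption for illustration: monkeypatch_module, the old_ssl stand-in module, and the empty placeholder classes are hypothetical, and nothing here reflects how a real _tls_bootstrap.monkeypatch_ssl() would be implemented. The point is only that the target module is unchanged until the helper is explicitly called, and that names which already exist (such as the exception class) are left alone:

```python
import types

# Stand-in for an older ssl module that lacks the newer names.
old_ssl = types.ModuleType("old_ssl_demo")
old_ssl.SSLError = type("SSLError", (Exception,), {})


# Placeholder classes; a real helper would supply working implementations.
class MemoryBIO(object):
    pass


class SSLObject(object):
    pass


def monkeypatch_module(target, additions):
    """Add missing names to target in place, leaving existing ones alone."""
    added = []
    for name, value in sorted(additions.items()):
        if not hasattr(target, name):
            setattr(target, name, value)
            added.append(name)
    return added


original_error = old_ssl.SSLError

# Nothing changes until the application opts in by calling the helper.
before = hasattr(old_ssl, "MemoryBIO")

added = monkeypatch_module(
    old_ssl,
    {"MemoryBIO": MemoryBIO, "SSLObject": SSLObject, "SSLError": Exception},
)
after = hasattr(old_ssl, "MemoryBIO")
```

Code that never calls the helper sees the module exactly as it was shipped, and code that does call it keeps the same SSLError class its existing handlers were written against.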

The reason that's still preferable, though, is that it means the
monkeypatching demonstrably doesn't have to be 100% perfect: it just
has to be close enough to the way the Python 3 standard library API
works to meet the needs of the applications that actually use it.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

