[docs] [issue36868] New behavior of OpenSSL hostname verification not exposed, incorrectly documented
report at bugs.python.org
Thu May 9 14:15:47 EDT 2019
New submission from Josh Rosenberg <shadowranger+python at gmail.com>:
The What's New in 3.7 docs mention the change from #31399 to use OpenSSL's built-in hostname verification ( https://docs.python.org/3/whatsnew/3.7.html#ssl ), but aside from that, information about the change is largely undiscoverable and/or wrong.
1. The What's New docs repeatedly mention SSLContext.host_flags as the means of modifying behavior. The actual property is underscore prefixed though, as SSLContext._host_flags. Since SSLContext supports the creation of arbitrary names via __dict__, assigning to context.host_flags silently "works", it just fails to *do* anything (nothing ever reads it).
2. None of the flags are documented anywhere; the only way to discover them is to import _ssl (they're not exposed on ssl itself, just the internal C extension), then scan through the exposed names (they're all prefixed with HOSTFLAG_ AFAICT)
3. All of the flags are raw numeric values, but it seems like they should be IntEnums, like the other flags exposed by SSL (among other things, it would make it much easier to interpret the default _host_flags (currently it's just 4, when it could display as <HostFlags.HOSTFLAG_NO_PARTIAL_WILDCARDS: 4>)
4. Nothing about this change, _host_flags/host_flags, or the values of the flags themselves is mentioned on the ssl docs at all.
5. This unintentionally made a behavioral change (one that bit me, and may bite other folks using docker swarm, NETBIOS hostnames, etc.). Python's match_hostname implementation was fine with host names containing underscores (e.g. if the cert was wildcarded to *.example.com, it would match a_b_c.example.com just fine); they're not technically legal by the strict reading of the specs for host names (they're apparently legal for domain names, but not host names, which differ in ways I don't fully understand), but stuff like docker swarm names their services that way automatically, most (all?) browsers support visiting them, etc. It looks like OpenSSL (at least the 1.1.0g my Python 3.7.2 was built against) treats underscores as unmatchable, so any attempt to connect to such a host name in Python 3.7.2 dies with a SSLCertVerificationError, claiming a "Hostname mismatch, certificate is not valid for 'name_with_underscores.example.com'."
I discovered all this because 3.7 broke some scripts I use to connect to docker swarm services. Before I realized the issue was underscores, I was trying to figure out how to tweak the host name checks (assuming maybe something was broken with wildcard matching), and stumbled across all the other issues with the docs, the lack of flag definition exposure, etc.
For the record, I think it's reasonable to require legal host names (it was easy enough to fix for my case; I just updated our docker DNS server to provide aliases using only hyphens and changed the script to use the alias host names), but it would be nice if it was explicitly documented, and ideally, that Python itself recognize that underscores won't work and explicitly raise an exception saying why, rather than letting OpenSSL perform the rejection with a (to someone who doesn't know about the underscore issue) confusing error message.
assignee: docs at python
components: Documentation, Extension Modules, Library (Lib), SSL
nosy: christian.heimes, docs at python, josh.r, vstinner
title: New behavior of OpenSSL hostname verification not exposed, incorrectly documented
versions: Python 3.7, Python 3.8
Python tracker <report at bugs.python.org>
More information about the docs