[Python-Dev] [ssl] The weird case of IDNA
Guido van Rossum
guido at python.org
Fri Dec 29 23:46:12 EST 2017
This being a security issue, I think it's okay to break 3.6. We might even
backport to 3.5 if it's easy?
On Dec 29, 2017 1:59 PM, "Christian Heimes" <christian at python.org> wrote:
> This mail is about internationalized domain names and TLS/SSL. It
> doesn't concern you if you live in ASCII-land. A couple of other
> developers and I would like to change the ssl module in a
> backwards-incompatible way to fix IDN support for TLS/SSL.
> Simply speaking, the IDNA standards (internationalized domain names for
> applications) describe how to encode non-ASCII domain names. The DNS
> system and X.509 certificates cannot handle non-ASCII host names. Any
> non-ASCII part of a hostname is Punycode-encoded. For example, the host
> name 'www.bücher.de' (books) is translated into 'www.xn--bcher-kva.de'. In
> IDNA terms, 'www.bücher.de' is called an IDN U-label (Unicode) and
> 'www.xn--bcher-kva.de' an IDN A-label (ASCII). Please refer to the TR46
> document [1] for more information.
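As a minimal sketch of the U-label/A-label conversion described above, the snippet below uses only the 'idna' codec built into CPython (which implements IDNA 2003). The variable names are illustrative and not part of the original mail.

```python
# Convert a U-label (Unicode) host name to its A-label (ASCII) form and back,
# using CPython's built-in 'idna' codec (IDNA 2003).
u_label = "www.bücher.de"

a_label = u_label.encode("idna")      # bytes, pure ASCII
print(a_label)                        # b'www.xn--bcher-kva.de'

round_trip = a_label.decode("idna")   # back to the Unicode form
print(round_trip)                     # www.bücher.de
```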
> In a perfect world, it would be very simple: we'd have only one IDNA
> standard. However, there are multiple standards that are incompatible
> with each other. The German TLD .de demands IDNA 2008 with UTS #46
> compatibility mapping. There, the hostname 'www.straße.de' maps to
> 'www.xn--strae-oqa.de'. In the older IDNA 2003 standard, however,
> 'www.straße.de' maps to 'www.strasse.de', and 'strasse.de' is a totally
> different domain!
> CPython only supports IDNA 2003.
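The difference between the two standards can be reproduced with a short sketch. It assumes the third-party 'idna' package from PyPI may or may not be installed, so the IDNA 2008 result is computed only when the import succeeds; according to the mail, that result is the 'xn--strae-oqa' form.

```python
# Compare the stdlib codec (IDNA 2003) with the third-party 'idna' package
# (IDNA 2008) on a host name that contains 'ß'.
name = "www.straße.de"

# IDNA 2003: nameprep case-folds 'ß' to 'ss', so the result is plain ASCII.
idna2003 = name.encode("idna")
print(idna2003)                       # b'www.strasse.de'

try:
    import idna                       # third-party package from PyPI (optional)
except ImportError:
    idna2008 = None                   # package not installed; nothing to compare
else:
    idna2008 = idna.encode(name)      # IDNA 2008 keeps 'ß' and Punycode-encodes it
print(idna2008)
```

The two encodings name different domains, which is why a client that silently applies IDNA 2003 can end up talking to the wrong host.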
> It's less of an issue for the socket module. It only converts text to
> IDNA bytes on the way in. All functions support bytes and text. Since
> IDNA encoding does not change ASCII and IDNA-encoded data is ASCII, it is
> also no problem to pass IDNA 2008-encoded text or bytes to all socket
> functions:
> >>> import socket
> >>> import idna # from PyPI
> >>> names = ['straße.de', b'strasse.de', idna.encode('straße.de'),
> ...          idna.encode('straße.de').decode('ascii')]
> >>> for name in names:
> ...     print(name, socket.getaddrinfo(name, None, socket.AF_INET,
> ...           socket.SOCK_STREAM, 0, socket.AI_CANONNAME)[0][3:5])
> ...
> straße.de ('strasse.de', ('184.108.40.206', 0))
> b'strasse.de' ('strasse.de', ('220.127.116.11', 0))
> b'xn--strae-oqa.de' ('xn--strae-oqa.de', ('18.104.22.168', 0))
> xn--strae-oqa.de ('xn--strae-oqa.de', ('22.214.171.124', 0))
> As you can see, 'straße.de' is canonicalized as 'strasse.de'. The IDNA
> 2008 encoded hostname maps to a different IP address.
> On the other hand, the ssl module is currently completely broken. It
> converts hostnames from bytes to text with the 'idna' codec in some places,
> but not in all. The SSLSocket.server_hostname attribute and the server name
> passed to the SSLContext.set_servername_callback() callback are decoded as
> U-labels. A certificate's common name and subject alternative name fields
> are not decoded and are therefore A-labels. They *must* stay A-labels
> because hostname verification is only defined in terms of A-labels. We even
> had a security issue once, because a partial wildcard like 'xn*.example.org'
> must not match IDN hosts like 'xn--bcher-kva.example.org'.
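The wildcard rule above can be sketched as follows. This is a simplified, hypothetical matcher written for illustration (the function name `wildcard_matches` is an assumption); it is not the ssl module's actual implementation. It compares A-labels only, accepts '*' in the leftmost label only, and refuses partial wildcards whenever an 'xn--' label is involved.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Hypothetical, simplified certificate-name matcher operating on A-labels."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False                      # a wildcard never spans several labels

    p_first, p_rest = p_labels[0], p_labels[1:]
    h_first, h_rest = h_labels[0], h_labels[1:]

    # Wildcards are only accepted in the leftmost label.
    if any("*" in label for label in p_rest) or p_rest != h_rest:
        return False

    if "*" not in p_first:
        return p_first == h_first         # plain label: exact comparison
    if p_first == "*":
        return h_first != ""              # full wildcard: any non-empty label

    # Partial wildcard: never match when an IDN A-label is involved.
    if p_first.startswith("xn--") or h_first.startswith("xn--"):
        return False
    if p_first.count("*") > 1:
        return False
    prefix, suffix = p_first.split("*")
    return (len(h_first) >= len(prefix) + len(suffix)
            and h_first.startswith(prefix)
            and h_first.endswith(suffix))


print(wildcard_matches("xn*.example.org", "xn--bcher-kva.example.org"))  # False
print(wildcard_matches("*.example.org", "xn--bcher-kva.example.org"))    # True
```

Because the comparison happens on A-labels, the 'xn--' prefix check is enough to spot IDN labels; the same check would not be possible on decoded U-labels.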
> In issue [2] and PR [3], we all agreed that the only sensible fix is to
> make 'SSLSocket.server_hostname' an ASCII text A-label. But this is a
> backwards-incompatible fix. On the other hand, IDNA is totally broken
> without the fix. Also, in my opinion, PR [3] does not go far enough.
> Since we have to break backwards compatibility anyway, I'd like to
> modify SSLContext.set_servername_callback() at the same time.
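Code that has to cope with both behaviours (a U-label today, an ASCII A-label after the proposed change) could normalize the server name before comparing it with anything else. The helper below is a hypothetical sketch, not part of the ssl module; it relies on the stdlib 'idna' codec and therefore on IDNA 2003 rules, so names such as 'straße.de' would still be folded to 'strasse.de'.

```python
def to_a_label(name):
    """Hypothetical helper: return a host name as ASCII A-label text."""
    if isinstance(name, bytes):
        # Bytes are assumed to hold an already encoded A-label.
        return name.decode("ascii")
    # ASCII text passes through unchanged; non-ASCII labels are encoded
    # with the stdlib codec (IDNA 2003).
    return name.encode("idna").decode("ascii")


print(to_a_label("www.bücher.de"))            # www.xn--bcher-kva.de
print(to_a_label("www.xn--bcher-kva.de"))     # www.xn--bcher-kva.de
print(to_a_label(b"www.xn--bcher-kva.de"))    # www.xn--bcher-kva.de
```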
> - Is everybody OK with breaking backwards compatibility? The risk is
>   small. ASCII-only domains are not affected and IDNA users are already
>   broken.
> - Should I only fix 3.7 or should we consider a backport to 3.6, too?
> [1] https://www.unicode.org/reports/tr46/
> [2] https://bugs.python.org/issue28414
> [3] https://github.com/python/cpython/pull/3010