
Hi,
Python only supports IDNA 2003 for internationalized domain names. I'm starting to consider Python's lack of IDNA 2008 support a security issue for DNS lookups and for cert validation. Applications may connect to the wrong IP address and then validate the certificate against the wrong hostname, too. IDNA 2008 is mandatory for German .de domains. See https://bugs.python.org/issue17305
Wrong:
>>> import socket
>>> u'straße.de'.encode('idna')
'strasse.de'
>>> socket.gethostbyname(u'straße.de'.encode('idna'))
'72.52.4.119'
Correct:
>>> import idna
>>> idna.encode(u'straße.de')
'xn--strae-oqa.de'
>>> socket.gethostbyname(idna.encode(u'straße.de'))
'81.169.145.78'
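For readers on Python 3, the difference can be reproduced without any network access. The following is only a sketch: the helper `naive_alabel` is a hypothetical name introduced here, and it builds the A-label by hand with the standard library's `punycode` codec. It performs none of the validation that IDNA 2008 requires, so it illustrates the two results rather than replacing the `idna` package.

```python
# The built-in "idna" codec implements IDNA 2003, whose nameprep step
# maps "ß" to "ss" before the name is encoded.
name = "straße.de"

idna2003 = name.encode("idna")
print(idna2003)


def naive_alabel(label: str) -> str:
    """Hypothetical helper: Punycode-encode a label without any IDNA checks."""
    if label.isascii():
        return label
    return "xn--" + label.encode("punycode").decode("ascii")


# Illustration only: keeps "ß" as-is, which is what IDNA 2008 does for this name.
idna2008_like = ".".join(naive_alabel(part) for part in name.split("."))
print(idna2008_like)
```

The two printed names differ, which is why the two lookups in the session above can resolve to different hosts.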
I have neither the time nor the expertise to implement IDNA 2008. Ticket 17305 is more than three years old, too.
Christian

Ah, I recently read an article about IDNA: Firefox uses IDNA 2008, Chrome uses IDNA 2003. Depending on the browser, you may or may not reach the domain https://ssz.fr/ :-)
So at least the issue is not specific to Python.
Is it possible to support both IDNA versions at the same time by default? Or are the two versions mutually exclusive?
Article in French: https://linuxfr.org/news/bilan-a-un-an-des-domaines-fr-d-une-et-deux-lettres...
Victor

On 2016-10-11 17:47, Victor Stinner wrote:
> Ah, I recently read an article about IDNA: Firefox uses IDNA 2008, Chrome uses IDNA 2003. Depending on the browser, you may or may not reach the domain https://ssz.fr/ :-)
> So at least the issue is not specific to Python.
> Is it possible to support both IDNA versions at the same time by default? Or are the two versions mutually exclusive?
> Article in French: https://linuxfr.org/news/bilan-a-un-an-des-domaines-fr-d-une-et-deux-lettres...
Yes, Chrome uses the wrong IDNA version, as do other libraries. PyOpenSSL and cryptography work around the bug by using the idna Python package. It's too bad that nobody has contributed the code back to Python core.
We can have different IDNA variants by giving each one a distinct name, e.g. idna2003, idna2008 etc.
MvL laid out a plan for IDNA support in ticket https://bugs.python.org/issue17305
1. Python should implement both IDNA2008 and UTS#46, and keep IDNA2003.
2. "idna" should become an alias for "idna2003".
3. The socket module and all other places that use the "idna" encoding should use "uts46" instead.
4. Pre-existing implementations of IDNA 2008 should be used as inspirations at best; Python will need a new implementation from scratch, one that puts all relevant tables into the unicodedata module if they aren't there already. This is in particular where the idna 0.1 library fails. The implementation should refer to the relevant parts of the specification, to be easily reviewable for correctness.
Christian

It should be noted that the security vulnerability here is really only related to “connect to the wrong IP address”, not to “validate the hostname”. The cert validation is not wrong: the user asked to connect to “straße.de” and was connected to “strasse.de”. The server must still present a certificate that is valid for “strasse.de”: if it presents one that is valid for “xn--strae-oqa.de” then validation will fail. It is not possible to use this as a MITM vector in that sense, though it is clearly still possible to catch unsuspecting users, in the same way that registering goggle.com and getting a cert for it may confuse users who mis-type google.com.
To the best of my knowledge there is no ASCII domain name such that Unicode domain name X encodes to it under IDNA 2003 while a different Unicode domain name Y encodes to it under IDNA 2008, which is the only way I can see there being a security hole with cert validation.
This is nevertheless an annoying bug that does pose some security risks, though I’d consider them minor, and it’s one that I’d like to see fixed for Requests if nothing else. Any users monitoring this list who would like to mitigate it today should IDNA-encode their domain names before passing them to any Python HTTP client that does not depend on the PyPI IDNA library (which is to say, all of them).
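The mitigation could look roughly like the sketch below. The helper `with_encoded_host` and its `encode_host` parameter are hypothetical names introduced for illustration; the encoder is passed in so that the example runs even when the PyPI idna package is not installed. For simplicity the helper drops any userinfo (`user:pass@`) from the URL.

```python
from urllib.parse import urlsplit, urlunsplit

try:
    import idna  # third-party PyPI package mentioned in this thread
except ImportError:
    idna = None


def with_encoded_host(url, encode_host):
    """Hypothetical helper: rewrite the host of `url` using `encode_host`."""
    parts = urlsplit(url)
    host = encode_host(parts.hostname)
    # Re-attach an explicit port if the URL had one; userinfo is not preserved.
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


if idna is not None:
    # idna.encode returns bytes, so decode to get a str hostname.
    safe_url = with_encoded_host(
        "https://straße.de/", lambda host: idna.encode(host).decode("ascii")
    )
    print(safe_url)
```

The resulting URL contains only ASCII in the host part, so the HTTP client has no encoding decision left to make.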
Cory
participants (3)
- Christian Heimes
- Cory Benfield
- Victor Stinner