[New-bugs-announce] [issue36742] urlsplit doesn't accept a NFKD hostname with a port number

Chihiro Ito report at bugs.python.org
Sat Apr 27 08:30:16 EDT 2019


New submission from Chihiro Ito <hokousya at sourcewalker.com>:

urllib.parse.urlsplit raises an exception for a URL that includes a non-ASCII hostname in NFKD form and a port number.

example:
>>> import unicodedata
>>> from urllib.parse import urlsplit
>>> urlsplit('http://\u30d5\u309a:80')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ito/.maltybrew/deen/lib/python3.7/urllib/parse.py", line 437, in urlsplit
    _checknetloc(netloc)
  File "/Users/ito/.maltybrew/deen/lib/python3.7/urllib/parse.py", line 407, in _checknetloc
    "characters under NFKC normalization")
ValueError: netloc 'プ:80' contains invalid characters under NFKC normalization
>>> urlsplit('http://\u30d5\u309a')
SplitResult(scheme='http', netloc='プ', path='', query='', fragment='')
>>> urlsplit(unicodedata.normalize('NFKC', 'http://\u30d5\u309a:80'))
SplitResult(scheme='http', netloc='プ:80', path='', query='', fragment='')

I believe this behavior was introduced in Python 3.7.3. Python 3.7.2 doesn't raise any exception for these lines.
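
As the last call in the session above suggests, normalizing the URL to NFKC before splitting it avoids the exception. A minimal sketch of that workaround follows. The helper name urlsplit_nfkc is hypothetical and not part of urllib. It normalizes the whole URL, so it assumes that changing characters in the path or query under NFKC is acceptable for the caller.

```python
import unicodedata
from urllib.parse import SplitResult, urlsplit


def urlsplit_nfkc(url: str) -> SplitResult:
    """Hypothetical helper: NFKC-normalize the URL, then split it.

    Note: this normalizes the entire string, not only the hostname,
    so path and query characters may also change.
    """
    return urlsplit(unicodedata.normalize("NFKC", url))


# U+30D5 followed by U+309A composes to U+30D7 under NFKC.
result = urlsplit_nfkc("http://\u30d5\u309a:80")
print(result.netloc == "\u30d7:80")
print(result.port)
```

Normalizing only the netloc would be more conservative, but it requires locating the netloc before urlsplit has parsed the URL, so this sketch takes the simpler route.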

----------
components: Unicode
messages: 340983
nosy: ezio.melotti, hokousya, vstinner
priority: normal
severity: normal
status: open
title: urlsplit doesn't accept a NFKD hostname with a port number
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue36742>
_______________________________________

