[issue1712522] urllib.quote throws exception on Unicode URL

Matt Giuca report at bugs.python.org
Mon Jul 19 16:26:33 CEST 2010


Matt Giuca <matt.giuca at gmail.com> added the comment:

OK, sure, some other things are broken, but they mostly deal with binary data rather than string data (for example, zlib expects a sequence of bytes, not characters).

Just one quick point:

> urllib.urlretrieve("file:///tmp/hé")
> UnicodeError: URL u'file:///tmp/h\xc3\xa9' contains non-ASCII characters

That's precisely the correct behaviour. URLs are not allowed to contain non-ASCII characters (that's the whole point of urllib.quote). urllib.quote should accept non-ASCII characters (for conversion into ASCII strings). Other URL processing functions should not accept non-ASCII characters, since strings containing them aren't valid URIs.
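
A minimal sketch of that division of labour, written against Python 3's urllib.parse.quote (the Python 3 counterpart of urllib.quote). The helper name to_ascii_url is hypothetical, and UTF-8 is assumed as the encoding; the point is only that quoting happens first, and the ASCII-only result is what gets handed to the other URL functions.

```python
from urllib.parse import quote


def to_ascii_url(path):
    """Hypothetical helper: build an ASCII-only file URL from a text path."""
    # quote() percent-encodes non-ASCII characters (as UTF-8 by default)
    # and leaves '/' unescaped, so the path structure is preserved.
    return "file://" + quote(path)


url = to_ascii_url("/tmp/hé")

# After quoting, every character in the URL is plain ASCII.
is_ascii = all(ord(c) < 128 for c in url)
print(url, is_ascii)
```

Only the quoted form would then be passed to a function such as urlretrieve; the unquoted string is rejected by design.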

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue1712522>
_______________________________________

