[issue1712522] urllib.quote throws exception on Unicode URL

Matt Giuca report at bugs.python.org
Thu Jul 22 04:18:55 CEST 2010

Matt Giuca <matt.giuca at gmail.com> added the comment:

If you're going with option 2, I would strongly advise against relying on the KeyError. The fact that urllib.quote raises a KeyError is not part of its specification; it's a bug/quirk in the implementation (which is now unlikely to be changed, but it's still unsafe to rely on).

Robotparser should encode the string, if and only if it is a unicode string, with ('ascii', 'strict'), catch the UnicodeEncodeError, and raise the TypeError you suggested. This will have precisely the same behaviour as your proposed option 2 (will work fine for byte strings and Unicode strings with ASCII-only characters, but raise a TypeError on Unicode strings with non-ASCII characters) without relying on the KeyError from urllib.quote.
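
A minimal sketch of that check, for illustration only. The helper name ensure_ascii_bytes, the text_type alias and the error message are assumptions made for this example, not part of robotparser or urllib; the alias only exists so the same sketch runs where the text type is called str rather than unicode.

```python
import sys

# Assumption: pick the text (Unicode) string type for the running interpreter.
# On Python 2 this is `unicode`; on Python 3 it is `str`.
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821


def ensure_ascii_bytes(url):
    """Hypothetical helper: return `url` as a byte string, or raise TypeError.

    Byte strings pass through untouched. Unicode strings are encoded with
    ('ascii', 'strict'); a non-ASCII character becomes a TypeError instead
    of leaking whatever exception urllib.quote happens to raise.
    """
    if isinstance(url, text_type):
        try:
            return url.encode('ascii', 'strict')
        except UnicodeEncodeError:
            raise TypeError(
                "URL must be a byte string or an ASCII-only Unicode string")
    return url
```

The caller would then pass the returned byte string to urllib.quote, so the quoting function only ever sees input it is known to handle.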


Python tracker <report at bugs.python.org>
