[issue3300] urllib.quote and unquote - Unicode issues

Bill Janssen report at bugs.python.org
Wed Aug 13 19:05:21 CEST 2008


Bill Janssen <bill.janssen at gmail.com> added the comment:

Erik van der Poel at Google has now chimed in with stats on current URL
usage:

``...the bottom line is that escaped non-utf-8 is still quite prevalent,
enough (in my opinion) to require an implementation in Python, possibly
even allowing for different encodings in the path and query parts (e.g.
utf-8 path and gb2312 query).''

http://lists.w3.org/Archives/Public/www-international/2008JulSep/0042.html

I think it's worth remembering that a very large proportion of the calls
to Python's urllib.unquote() come from Web server frameworks of one sort
or another.  We can't control what the browsers that talk to such
frameworks produce; the IETF doesn't control that, either.  In this case,
"practicality beats purity" is the clarion call of the browser designers,
and we'd better be able to support them.
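
To make the path/query point concrete, here is a minimal sketch of what
per-part decoding could look like on the server side.  It assumes the
Python 3 spelling, urllib.parse.unquote() with its encoding and errors
parameters; decode_url_parts is a hypothetical helper invented for this
illustration, and example.com and the sample strings are made up.  It is
not a proposal for the final API.

```python
from urllib.parse import quote, unquote, urlsplit


def decode_url_parts(url, path_encoding="utf-8", query_encoding="utf-8"):
    """Hypothetical helper: unquote path and query with separate encodings."""
    parts = urlsplit(url)
    # errors="replace" turns undecodable bytes into U+FFFD instead of raising.
    path = unquote(parts.path, encoding=path_encoding, errors="replace")
    query = unquote(parts.query, encoding=query_encoding, errors="replace")
    return path, query


# Build a URL the way a browser might: utf-8 path, gb2312 query.
path_escaped = quote("/搜索", encoding="utf-8")
query_escaped = quote("中文", encoding="gb2312")
url = "http://example.com" + path_escaped + "?q=" + query_escaped

# Decoding each part with its own encoding recovers the original text.
path, query = decode_url_parts(url, query_encoding="gb2312")

# Assuming utf-8 everywhere mangles the gb2312 query.
_, mangled_query = decode_url_parts(url)
```

With a single assumed encoding the query comes back containing
replacement characters, which is the failure a framework hits when it
cannot choose the encoding per part.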

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3300>
_______________________________________

