[issue3300] urllib.quote and unquote - Unicode issues

Tom Pinckney report at bugs.python.org
Wed Jul 9 17:20:50 CEST 2008


Tom Pinckney <thomaspinckney3 at gmail.com> added the comment:

I mentioned this in a brief python-dev discussion earlier this 
spring, but many popular websites such as Wikipedia and Facebook do use 
UTF-8 as their character encoding scheme for the path and argument 
portions of URLs.

I know there's no RFC that says this is what should be done, but in 
order to make urllib work out-of-the-box on as many common websites as 
possible, I think defaulting to UTF-8 decoding makes a lot of sense. 

Possibly allow an optional charset argument to be passed into quote and 
unquote, but default to UTF-8 in the absence of an explicit character 
set being passed in?
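
To make the suggestion concrete, here is a minimal sketch of the behaviour 
I have in mind. It is written against urllib.parse as it exists in 
Python 3, where quote and unquote take an "encoding" keyword; that is an 
assumption of the sketch, not something the urllib under discussion 
offers today. The sample path is made up for illustration.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing a non-ASCII character.
path = "/wiki/Zürich"

# With no explicit character set, UTF-8 is used.
quoted_utf8 = quote(path)                         # '/wiki/Z%C3%BCrich'

# A caller who knows the site uses another charset passes it explicitly.
quoted_latin1 = quote(path, encoding="latin-1")   # '/wiki/Z%FCrich'

# Decoding round-trips as long as the same charset is used on both sides.
roundtrip_utf8 = unquote(quoted_utf8)
roundtrip_latin1 = unquote(quoted_latin1, encoding="latin-1")

# Decoding Latin-1 escapes with the UTF-8 default does not recover the
# original text, which is why the explicit argument is still useful.
mismatched = unquote(quoted_latin1)
```

The default covers sites like the ones mentioned above, and the explicit 
argument remains available for anything that uses a different charset.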

----------
nosy: +thomaspinckney3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue3300>
_______________________________________

