[Python-3000] urllib.quote/unquote behavior?

Oleg Broytmann phd at phd.pp.ru
Fri May 30 14:42:30 CEST 2008


On Fri, May 30, 2008 at 02:19:23PM +0200, Georg Brandl wrote:
> Python 3.0's urllib.quote() and unquote() handle non-ASCII data strangely.
> quote() encodes characters with codepoint < 256 using latin-1, but others
> using utf-8. unquote() decodes everything using latin-1.
> 
> Is the correct behavior to always use utf-8?

   Always UTF-8: encode the text as UTF-8 first, then percent-encode
each byte that is not an unreserved character. See
http://en.wikipedia.org/wiki/Percent-encoding#Current_standard
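
   To illustrate the UTF-8 rule, here is a minimal sketch. It assumes the
layout of later Python 3 releases, where these functions live in
urllib.parse and default to UTF-8; that is not the 3.0 pre-release
behavior described in the quoted message. The sample string is made up
for illustration.

```python
from urllib.parse import quote, unquote

# Illustrative sample: U+00E9 (e-acute, codepoint < 256) and
# U+20AC (euro sign, codepoint >= 256).
text = "caf\u00e9 \u20ac"

# quote() encodes the text as UTF-8 by default, then percent-encodes
# every byte that is not in its safe set.
encoded = quote(text)
print(encoded)  # caf%C3%A9%20%E2%82%AC

# unquote() decodes the percent-encoded bytes as UTF-8 by default,
# so the round trip is lossless.
decoded = unquote(encoded)
print(decoded == text)  # True

# For contrast: Latin-1 encodes U+00E9 as the single byte 0xE9.
latin1_encoded = quote("\u00e9", encoding="latin-1")
print(latin1_encoded)  # %E9

# Decoding that byte as UTF-8 fails; with the default errors="replace"
# the result is U+FFFD, the replacement character.
print(unquote(latin1_encoded) == "\ufffd")  # True

# Naming the encoding explicitly recovers the original character.
print(unquote(latin1_encoded, encoding="latin-1") == "\u00e9")  # True
```

   Both codepoints go through the same UTF-8 path, so there is no split
at 256, and quote() and unquote() are inverses of each other.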

Oleg.
-- 
     Oleg Broytmann            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.