[Python-Dev] urllib unicode handling

Jeroen Ruigrok van der Werven asmodai at in-nomine.org
Wed May 7 07:20:16 CEST 2008


-On [20080507 04:06], Tom Pinckney (thomaspinckney3 at gmail.com) wrote:
>While in theory UTF-8 is not a standard, sites like Last.fm, Facebook and 
>Wikipedia seem to have embraced it (as have pretty much all other major web 
>sites). As with HTML, there is what the standard says and what the actual 
>browsers have to accept in order to work in the real world.

I agree with you. The dictionary project I am working on (Dutch <> Japanese)
uses UTF-8 characters in its URLs, and things just worked with reasonably new
browsers (at least no problems with Opera 9, Firefox 2 and 3, Internet
Explorer 7 and Safari 3). Later, Armin Ronacher warned me that you still
have to URL-escape these characters in order to not be in lala-land.

Would people object if such functionality got added to urllib?
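
To make concrete what I mean by URL-escaping, here is a rough sketch. It
assumes Python 3's urllib.parse.quote; the helper name escape_utf8_path and
the example path are made up for illustration and are not taken from the
dictionary project.

```python
from urllib.parse import quote, unquote


def escape_utf8_path(path):
    """Percent-encode a Unicode URL path as UTF-8 bytes.

    Hypothetical helper for illustration: "/" is left unescaped so the
    path structure survives, everything non-ASCII becomes %XX sequences.
    """
    return quote(path, safe="/", encoding="utf-8")


# Made-up example path containing Japanese characters.
escaped = escape_utf8_path("/woord/日本語")
print(escaped)  # -> /woord/%E6%97%A5%E6%9C%AC%E8%AA%9E

# Decoding the percent-escapes as UTF-8 gives back the original path.
print(unquote(escaped, encoding="utf-8"))  # -> /woord/日本語
```

The escaped form is plain ASCII, so it does not depend on a browser or
server guessing the encoding of the raw characters.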

-- 
Jeroen Ruigrok van der Werven <asmodai(-at-)in-nomine.org> / asmodai
イェルーン ラウフロック ヴァン デル ウェルヴェン
http://www.in-nomine.org/ | http://www.rangaku.org/ | GPG: 2EAC625B
If Winter comes, can Spring be far behind..?

