[Baypiggies] urllib.urlencode and encoding

Tung Wai Yip tungwaiyip at yahoo.com
Thu Apr 19 20:14:54 CEST 2007

> On Apr 18, 2007, at 11:17 PM, Keith Dart wrote:
>> On Wed, 18 Apr 2007 21:15:34 -0700
>> David Reid <dreid at dreid.org> wrote:
>>> So I think it's still incorrect for urllib to make any such
>>> assumptions as to the data being UTF-8. (Though I hope it won't be in
>>> the future.)
>> The RFC, and the previous discussion, have nothing to do with the
>> content (data) encoding. It's only concerned with the URL encoding.
> The relevant section of the HTML4 forms spec is concerned with the
> URL encoding if the URL is generated by the browser as part of a form
> submission.  So I'm still gonna have to go with it being pretty much
> completely wrong for urllib to make any assumptions about the charset
> of %-encoded data (either in a url segment or in query args.)  Not
> that life wouldn't be much nicer if everything were UTF-8, but the
> world isn't that nice to begin with.
> -David
> http://dreid.org

Here is an example. The key parameter is Big5-encoded. Welcome to the
Tower of Babel!
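
To make the point concrete, here is a rough sketch of how the same text
%-encodes differently depending on the charset. It is not the URL I had in
mind above; the sample string and the "key" parameter name are made up for
illustration, and it assumes Python 3's urllib.parse rather than the
urllib.urlencode of the subject line.

```python
from urllib.parse import quote, unquote, urlencode

# Hypothetical sample value, chosen only for illustration.
text = "中文"

# The same characters become different bytes, hence different %-escapes.
big5_quoted = quote(text, encoding="big5")
utf8_quoted = quote(text, encoding="utf-8")
print(big5_quoted)
print(utf8_quoted)

# Decoding only round-trips if the receiver knows the sender's charset.
right = unquote(big5_quoted, encoding="big5")
wrong = unquote(big5_quoted, encoding="utf-8", errors="replace")
print(right == text, wrong == text)

# urlencode takes the same encoding argument when building a query string.
query = urlencode({"key": text}, encoding="big5")
print(query)
```

Nothing in the %-escapes themselves says which charset produced them, so
the receiver has to learn it some other way.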


Wai Yip
