[Baypiggies] urllib.urlencode and encoding

David Reid dreid at dreid.org
Thu Apr 19 18:14:49 CEST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On Apr 18, 2007, at 11:17 PM, Keith Dart wrote:

> On Wed, 18 Apr 2007 21:15:34 -0700
> David Reid <dreid at dreid.org> wrote:
>
>> So I think it's still incorrect for urllib to make any such
>> assumptions as to the data being UTF-8. (Though I hope it won't be in
>> the future.)
>
> The RFC, and the previous discussion, have nothing to do with the
> content (data) encoding. It's only concerned with the URL encoding.

The relevant section of the HTML4 forms spec is concerned with the
URL encoding if the URL is generated by the browser as part of a form
submission.  So I'm still gonna have to go with it being pretty much
completely wrong for urllib to make any assumptions about the charset
of %-encoded data (either in a URL segment or in query args).  Not
that life wouldn't be much nicer if everything were UTF-8, but the
world isn't that nice to begin with.
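
To make the point concrete, here is a minimal sketch of why the charset
can't be recovered from the %-escapes alone.  It assumes Python 3's
urllib.parse (not the urllib.urlencode of this thread's era), and the
sample string is made up for illustration:

```python
from urllib.parse import quote, unquote, unquote_to_bytes

text = "caf\u00e9"  # hypothetical form value containing one non-ASCII character

# The same text yields different %-escapes depending on the charset the
# sender happened to use.
as_utf8 = quote(text, encoding="utf-8")
as_latin1 = quote(text, encoding="latin-1")

# Decoding to bytes makes no charset assumption at all.
raw = unquote_to_bytes(as_latin1)

# Decoding to text with the wrong charset damages the data; unquote()
# defaults to UTF-8 with errors="replace".
guessed = unquote(as_latin1)
correct = unquote(as_latin1, encoding="latin-1")

print(as_utf8, as_latin1, raw, repr(guessed), correct)
```

Nothing in the escaped string says which charset produced it, so the
only safe default for a library is bytes in, bytes out, with the caller
supplying the charset when it is actually known.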

- -David
http://dreid.org

"Usually the protocol is this: I appoint someone for a task,
which they are not qualified to do.  Then, they have to fight
a bear if they don't want to do it." -- Glyph Lefkowitz




-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Darwin)

iD8DBQFGJ5PvrsrO6aeULcgRAouTAJ49/rpNFGIxA7rJdR/h8ItKCmszkgCggSua
eXILt7KtfK6+MAEVZRT5Hjs=
=7SSs
-----END PGP SIGNATURE-----

