[Web-SIG] parsing of urlencoded data and Unicode
Manlio Perillo
manlio_perillo at libero.it
Tue Jul 29 08:06:17 CEST 2008
Bill Janssen wrote:
>> In wsgix I use utf-8 for decoding the QUERY_STRING, and the charset
>> specified in the POST'ed data (utf-8 or the charset found in the special
>> _charset_ field).
>
> That's probably wrong. We went through this recently on the
> python-dev list. While it's possible to tell the encoding of
> multipart/form-data,
With multipart/form-data the problem should be the same.
The content type is defined only for file fields.
> the query_string and x-www-form-urlencoded data
> may be in arbitrary character set encodings (see RFC 3986). It's
> probably best to not try to map them to strings; instead, return byte
> arrays for the value, and only return strings for data that can be
> correctly decoded. Otherwise, you lose information that the app
> cannot recover.
>
Interesting, thanks.
I have read the Django code and, as far as I can tell, it always decodes data
to strings, but using "replace" error handling.
Can you point me to the discussion on the python-dev list?
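
To make the difference concrete, here is a minimal sketch, assuming a current
Python 3 where urllib.parse provides parse_qsl() and unquote_to_bytes(). The
helper parse_qs_bytes() and the sample query string are hypothetical, written
only for illustration; this is not the wsgix or Django code.

```python
from urllib.parse import parse_qsl, unquote_to_bytes


def parse_qs_bytes(query):
    """Split a urlencoded query string into (name, value) pairs of bytes.

    Hypothetical helper: nothing is decoded to str, so the application
    can still choose the charset later.
    """
    pairs = []
    for field in query.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # In x-www-form-urlencoded data '+' stands for a space.
        pairs.append((unquote_to_bytes(name.replace("+", " ")),
                      unquote_to_bytes(value.replace("+", " "))))
    return pairs


# 'café' percent-encoded as ISO-8859-1 rather than UTF-8.
query = "name=caf%E9"

# Decoding as UTF-8 with "replace": the byte 0xE9 becomes U+FFFD and
# the original value cannot be recovered by the application.
lossy = parse_qsl(query, encoding="utf-8", errors="replace")

# Keeping bytes: the application can still decode with the right charset.
raw = parse_qs_bytes(query)
value = raw[0][1].decode("iso-8859-1")
```

With "replace" the application only ever sees the replacement character,
while the bytes version still carries the original 0xE9.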
> Bill
>
Manlio Perillo