[Web-SIG] parsing of urlencoded data and Unicode

Bill Janssen janssen at parc.com
Tue Jul 29 03:25:54 CEST 2008


> The first parses the query string and returns a dictionary of strings; the
> latter parses the application/x-www-form-urlencoded client body and
> returns a dictionary of strings and the charset used by the client for
> the Unicode encoding.

> Now I'm wondering whether these two functions should return Unicode
> strings instead of plain strings.

I'd say, yes.  I do this in my framework, which also decodes query
strings and post bodies (and handles multipart/form-data as well as
x-www-form-urlencoded).  Note that while x-www-form-urlencoded is
generally restricted to ASCII values by the HTML 4.01 spec,
multipart/form-data can contain arbitrary Unicode strings.

In Python 3.x, strings are all Unicode.
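
As a rough illustration only (this is not the code from my framework), here
is a minimal Python 3 sketch of decoding a query string into Unicode
strings with the standard library's urllib.parse.parse_qs. The helper name
parse_query_string and the UTF-8 default charset are assumptions made for
the example; a real client may use a different charset.

```python
from urllib.parse import parse_qs


def parse_query_string(raw, charset="utf-8"):
    """Decode a percent-encoded query string into a dict of Unicode strings.

    Hypothetical helper for illustration; the charset default is an assumption.
    """
    # parse_qs percent-decodes each field using the given charset and
    # returns a list of values per key; undecodable bytes become U+FFFD.
    fields = parse_qs(raw, keep_blank_values=True,
                      encoding=charset, errors="replace")
    # Keep only the first value for each key, giving a plain dict of strings.
    return {name: values[0] for name, values in fields.items()}


result = parse_query_string("name=J%C3%BCrgen&city=K%C3%B6ln&empty=")
latin1_result = parse_query_string("name=J%FCrgen", charset="latin-1")
```

Keeping only the first value per key matches the "dictionary of strings"
shape discussed above; a form with repeated field names would need the
full list that parse_qs returns.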

Bill

