[Web-SIG] WSGI, Python 3 and Unicode
ianb at colorstudy.com
Fri Dec 7 05:00:02 CET 2007
Phillip J. Eby wrote:
> At 08:08 PM 12/6/2007 -0500, Adam Atlas wrote:
>> On 6 Dec 2007, at 18:13, Graham Dumpleton wrote:
>>> In Python 3 the default for string type objects will effectively be
>>> Unicode. Is WSGI going to be made to somehow cope with that, or will
>>> applications instead be required to return byte string objects?
>> I'd say it would be best to only accept `bytes` objects; anything else
>> would require some guesswork. Maybe, at most, it could try to encode
>> returned Unicode objects as ISO-8859-1, and have it be an error if
>> that's not possible.
> Actually, I'd prefer to look at it the other way around: a Python 3
> WSGI server or middleware *may* accept bytes objects instead of str.
> This is relatively easy for the response side of things, but the
> request side is rather more difficult, since wsgi.input may need to
> be binary rather than text mode. (I think we can reasonably assume
> that wsgi.errors is a text mode stream, and should support a
> reasonable encoding.)
wsgi.input definitely seems like it should be bytes to me, unless we
want to put the decoding step into the server. That's not entirely
infeasible, but a bit of a strain. And the request body might very well
be binary, e.g., on a PUT.
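
A minimal sketch of the bytes-only view of wsgi.input, assuming the application (not the server) decides whether and how to decode. The helper names `read_body` and `decode_body` and the UTF-8 default are illustrative assumptions, not part of any WSGI spec:

```python
import io


def read_body(environ):
    """Return the raw request body from wsgi.input as bytes."""
    # CONTENT_LENGTH may be missing or empty; treat both as zero.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    return environ["wsgi.input"].read(length)


def decode_body(environ, default_charset="utf-8"):
    """Decode the body as text, using the charset from CONTENT_TYPE.

    default_charset is an assumed policy choice for requests that do
    not declare a charset.
    """
    raw = read_body(environ)
    charset = default_charset
    # Look for a charset parameter, e.g. "text/plain; charset=iso-8859-1".
    for param in environ.get("CONTENT_TYPE", "").split(";")[1:]:
        name, _, value = param.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value.strip('"')
    return raw.decode(charset)
```

A binary PUT would call only `read_body` and never touch the decoding step, which is the case that makes a text-mode wsgi.input awkward.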
The CGI keys in the environment don't feel at all like bytes to me, but
then they aren't unicode either. They can be unicode, again given a bit
of work on the server side. Though unfortunately browsers are very poor
at indicating their encoding for requests, and it ends up being policy
and configuration as much as anything that determines the encoding of
stuff like wsgi.input. I believe all request paths are UTF-8 (?), but
I'm not sure about QUERY_STRING. I'm a little fuzzy on some of the
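
To illustrate the "policy and configuration" point about the environment keys, here is a rough sketch of how a server might turn a percent-encoded request path into text. The helper name `decode_path` and the ISO-8859-1 fallback are assumptions for illustration only; `unquote_to_bytes` is the standard-library function from `urllib.parse`:

```python
from urllib.parse import unquote_to_bytes


def decode_path(raw_path, fallback="iso-8859-1"):
    """Percent-decode a request path and guess its text encoding.

    Tries UTF-8 first, then falls back to an assumed, configurable
    single-byte encoding when the bytes are not valid UTF-8.
    """
    data = unquote_to_bytes(raw_path)
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        return data.decode(fallback)
```

The fallback is exactly the sort of guesswork the client gives no help with: the same bytes could equally have been meant as some other legacy encoding.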
The actual response body should also be bytes, unless again we want to
introduce upstream encoding.
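
As a sketch of the two options for the response side: an application that encodes its own body, and a hypothetical server-side helper (`to_response_bytes`, not an existing API) that follows the suggestion quoted above of encoding str as ISO-8859-1 and treating failure as an error:

```python
def app(environ, start_response):
    """A WSGI application that does its own encoding and yields bytes."""
    body = "caf\u00e9".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def to_response_bytes(chunk, charset="iso-8859-1"):
    """Hypothetical lenient server helper for response chunks.

    bytes pass through unchanged; str is encoded with the assumed
    charset, and UnicodeEncodeError propagates if that is not possible.
    """
    if isinstance(chunk, bytes):
        return chunk
    if isinstance(chunk, str):
        return chunk.encode(charset)
    raise TypeError("response chunks must be bytes or str")
```

With the first form the server never has to guess; the second moves the encoding decision upstream, which is where the extra complexity comes from.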
This does make everything feel more complicated.
Ian Bicking : ianb at colorstudy.com : http://blog.ianbicking.org