[Web-SIG] WSGI, Python 3 and Unicode
Phillip J. Eby
pje at telecommunity.com
Fri Dec 7 20:55:47 CET 2007
So here are my recommendations so far for the addendum to WSGI *1.0*
for Python 3.0 (I expect we can be more strict for WSGI 2.0):
* When running under Python 3, applications SHOULD produce bytes
output and headers
* When running under Python 3, servers and gateways MUST accept
strings as application output or headers, under the existing rules
(i.e., s.encode('latin-1') must convert the string to bytes without
an exception)
* When running under Python 3, servers MUST provide CGI HTTP
variables as strings, decoded from the headers using HTTP standard
encodings (i.e., latin-1 + RFC 2047) (Open question: are there any
CGI or WSGI variables that should NOT be strings?)
* When running under Python 3, servers MUST make wsgi.input a binary
(byte) stream
* When running under Python 3, servers MUST provide a text stream for
wsgi.errors
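
To make the rules above concrete, here is a rough sketch of what the server side might look like. The names to_wire_bytes and build_environ, and the shape of raw_headers, are made up for illustration and are not part of WSGI. RFC 2047 decoding is left out entirely; only the latin-1 step is shown.

```python
import io
import sys


def to_wire_bytes(value):
    """Return value as bytes, accepting bytes or a latin-1-encodable str."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        # Raises UnicodeEncodeError for characters outside latin-1,
        # i.e. for strings that break the existing rule.
        return value.encode('latin-1')
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


def build_environ(raw_headers, body):
    """Build a partial environ (hypothetical helper).

    raw_headers is assumed to be a dict mapping bytes header names to
    bytes header values; body is the raw request body as bytes.
    """
    environ = {
        'wsgi.input': io.BytesIO(body),   # binary (byte) stream
        'wsgi.errors': sys.stderr,        # text stream
    }
    for name, value in raw_headers.items():
        key = 'HTTP_' + name.decode('latin-1').upper().replace('-', '_')
        # Header values become str, decoded as latin-1 (no RFC 2047 here).
        environ[key] = value.decode('latin-1')
    return environ
```

A server would then run every header name, header value, and body chunk it gets from the application through something like to_wire_bytes before writing to the socket.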
These rules are intended to simplify the porting of existing
code. Notice, for example, that these rules allow middleware to pass
strings through unchanged, since they are not required to produce
bytes output or headers.
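
As an illustration of the pass-through case, here is a minimal middleware sketch. The name add_header_middleware is hypothetical. It never inspects whether the headers or body chunks are str or bytes, so it should behave the same under either.

```python
def add_header_middleware(app, name, value):
    """Wrap app so that one extra header is appended to every response."""
    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Existing headers are forwarded as-is, str or bytes.
            new_headers = list(headers) + [(name, value)]
            return start_response(status, new_headers, exc_info)
        # Body chunks are returned unchanged, whatever their type.
        return app(environ, patched_start_response)
    return wrapped
```

The conversion to bytes, if any is needed, is left to the server at the very end of the chain.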
Unfortunately, wsgi.input can't be coded around, but for most
frameworks this should be a single point of pain. In fact, if the
'cgi' stdlib module is made compatible with bytes, only the rare
framework that rolls its own multipart parser or otherwise directly
manipulates PUT/POST data will be affected. Code that just takes the
input and writes it to a file won't be bothered, either.
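
A small sketch of the two cases, assuming CONTENT_LENGTH is present and correct. The function names and the utf-8 default are my own assumptions for the example, not anything the spec mandates.

```python
def save_body(environ, out_file, chunk_size=8192):
    """Copy the request body to a binary file object without decoding it."""
    remaining = int(environ.get('CONTENT_LENGTH') or 0)
    stream = environ['wsgi.input']
    while remaining > 0:
        chunk = stream.read(min(remaining, chunk_size))
        if not chunk:
            break  # client sent less than it promised
        out_file.write(chunk)
        remaining -= len(chunk)


def read_text_body(environ, charset='utf-8'):
    """Read the body and decode it explicitly (the charset is an assumption)."""
    length = int(environ.get('CONTENT_LENGTH') or 0)
    return environ['wsgi.input'].read(length).decode(charset)
```

save_body is the "won't be bothered" case, since bytes go in and bytes come out. read_text_body is where a framework would now have to pick an encoding on purpose.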
Comments or questions?