[Web-SIG] Request for Comments on upcoming WSGI Changes

Sun Sep 20 16:50:36 CEST 2009

Hi,

P.J. Eby wrote:
> This discussion has been going on for so long that I've already 
> forgotten what the problem was with just using the original 1.0 spec 
> for 3.X, i.e., using native strings for everything, using latin-1 
> encoding.  The only things I can recall off the top of my head are 
> that the input stream would still be bytes, and that the environment 
> might've used a different encoding.
Django, Pylons, SQLAlchemy, Mako, Jinja2, Genshi, Werkzeug, WebOb and
many more technologies are based on unicode, even in Python 2.x.  They
currently decode the byte data internally.

If we stick to native strings for WSGI 2.0 / 1.5 / whatever, we suddenly
have different code paths for Python 2 and Python 3: in Python 2.x the
data arrives as bytes, while in Python 3 it already arrives as unicode.
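
A minimal sketch of what those two code paths could look like inside a
unicode-based framework.  The helper name path_info_as_text and the
UTF-8 default are assumptions made for illustration; they are not taken
from any WSGI draft or from the libraries listed above.

```python
def path_info_as_text(environ, charset="utf-8"):
    """Return PATH_INFO as text, whichever string type the server used.

    Hypothetical helper: the name and the UTF-8 default are assumptions.
    """
    raw = environ.get("PATH_INFO", "")
    if isinstance(raw, bytes):
        # Byte-based environ (the Python 2.x situation): decode once.
        return raw.decode(charset, "replace")
    # Native-string environ on Python 3, assumed to be latin-1 decoded:
    # recover the original bytes first, then decode with the real charset.
    return raw.encode("latin-1").decode(charset, "replace")
```

Both branches yield the same text, but the application has to know which
representation it was handed before it can decode correctly.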

You're assuming a situation where the application in Python 2.x was
byte-based, but in the majority of cases that is not the situation.

> IMO, this strongly suggests that it's the stdlib or Python 3 that's 
> broken here.  How much of the stdlib are we talking about needing to 
> reimplement, aside from cgi.FieldStorage?
I'm already creating a patch for urllib, which currently requires
unicode.  I'm not sure what to do about cgi.FieldStorage; in general I
would not recommend using the cgi module for WSGI applications at all!
If we went with bytes for the WSGI 1.0 spec on Python 3, a WSGI server
would also have to decode that data from the server again.
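
For illustration, a small sketch of percent-decoding on bytes with the
Python 3 standard library.  It uses urllib.parse.unquote_to_bytes; the
wrapper name unquote_path and the UTF-8 default are assumptions, and
this is not the patch mentioned above.

```python
from urllib.parse import unquote_to_bytes


def unquote_path(raw_path, charset="utf-8"):
    """Percent-decode a byte path, then decode it to text.

    Hypothetical wrapper: the name and the UTF-8 default are assumptions.
    """
    # Work on bytes first so the charset decision happens exactly once,
    # after the %XX escapes have been resolved.
    return unquote_to_bytes(raw_path).decode(charset, "replace")
```

Resolving the escapes on bytes avoids guessing an encoding before the
application has had a chance to choose one.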

Also (something I haven't yet filed as a bug because I guess there will
be more changes involved) the HTTP server in Python 3.1 does not support
non-ASCII headers.

Regards,
Armin