[Web-SIG] Request for Comments on upcoming WSGI Changes

Sun Sep 20 16:43:52 CEST 2009

At 03:06 PM 9/20/2009 +0200, Armin Ronacher wrote:
>Hello everybody,
>
>Thanks to Graham Dumpleton and Robert Brewer there is some serious
>progress on WSGI currently.  I have now proposed a roadmap with some PEP
>changes that needs some input.
>
>Summary:
>
>   WSGI 1.0       stays the same as PEP 0333 currently is
>   WSGI 1.1       becomes what Ian and I added to PEP 0333
>   WSGI 2.0       becomes a unicode powered version of WSGI 1.1
>   WSGI 3.0       becomes WSGI 2.0 just without start_response

Since there's already a well-established notion of WSGI 2.0 being the 
new calling convention, I would suggest (to avoid confusion) renaming 
your "2.0" to "1.2" or "1.5" or something instead.

>   WSGI 1.0 and 1.1 are byte based and nearly impossible to use on Python
>   3 because of changes in the standard library that no longer work with
>   a byte-only approach.

This is unfortunate, but it should probably be considered a 
bellwether for Python 3 porting in general, alas.  The Python 3 
stdlib *should* work with bytes, and the fact that it does not should 
be treated as a bug in the stdlib rather than something to be worked 
around in WSGI.

>Graham wrote down two questions he wants every major framework developer
>to answer.  These should guide the way to new WSGI standards:
>
>1. Do we keep bytes everywhere forever in Python 2.X, or try to
>    introduce unicode there at all to at least mirror what changes might
>    be made to make WSGI workable in Python 3.X?

Technically, we are not using bytes but "native" strings, i.e. type 
'str'.  What benefit would introducing unicode produce?

>2. Do we skip WSGI 1.X completely for Python 3.X and go straight to
>    WSGI 2.0 for Python 3.X?

This discussion has been going on for so long that I've already 
forgotten what the problem was with just using the original 1.0 spec 
for 3.X, i.e., using native strings for everything, using latin-1 
encoding.  The only things I can recall off the top of my head are 
that the input stream would still be bytes, and that the environment 
might've used a different encoding.
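
As a rough illustration of the latin-1 idea (a sketch only; the helper 
names to_native, from_native and recode are made up for this example 
and are not part of any spec), bytes from the wire can be carried 
losslessly in a native str on Python 3 and re-decoded later by whoever 
knows the real charset:

```python
def to_native(raw: bytes) -> str:
    # latin-1 maps every byte 0-255 to the code point with the same number,
    # so this decode can never fail and never loses information.
    return raw.decode("latin-1")


def from_native(value: str) -> bytes:
    # Exact inverse of to_native() for any string it produced.
    return value.encode("latin-1")


def recode(value: str, charset: str = "utf-8") -> str:
    # An application that knows the real charset can recover proper text.
    return value.encode("latin-1").decode(charset)


raw_path = "/caf\u00e9".encode("utf-8")   # bytes as they might arrive on the wire
native_path = to_native(raw_path)         # a str, but not yet "real" text

assert from_native(native_path) == raw_path
assert recode(native_path) == "/caf\u00e9"
```

The input stream is not covered by this trick; it would still have to 
be read as bytes.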

I don't know if such an approach should actually be *recommended*, 
but having a migration path for WSGI 1.0 -> Python 3.X sounds like a 
good idea, if it can be done strictly as errata/clarification of the 
existing spec.  Otherwise, we might as well forget the whole thing and 
go straight to the latest and greatest (i.e., what has previously been 
called 2.0 and you're calling 3.0).
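
A minimal sketch of the two calling conventions under discussion, 
assuming the newer one has the application return a (status, headers, 
body) tuple instead of calling start_response.  The names app_v1, 
app_tuple and adapt_to_v1 are illustrative only, and the adapter 
ignores the write() callable and exc_info handling of the 1.0 spec:

```python
def app_v1(environ, start_response):
    # WSGI 1.0 style: status and headers go through the start_response callback.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


def app_tuple(environ):
    # Assumed newer style: everything is returned, no callback involved.
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]


def adapt_to_v1(tuple_app):
    # Wrap a tuple-returning app so a WSGI 1.0 server can call it.
    def wrapper(environ, start_response):
        status, headers, body = tuple_app(environ)
        start_response(status, headers)
        return body
    return wrapper


def call_v1(app, environ):
    # Tiny stand-in for a server: records what start_response received.
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"] = status
        seen["headers"] = headers

    body = b"".join(app(environ, start_response))
    return seen["status"], seen["headers"], body


# Both apps look identical from the server's side once adapted.
assert call_v1(app_v1, {}) == call_v1(adapt_to_v1(app_tuple), {})
```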

>I added a new question I think should be asked too:
>
>3. Do we skip WSGI 2.0 as specified in the PEP and go straight to
>    WSGI 3.0 and drop start_response?

I suggest skipping straight to the latest and greatest with no 
in-betweens at all, other than errata/clarifications on 1.0.  Having 
lots of variations of a "standard" is a bug, not a feature!

>The following things became pretty clear when playing around with
>various specifications on Python 3:
>
>-  Python 3 no longer implicitly converts between unicode and byte
>    strings.  This covers comparisons, the regular expression engine,
>    all string functions and many modules in the stdlib.
>-  The Python 3 stdlib radically moved to unicode for non-unicode things
>    as well (the http servers, http clients, url handling etc.)
>
>-  A byte only version of WSGI appears unrealistic on Python 3 because
>    it would require server and middleware implementors to reimplement
>    parts of the standard library to work on bytes again.

IMO, this strongly suggests that it's the stdlib or Python 3 that's 
broken here.  How much of the stdlib are we talking about needing to 
reimplement, aside from cgi.FieldStorage?
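
The behaviour change in the quoted list is easy to reproduce on 
Python 3.  A small sketch (the function name mixing_errors is 
hypothetical) showing that mixing str and bytes fails, while the same 
string methods and the re module still work when given bytes only:

```python
import re


def mixing_errors():
    """Return the names of operations that raise TypeError on mixed str/bytes."""
    attempts = {
        "concat": lambda: b"GET" + " /",
        "startswith": lambda: b"GET /".startswith("GET"),
        "regex": lambda: re.match(r"GET", b"GET /"),
    }
    failures = []
    for name, attempt in attempts.items():
        try:
            attempt()
        except TypeError:
            failures.append(name)
    return failures


# Comparison does not raise; bytes and str are simply never equal.
assert not (b"GET" == "GET")

# Bytes-only use of the same features still works.
assert b"GET /".startswith(b"GET")
assert re.match(rb"GET", b"GET /") is not None

print(mixing_errors())
```

This says nothing about how much of the higher-level stdlib (http, url 
handling) accepts bytes; it only shows the low-level rules.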