[Web-SIG] WSGI for Python 3

Ian Bicking ianb at colorstudy.com
Wed Jul 14 06:43:39 CEST 2010


So... there's been some discussion of WSGI on Python 3 lately.  I'm not
feeling as pessimistic as some people; I feel like we were close but just
didn't *quite* get there.

Here are my thoughts:

* Everyone agrees keys in the environ should be native strings
* Bodies should stay bytes
* Can we make all "standard" values that are str on Python 2 also str on
Python 3, decoded as Latin1?  This is basically what wsgiref did.  This
means HTTP_*, SERVER_NAME, etc.  Everything CGIish, and everything with an
all-caps key.  There are only a couple of tricky keys: SCRIPT_NAME,
PATH_INFO, and HTTP_COOKIE.
* I propose we let libraries handle HTTP_COOKIE however they want; don't
bother transcoding *into* the environ, just do so when you parse the cookie
(if you so choose).  Happy developers will just urlencode all their cookie
values to keep their cookies ASCII-clean.  Unhappy developers who have to
handle legacy cookies will just run environ['HTTP_COOKIE'].decode('latin1')
and then do whatever sad magic they are forced to do.
* I (re)propose we eliminate SCRIPT_NAME and PATH_INFO and replace them
exclusively with encoded versions (that represent the original request
URI).  We use Latin1 encoding, but it should be ASCII anyway, like most of
the headers.
* I'm terrible at naming, but let's say these new values are RAW_SCRIPT_NAME
and RAW_PATH_INFO.
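
To make the Latin1 idea concrete, here is a minimal sketch of the round
trip, assuming a server that decodes raw header bytes as Latin1 before
putting them in the environ.  The helper names header_to_native and
native_to_bytes are hypothetical, not part of any spec.  On Python 3 the
environ value would already be a native string under this scheme, so
getting back to the raw bytes is an encode step, after which the library
picks whatever real encoding it wants:

```python
# Hypothetical helpers, for illustration only; not part of WSGI or wsgiref.
def header_to_native(raw: bytes) -> str:
    """Decode raw header bytes as Latin-1 (each byte maps to one code point)."""
    return raw.decode("latin-1")


def native_to_bytes(value: str) -> bytes:
    """Recover the original bytes from a Latin-1-decoded native string."""
    return value.encode("latin-1")


# Assumed input: a client that sent a UTF-8 cookie value.
raw_cookie = "name=caf\u00e9".encode("utf-8")

# What a server following the Latin-1 scheme would store in the environ.
environ = {"HTTP_COOKIE": header_to_native(raw_cookie)}

# The application or library chooses the real encoding at parse time.
cookie_text = native_to_bytes(environ["HTTP_COOKIE"]).decode("utf-8")
```

Because Latin1 maps every byte to exactly one code point, nothing is lost
in the environ; the value is just not human-readable until it is
re-decoded.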

Does this solve everything?  There's broken stuff in the stdlib, but we
shouldn't bother ourselves with that -- if we need working code we should
just write it and ignore the stdlib or submit our stuff as patches to the
stdlib.

Some environments will have a hard time constructing RAW_SCRIPT_NAME and
RAW_PATH_INFO, but in my opinion they can just encode SCRIPT_NAME and
PATH_INFO and be done with it; it's not as accurate, but it's no less
accurate than what we have now.
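
A rough sketch of that fallback, assuming SCRIPT_NAME and PATH_INFO are
already-unquoted native strings decoded as Latin1.  The function name
add_raw_path_keys is made up for illustration, and the RAW_* keys are only
the names proposed above, not an existing standard:

```python
from urllib.parse import quote


def add_raw_path_keys(environ):
    """Fill in RAW_SCRIPT_NAME / RAW_PATH_INFO by re-quoting the decoded values.

    Assumes the decoded values are Latin-1-decoded native strings.  This is
    lossy: an original "%2F" has already become "/" and cannot be restored.
    """
    for key in ("SCRIPT_NAME", "PATH_INFO"):
        decoded = environ.get(key, "")
        # Go back to bytes first so each original byte is percent-encoded.
        environ["RAW_" + key] = quote(decoded.encode("latin-1"), safe="/")
    return environ


environ = add_raw_path_keys({"SCRIPT_NAME": "/app", "PATH_INFO": "/caf\u00e9/a b"})
```

This is where the "not as accurate" part shows up: anything the server
unquoted before the environ was built, such as an encoded slash, is gone
by the time the value is re-encoded.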

Actual transcoding in the environ is not supported or encouraged in this
scheme.  If you want to adjust an encoding you should do it in your
application/library code.

There are some other topics, like chunked responses, unknown request body
lengths, start_response, and maybe some other things, but these aren't
Python 3 issues, they are just... generic issues.  app_iter.close() might be
worth thinking about given new iterator semantics introduced since WSGI was
written.
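
For reference, a small sketch of how close() interacts with a generator
used as app_iter.  The run_app_iter function is a hypothetical stand-in
for a server's response loop, not code from any real server.  Calling
close() on a suspended generator raises GeneratorExit inside it, so its
finally block runs even when the client goes away mid-response:

```python
def run_app_iter(app_iter, write):
    """Hypothetical server loop: send each chunk, then always call close()."""
    try:
        for chunk in app_iter:
            write(chunk)
    finally:
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()


events = []


def body():
    """A generator app_iter that cleans up in a finally block."""
    try:
        yield b"hello"
        yield b"world"
    finally:
        events.append("cleaned up")


chunks = []


def write_first_only(chunk):
    """Simulated write callable that fails after accepting one chunk."""
    chunks.append(chunk)
    raise ConnectionError("client went away")


try:
    run_app_iter(body(), write_first_only)
except ConnectionError:
    pass
```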

-- 
Ian Bicking  |  http://blog.ianbicking.org