[Web-SIG] WSGI Open Space @ PyCon.
mark.mchristensen at gmail.com
Sat Mar 28 13:18:44 CET 2009
On Sat, Mar 28, 2009 at 2:53 AM, Graham Dumpleton
<graham.dumpleton at gmail.com> wrote:
> 2009/3/28 Mark Ramm <mark.mchristensen at gmail.com>:
>> My thought is that we should do a couple things to the wsgi standard,
>> and then if anything like the lifecycle methods gets addressed, it should
>> be pushed into a "container" standard or something.
>> I think Robert Brewer's WSGI Service Bus proposal that he made a
>> couple years ago at PyCon needs a new name, but it does provide a good
>> start on the lifecycle stuff.
> From memory, my concern over that specification was that it sort of
> assumed that applications were all preloaded. I am not sure how well
> it would work where lazy loading is performed and where there are
> multiple WSGI applications running in an interpreter but where they
> weren't themselves mounted within a WSGI application, but through
> external mechanisms dictated by the WSGI hosting mechanism.
>> As for WSGI itself, we should make a couple of smaller changes which I
>> think will likely be a bit easier to quantify and agree on. I'm sure
>> lots more folks from yesterday's discussion will chip in here, but
>> this is my take on the things we discussed.
>> 1) We should drop the start_response callable, and return a three
>> member tuple from the wsgi callable:
>>   def wsgi2app(environ):
>>       return (status_code, headers, response_iterator)
>> 2) We should turn wsgi.input into an iterator rather than a somewhat
>> file-like object. WSGI middleware that reads part of the wsgi.input
>> iterator should make sure to restore it using itertools.chain or
>> replace it with whatever. If there's a content length specified from
>> the server the middleware should be responsible for maintaining or
>> deleting that information as necessary. Content length of 0 is
>> allowed and means there's no data, whereas an unspecified content
>> length indicates that the value is unknown. This will create a good
>> symmetry between the input and output methods, and seems like a good
>> compromise between flexibility for middleware creators and ease of
>> use for consumers.
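
A minimal sketch of the "restore it using itertools.chain" idea from item 2. It assumes environ["wsgi.input"] is an iterator of byte strings, which is the proposal here and not WSGI 1.0 behaviour; the helper name peek_input and the sample data are made up for illustration.

```python
import itertools


def peek_input(environ, n_chunks=1):
    """Read up to n_chunks from the input iterator, then put them back.

    Assumes environ["wsgi.input"] is an iterator of byte strings, as
    proposed in item 2 (hypothetical; WSGI 1.0 uses a file-like object).
    """
    source = iter(environ["wsgi.input"])
    consumed = list(itertools.islice(source, n_chunks))
    # Restore the stream: the chunks already read, followed by the rest.
    environ["wsgi.input"] = itertools.chain(consumed, source)
    return consumed


environ = {"wsgi.input": iter([b"name=", b"web-sig", b"&x=1"])}
head = peek_input(environ)              # middleware looks at the first chunk
body = b"".join(environ["wsgi.input"])  # the application still sees everything
```

Note that the sketch leaves CONTENT_LENGTH alone because it does not change the data; middleware that rewrites the body would have to update or delete it, as the item says.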
> The problem with an iterator/generator is how do you control the size
> of the chunks of data returned. An iterator also probably isn't going
> to make chunked request content any easier to handle.
> It may be easier to change how people use the wsgi.input that exists
> now. First off allow one to say:
>   s = wsgi.input.read()
> to get all input, rather than passing CONTENT_LENGTH as argument.
> To consume all data in chunks until exhausted, require a proper EOF
> indicator in the form of an empty string read; then one can say:
>   s = wsgi.input.read(BLOCKSIZE)
>   while s:
>     # do something with 's'
>     s = wsgi.input.read(BLOCKSIZE)
> That way you don't have to muck around with checking how much you have read.
> This does require that an exception be raised if the client closes
> the connection before all expected data was read.
> The question thus is, what would be the actual benefits of changing to
> an iterator/generator?
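
The read-until-empty loop above can be tried out with a runnable sketch, using io.BytesIO as a stand-in for wsgi.input. The DisconnectingInput class is hypothetical and only simulates a client going away; raising IOError follows the suggestion made later in this thread, not any existing requirement.

```python
import io

BLOCKSIZE = 4  # deliberately tiny so the loop runs several times


def read_all_chunks(stream, blocksize=BLOCKSIZE):
    """Collect chunks until read() returns an empty string (EOF)."""
    chunks = []
    s = stream.read(blocksize)
    while s:
        chunks.append(s)
        s = stream.read(blocksize)
    return chunks


class DisconnectingInput:
    """Hypothetical input stream whose client goes away part way through."""

    def __init__(self, data, fail_after):
        self._stream = io.BytesIO(data)
        self._fail_after = fail_after

    def read(self, size=-1):
        if self._stream.tell() >= self._fail_after:
            raise IOError("client closed connection before all data was read")
        return self._stream.read(size)


chunks = read_all_chunks(io.BytesIO(b"0123456789"))

try:
    read_all_chunks(DisconnectingInput(b"0123456789", fail_after=8))
    disconnected = False
except IOError:
    disconnected = True
```

The caller never compares a running total against CONTENT_LENGTH; a short request shows up as an exception instead of a silent early EOF.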
>> 3) The server should encode the headers and include explicit
>> information about the encoding in the wsgi environ variable. So that
>> any assumptions about what the bytes in the headers represent are made
>> explicit.
> That could be fun. For Apache/mod_wsgi at least you are in control of
> the conversion. In Python 3.0 and CGI/WSGI the os.environ variables
> are already unicode strings because they were converted by Python. How
> this is done varies between UNIX and Windows platforms.
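
One possible reading of item 3, as a rough sketch only: the server decodes header bytes with a known encoding and records which one it used, so an application can recover the original bytes and decode them differently. The environ key "x.header_encoding" and both function names are invented for this example and are not part of any spec.

```python
def build_environ(raw_headers, encoding="latin-1"):
    """Decode raw header bytes and record which encoding was used.

    "x.header_encoding" is a hypothetical key invented for this sketch.
    """
    environ = {"x.header_encoding": encoding}
    for name, value in raw_headers:
        key = "HTTP_" + name.decode("ascii").upper().replace("-", "_")
        environ[key] = value.decode(encoding)
    return environ


def header_as(environ, key, wanted_encoding):
    """Recover the original bytes of one header and decode them again."""
    raw = environ[key].encode(environ["x.header_encoding"])
    return raw.decode(wanted_encoding)


raw = [(b"X-Name", "café".encode("utf-8"))]
environ = build_environ(raw)
name = header_as(environ, "HTTP_X_NAME", "utf-8")
```

The round trip only works because latin-1 maps every byte to a code point; with a lossy server-side decoding the original bytes could not be recovered.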
>> I think we're all very sold on item 1, and items 2 and 3 require more
>> thinking, but seemed reasonable to those present at the discussion
>> this afternoon. Hopefully we'll be meeting again on Saturday and
>> will be able to continue to think through this stuff and push this all
>> forward some more.
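
Item 1 could look roughly like the sketch below: an application that returns the three-member tuple, plus a small adapter so a WSGI 1.0 server can still call it. The adapter name as_wsgi1 is made up here, and the status is written as a full "200 OK" string purely as an assumption; the thread only says status_code.

```python
def wsgi2app(environ):
    """Proposed style: no start_response, just return a three-member tuple."""
    body = b"Hello, Web-SIG\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return ("200 OK", headers, [body])


def as_wsgi1(app2):
    """Wrap a tuple-returning app so a WSGI 1.0 server can call it."""
    def wsgi1app(environ, start_response):
        status, headers, response_iterator = app2(environ)
        start_response(status, headers)
        return response_iterator
    return wsgi1app


# Drive the adapted app with a minimal fake start_response.
recorded = {}


def start_response(status, headers, exc_info=None):
    recorded["status"] = status
    recorded["headers"] = headers


result = b"".join(as_wsgi1(wsgi2app)({}, start_response))
```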
>> I'm sure there will also be several other minor tweaks to the spec, like:
> Yeah, like defining how wsgi.file_wrapper should behave where response
> Content-Length is defined but wrapped file actually provides more
> content than that.
>> * Not decoding encoded slashes in path strings, so that
>> applications can tell the difference between path separators and
>> encoded slashes.
> When sitting on top of Apache, whether it be mod_wsgi, fastcgi, scgi,
> ajp or CGI, you don't really have much choice, you get what Apache
> gives you.
Which is fine, I guess, but it does make it impossible to tell the
difference between real slashes and encoded ones in WSGI application
code. I would love it if there were some way around that.
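
A small illustration of why this matters, assuming the application could somehow get at the raw, still-encoded path (which, as noted above, is not what Apache hands over). The path value is made up.

```python
from urllib.parse import unquote

raw_path = "/files/a%2Fb/report"  # hypothetical undecoded request path

# Decoding first loses the distinction between "/" and "%2F".
decoded_then_split = unquote(raw_path).split("/")

# Splitting on real separators first, then decoding each segment, keeps it.
split_then_decoded = [unquote(seg) for seg in raw_path.split("/")]
```

In the first result "a" and "b" come out as two separate path segments; in the second, "a/b" stays a single segment.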
>> * adding a "ClientWentAway" exception that indicates that wsgi.input
>> has not been officially exhausted, but that the client went away before
>> wsgi.input was fully populated.
> The problem with an exception is what namespace do you put it in. You
> almost need to have the type as part of the WSGI environment. You may
> just be better standardising it by saying that an IOError must be
> raised and leave it at that. At the moment most stuff doesn't even pay
> attention to the fact that an exception could occur for some WSGI
That would be fine with me. The issue is definitely not the
exception name, but the fact that one can be raised/caught in a