[Web-SIG] WSGI Open Space @ PyCon.

Graham Dumpleton graham.dumpleton at gmail.com
Sat Mar 28 07:53:25 CET 2009


2009/3/28 Mark Ramm <mark.mchristensen at gmail.com>:
> My thought is that we should do a couple things to the wsgi standard,
> and then if anything like the lifecycle methods gets addressed, it should
> be pushed into a "container" standard or something.
>
> I think Robert Brewer's WSGI Service Bus proposal that he made a
> couple years ago at PyCon needs a new name, but it does provide a good
> start on the lifecycle stuff.

From memory, my concern over that specification was that it sort of
assumed that applications were all preloaded. I am not sure how well
it would work where lazy loading is performed and where there are
multiple WSGI applications running in an interpreter but where they
weren't themselves mounted within a WSGI application, but through
external mechanisms dictated by the WSGI hosting mechanism.

> As for WSGI itself, we should make a couple of smaller changes which I
> think will likely be a bit easier to quantify and agree on. I'm sure
> lots more folks from yesterday's discussion will chip in here, but
> this is my take on the things we discussed.
>
> 1) We should drop the start_response callable, and return a three
> member tuple from the wsgi callable:
>
>   def wsgi2app(environ):
>         ....
>         return (status_code, headers, response_iterator)
>
> 2) We should turn wsgi.input into an iterator rather than a somewhat
> file-like object.   WSGI middleware that reads part of the wsgi.input
> iterator should make sure to restore it using itertools.chain or
> replace it with whatever.  If there's a content length specified from
> the server the middleware should be responsible for maintaining or
> deleting that information as necessary.   Content length of 0 is
> allowed and means there's no data, whereas an unspecified content
> length indicates that the value is unknown.  This will create a good
> symmetry between the input and output methods, and seems like a good
> compromise between flexibility for middleware creators, and ease of
> use for consumers.
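
For reference, item 1 could look something like the following. This is
only a rough sketch of the idea as described: the status string, the
header values and the as_wsgi1() adapter name are made up for
illustration and are not part of any agreed specification.

```python
def wsgi2app(environ):
    # Tuple-returning style from item 1: no start_response argument.
    body = b"Hello world\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return ("200 OK", headers, [body])


def as_wsgi1(app2):
    """Hypothetical adapter: host a tuple-returning app on a WSGI 1.0 server."""
    def app1(environ, start_response):
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body
    return app1
```

An adapter this small suggests such applications could still be hosted
on existing WSGI 1.0 servers during a transition.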

The problem with an iterator/generator is how do you control the size
of the chunks of data returned. An iterator also probably isn't going
to make chunked request content any easier to handle.
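
To make the chunk size concern concrete, here is a minimal sketch,
assuming wsgi.input were a plain iterator of byte strings as in item 2.
The helper names peek_first_chunk() and rechunk() are invented for this
example. The producer decides how big each chunk is, so a consumer that
wants fixed size blocks has to regroup them itself.

```python
import itertools


def peek_first_chunk(environ):
    """Hypothetical middleware helper: inspect the first chunk, then restore it."""
    stream = iter(environ["wsgi.input"])
    first = next(stream, b"")
    # Put the consumed chunk back in front of the remaining ones.
    environ["wsgi.input"] = itertools.chain([first], stream)
    return first


def rechunk(chunks, size):
    """Regroup arbitrary chunks into blocks of `size` bytes (last may be shorter)."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        while len(buf) >= size:
            yield buf[:size]
            buf = buf[size:]
    if buf:
        yield buf


environ = {"wsgi.input": iter([b"ab", b"cdefg", b"h"])}
first = peek_first_chunk(environ)
blocks = list(rechunk(environ["wsgi.input"], 3))
```

With read(BLOCKSIZE) on a file-like object the consumer picks the size
directly and no regrouping step is needed.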

It may be easier to change how people use the wsgi.input that exists
now. First off allow one to say:

  wsgi.input.read()

to get all input, rather than passing CONTENT_LENGTH as argument.

To consume all data in chunks until exhausted, require a proper EOF
indicator in the form of an empty string read. One can then say:

  s = wsgi.input.read(BLOCKSIZE)
  while s:
    # do something with 's'
    s = wsgi.input.read(BLOCKSIZE)

That way you don't have to muck around with checking how much you have read.
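
As a runnable version of the same loop, here is a small sketch. It
assumes the input stream returns an empty string at end of input, which
is the behaviour being proposed and not something every WSGI adapter
guarantees today. io.BytesIO stands in for wsgi.input, and the
BLOCKSIZE value and read_chunks() name are arbitrary choices for the
example.

```python
import io

BLOCKSIZE = 8192  # arbitrary block size chosen for this sketch


def read_chunks(stream, blocksize=BLOCKSIZE):
    """Yield successive blocks until read() returns an empty string (EOF)."""
    s = stream.read(blocksize)
    while s:
        yield s
        s = stream.read(blocksize)


# io.BytesIO stands in for the server supplied wsgi.input object.
environ = {"wsgi.input": io.BytesIO(b"x" * 20000)}
sizes = [len(chunk) for chunk in read_chunks(environ["wsgi.input"])]
```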

This does require that an exception be raised if client closes
connection before all data expected was read.

The question thus is, what would be the actual benefits of changing to
an iterator/generator.

> 3) The server should encode the headers and include explicit
> information about the encoding in the wsgi environ variable.  So that
> any assumptions about what the bytes in the headers represent is made
> explicit.

That could be fun. For Apache/mod_wsgi at least you are in control of
the conversion. In Python 3.0 and CGI/WSGI the os.environ variables
are already unicode strings because they were converted by Python. How
this is done varies between UNIX and Windows platforms.
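
To illustrate why knowing the encoding matters, here is a sketch that
assumes the server decoded the raw bytes as ISO-8859-1 and advertised
that fact. The "wsgi.header_encoding" key and the recode() helper are
purely hypothetical names for this example; nothing like them is
defined anywhere. If the application knows which decoding was applied,
it can get back to the original bytes and decode them as it sees fit.

```python
def recode(value, declared, wanted="utf-8"):
    """Undo the server's decoding, then decode the original bytes as `wanted`."""
    return value.encode(declared).decode(wanted)


raw = "caf\u00e9".encode("utf-8")  # bytes as they arrived on the wire
environ = {
    "wsgi.header_encoding": "iso-8859-1",   # hypothetical key, not in any spec
    "PATH_INFO": raw.decode("iso-8859-1"),  # what the server handed over
}
path = recode(environ["PATH_INFO"], environ["wsgi.header_encoding"])
```

This round trip only works because ISO-8859-1 maps every byte value to
a character. If the conversion was lossy, the original bytes cannot be
recovered.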

> I think we're all very sold on item 1, and items 2 and 3 require more
> thinking, but seemed reasonable to those present at the discussion
> this afternoon.    Hopefully we'll be meeting again on Saturday and
> will be able to continue to think through this stuff and push this all
> forward some more.
>
> I'm sure there will also be several other minor tweaks to the spec like:

Yeah, like defining how wsgi.file_wrapper should behave where response
Content-Length is defined but wrapped file actually provides more
content than that.
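
One possible behaviour, shown here only as a sketch and not as what the
specification says or what any particular server does, is to stop
sending once the declared Content-Length has been reached. The
limited_file_iter() name is invented for this example.

```python
import io


def limited_file_iter(filelike, content_length, blocksize=8192):
    """Yield at most content_length bytes from filelike, in blocks."""
    remaining = content_length
    while remaining > 0:
        data = filelike.read(min(blocksize, remaining))
        if not data:
            break  # file turned out shorter than the declared length
        remaining -= len(data)
        yield data


# The file holds 10 bytes but the response declared Content-Length: 4.
body = b"".join(limited_file_iter(io.BytesIO(b"0123456789"), 4))
```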

> * Not de-encoding encoded slashes in path strings, so that
> applications can tell the difference between path separators and
> encoded slashes.

When sitting on top of Apache, whether it be mod_wsgi, fastcgi, scgi,
ajp or CGI, you don't really have much choice, you get what Apache
gives you.
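
The distinction being asked for is easy to see where the raw path is
still available. A small sketch, using a made up path:

```python
from urllib.parse import unquote

raw_path = "/files/a%2Fb/c"  # made up request path with an encoded slash

# Decoding first loses the difference between "/" and "%2F".
decoded_first = unquote(raw_path).split("/")

# Splitting on real separators first keeps the encoded slash in its segment.
split_first = [unquote(segment) for segment in raw_path.split("/")]
```

Once the path has been decoded upstream, as in the first case, the
application has no way of telling the two apart.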

> * adding a "ClientWentAway" exception that indicates that wsgi.input
> has not been officially exhausted, but that the client went away before
> wsgi.input was fully populated.

The problem with an exception is what namespace do you put it in. You
almost need to have the type as part of the WSGI environment. You may
just be better standardising it by saying that an IOError must be
raised and leave it at that. At the moment most stuff doesn't even pay
attention to the fact that an exception could occur for some WSGI
adapters.
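
As a sketch of the IOError approach, assuming the adapter knows the
expected CONTENT_LENGTH, a wrapper along these lines would give a
proper EOF when everything has been read and an IOError when the data
runs out early. The LengthCheckedInput class is hypothetical and not
taken from any existing adapter.

```python
import io


class LengthCheckedInput:
    """Hypothetical wrapper: raise IOError if EOF arrives before content_length bytes."""

    def __init__(self, stream, content_length):
        self._stream = stream
        self._remaining = content_length

    def read(self, size=-1):
        # Never read past the declared length.
        if size < 0 or size > self._remaining:
            size = self._remaining
        if size == 0:
            return b""  # everything expected was read: normal EOF
        data = self._stream.read(size)
        if not data:
            raise IOError("client went away with %d bytes unread"
                          % self._remaining)
        self._remaining -= len(data)
        return data


# The client promised 5 bytes but only 3 arrived.
short = LengthCheckedInput(io.BytesIO(b"hel"), 5)
received = short.read(3)
try:
    short.read(3)
    failed = False
except IOError:
    failed = True
```

Because IOError is a builtin, the application does not need to import
anything from the adapter to catch it.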

> I'm sure there are more.   It might also be interesting to look at
> Rack and Jack, the Ruby and JavaScript implementations of the WSGI
> idea:
>
> http://jackjs.org/
> http://rack.rubyforge.org/doc/files/SPEC.html
>
> --Mark Ramm
>
>
> On Fri, Mar 27, 2009 at 5:33 PM, Graham Dumpleton
> <graham.dumpleton at gmail.com> wrote:
>> 2009/3/28 Alan Kennedy <alan at xhaus.com>:
>>> Dear all,
>>>
>>> For those of you at PyCon, there is a WSGI Open Space @ 5pm today (Friday).
>>>
>>> The sub-title of the open space is "Does WSGI need revision"?
>>>
>>> An example: Philip Jenvey (http://dunderboss.blogspot.com/) raised the
>>> need for something akin to what Java folks call "Lifecycle methods",
>>> so that WSGI apps can do initialization and finalization.
>>>
>>> http://java.sun.com/j2ee/tutorial/1_3-fcs/doc/Servlets4.html
>>>
>>> I'm sure there are plenty of other topics that could be discussed as well.
>>>
>>> See you @5pm.
>>
>> Please, whatever you do, do not go making anything like this, or even
>> a standard request/response object a part of the WSGI standard.
>>
>> Create a new specification for this 'application level' stuff which is
>> distinct from WSGI and leave WSGI as being the 'server gateway
>> interface' as it is really meant to be.
>>
>> This should go as far as coming up with a better middleware
>> abstraction for the application layer and discouraging people from
>> using WSGI middleware as they exist now.
>>
>> All these new components, although the reference implementation may
>> host on top of WSGI, should otherwise hide WSGI, thereby allowing them
>> to be hosted on top of an alternate interface or a newer revision of
>> WSGI, such as the minimal revision talked about for WSGI 2.0.
>>
>> If this stuff is all pushed into the WSGI specification then it will
>> be a backward step as far as improving web hosting availability for
>> Python is concerned.
>>
>> I have been trying to put together a blog entry saying just this and
>> other things about the role of WSGI, but just haven't had the time.
>> Since the sprints are almost starting, I probably will not get a chance now.
>>
>> Graham
>
>
>
> --
> Mark Ramm-Christensen
> email: mark at compoundthinking dot com
> blog: www.compoundthinking.com/blog
>

