[Web-SIG] Asynchronous streaming in WSGI

Thu Aug 5 18:19:50 CEST 2004

I've been looking at a possible change to the WSGI protocol to address some 
issues raised by Grisha and Ian.  In many ways the change would handle 
asynchronous streaming *better*, but I'm not sure it's the best choice, 
given the range of existing platforms and applications that may *currently* 
use asynchronous streaming of responses.

Let me explain.  The previous WSGI proposal was based on an interface like:

     def runCGI(inp,out,err,env):
         # do everything

The modified interface, which I've been playing with in peak.web, is:

     def handle_http(env):
         return status_string,header_list,output_iterable

The ideas that changed here are:

* Separate status from headers and output
* Don't require servers to parse headers or create an output buffer
* Allow lengthy output to be streamed *after* the function returns, to 
avoid tying up a task thread in multi-threaded servers
* Allow non-CGI variables (e.g. 'wsgi.input_stream', 'wsgi.error_stream', 
'wsgi.version', 'wsgi.multi_threaded', etc.) in the environment to avoid a 
separate configuration method and simplify chaining of processors
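
To make the modified interface concrete, here is a minimal sketch of an 
application written against it.  It is written for Python 3.  The header 
format (a list of (name, value) tuples), the use of text strings for output 
blocks, and the assumption that 'wsgi.input_stream' is a file-like object 
holding the request body are illustrative guesses, not part of any settled 
proposal.

```python
import io

def handle_http(env):
    """Hypothetical application following the modified interface."""
    # Assumption: 'wsgi.input_stream' is a file-like object with the request body.
    body = env['wsgi.input_stream'].read()
    status_string = '200 OK'
    # Assumption: headers are a list of (name, value) tuples.
    header_list = [('Content-Type', 'text/plain')]

    def output():
        # Each yielded block can be sent after handle_http() has returned.
        yield 'Hello, %s\n' % env.get('PATH_INFO', '/')
        yield 'Received %d bytes\n' % len(body)

    return status_string, header_list, output()

# A stand-in environment mixing CGI variables and 'wsgi.*' keys.
env = {
    'REQUEST_METHOD': 'POST',
    'PATH_INFO': '/demo',
    'wsgi.input_stream': io.BytesIO(b'abc'),
    'wsgi.multi_threaded': False,
}
status, headers, output_iterable = handle_http(env)
chunks = list(output_iterable)
```

Note that nothing in the body is produced until the caller starts iterating, 
which is what lets a server release the task thread early.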

As a result of these changes, it should also be much easier to write 
request preprocessors, response postprocessors, and other kinds of 
intermediaries between the web server and the actual application or 
framework, because less parsing and buffering are 
required.  Last, but not least, an interface like this should be easier to 
implement in asynchronous web servers, because they can just invoke 
'iterator.next()' when they need another block to send out.
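
As a rough illustration of both points, the sketch below shows a response 
postprocessor that wraps an application without parsing or buffering 
anything, and a stand-in for a server loop that pulls one block at a time.  
The names uppercase_postprocessor and pump are made up for this example, and 
the code targets Python 3, where 'iterator.next()' is spelled next(iterator).

```python
def uppercase_postprocessor(app):
    """Hypothetical postprocessor: upper-case each output block lazily."""
    def wrapped(env):
        status_string, header_list, output_iterable = app(env)
        # Status and headers pass through untouched; blocks are transformed
        # one at a time as the server asks for them.
        return status_string, header_list, (b.upper() for b in output_iterable)
    return wrapped

def handle_http(env):
    # Trivial application used only to exercise the wrapper.
    return '200 OK', [('Content-Type', 'text/plain')], iter(['one\n', 'two\n'])

def pump(iterator, send):
    """Stand-in for a server loop: fetch a block each time it is ready to send."""
    while True:
        try:
            block = next(iterator)  # iterator.next() in Python 2
        except StopIteration:
            break
        send(block)

sent = []
status, headers, output = uppercase_postprocessor(handle_http)({})
pump(iter(output), sent.append)
```

A real asynchronous server would call next() from its event loop when the 
socket becomes writable rather than in a tight loop as pump() does here.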

I think these are improvements in the direction that folks requested, 
*except* for one issue: existing code that does unbuffered streaming output 
can't use this.  A prime example is Zope, whose response.write() method does 
streaming output.  Under the revised WSGI, there's nothing to write *to*, 
so such existing code would have to run in a separate thread from the web 
server and communicate via a queue.  This doesn't seem like a great idea.
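
For reference, the thread-and-queue workaround might look roughly like the 
sketch below (Python 3).  The push-style application signature 
(env, write) and the names push_to_pull and legacy_app are assumptions made 
up for illustration; this is not how Zope or any other framework actually 
exposes its response object.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the output

def push_to_pull(push_app, env):
    """Run push-style code in a worker thread; expose its writes as an iterator."""
    blocks = queue.Queue()

    def worker():
        try:
            push_app(env, blocks.put)  # blocks.put plays the role of write()
        finally:
            blocks.put(_DONE)          # always signal completion

    threading.Thread(target=worker, daemon=True).start()
    # iter(callable, sentinel) calls blocks.get() until it returns _DONE.
    return iter(blocks.get, _DONE)

def legacy_app(env, write):
    # Hypothetical push-style application, in the spirit of response.write().
    write('first\n')
    write('second\n')

chunks = list(push_to_pull(legacy_app, {}))
```

Even this small version shows the cost: one extra thread per request, an 
unbounded queue, and no obvious way to report an exception raised in the 
worker back to the server.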

So, there are several possible ways to deal with this:

1) Stick with the old interface
2) Go with the newer interface, and lobby the frameworks that rely on 
this type of "push" output to change their code to support it
3) Publish both interfaces, and push for a stdlib module that can convert 
between them
4) Some other idea I haven't thought of  :)
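
For option 3, one direction of the conversion is simple.  Here is a 
minimal sketch (Python 3) of wrapping a new-style application so it can be 
called through the old interface.  The name as_run_cgi, the header format, 
and the choice to emit a CGI-style "Status:" line are assumptions for 
illustration only.  The opposite direction is the hard one, since push-style 
code needs something to write to.

```python
import io

def as_run_cgi(handle_http):
    """Hypothetical adapter: expose a new-style app via the old interface."""
    def runCGI(inp, out, err, env):
        env = dict(env)  # copy so the caller's environment is not modified
        env['wsgi.input_stream'] = inp
        env['wsgi.error_stream'] = err
        status_string, header_list, output_iterable = handle_http(env)
        # Write a CGI-style response: status line, headers, blank line, body.
        out.write('Status: %s\r\n' % status_string)
        for name, value in header_list:
            out.write('%s: %s\r\n' % (name, value))
        out.write('\r\n')
        for block in output_iterable:
            out.write(block)
    return runCGI

def handle_http(env):
    # Trivial new-style application used to exercise the adapter.
    return '200 OK', [('Content-Type', 'text/plain')], ['hi\n']

out = io.StringIO()
as_run_cgi(handle_http)(io.StringIO(''), out, io.StringIO(), {})
result = out.getvalue()
```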

Opinions?  Questions?  Ideas?