[Twisted-web] Re: [Web-SIG] WSGI woes
Donovan Preston
dp at ulaluma.com
Thu Sep 16 07:13:52 CEST 2004
On Sep 15, 2004, at 7:12 PM, Phillip J. Eby wrote:
> At 06:48 PM 9/15/04 -0400, Peter Hunt wrote:
>> It looks like WSGI is not well received over at twisted.web.
>>
>> http://twistedmatrix.com/pipermail/twisted-web/2004-September/000644.html
>
> Excerpting from that post:
>
> """The WSGI spec is unsuitable for use with asynchronous servers and
> applications. Basically, once the application callable returns, the
> server (or "gateway" as wsgi calls it) must consider the page finished
> rendering."""
>
> This is incorrect.
As I said in my original post, I hadn't mentioned anything about this
yet because I didn't have a solution or proposal to fix the problem,
which I maintain still exists. I will attempt to suggest solutions, but I am
unsure whether they will work or make sense in all environments. Allow
me to explain:
> Here is a simple WSGI application that demonstrates yielding 50 data
> blocks for transmission *after* the "application callable returns".
>
> def an_application(environ, start_response):
>     start_response("200 OK", [('Content-Type','text/plain')])
>     for i in range(1,51):
>         yield "Block %d" % i
>
> This has been a valid WSGI application since the August 8th posting of
> the WSGI pre-PEP.
According to the spec, """The application object must return an
iterable yielding strings.""" Whether the application callable calls
write before returning or yields strings to generate content, the
effect is the same -- there is no way for the application callable to
say "Wait, hang on a second, I'm not ready to generate more content
yet. I'll tell you when I am." This means the only way the application
can pause for network activity is by blocking. For example, a page
which performed an XML-RPC call and transformed the output into HTML
would be required to perform the XML-RPC call synchronously. Or a page
which initiated a telnet session and streamed the results into a web
page would be required to perform reads on the socket synchronously.
The server or gateway, by calling next(), is assuming that the call
will yield a string value, and only a string value.
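To make the problem concrete, here is a rough sketch of a generator application that has nothing to yield until a slow call finishes. Everything in it is illustrative: `slow_lookup` is a made-up stand-in for the XML-RPC call (it only sleeps), and `serve_round_robin` is a toy single-threaded loop, not any real server or gateway. It is written for a current Python 3 interpreter, so `next(it)` replaces `it.next()`.

```python
import time

log = []  # (block, seconds spent inside next()) for every block served


def slow_lookup():
    # Hypothetical stand-in for a blocking XML-RPC call.
    time.sleep(0.05)
    return "result"


def slow_application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield "before "
    # The generator cannot say "not ready yet"; all it can do is block here.
    yield slow_lookup()


def fast_application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield "fast"


def serve_round_robin(applications):
    # Toy single-threaded gateway that takes one block from each request in turn.
    iterators = [iter(app({}, lambda status, headers: None))
                 for app in applications]
    while iterators:
        for it in list(iterators):
            started = time.monotonic()
            try:
                block = next(it)  # the only thread is stuck in here if the app blocks
            except StopIteration:
                iterators.remove(it)
                continue
            log.append((block, time.monotonic() - started))


serve_round_robin([slow_application, fast_application])
```

The time recorded next to "result" is time the loop spent inside a single `next()` call, during which no other request could be serviced.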
Of course, Twisted has a canonical way of indicating that a result is
not yet ready, the Deferred. An asynchronous application could yield a
Deferred and an asynchronous server would attach a callback to this
Deferred which invoked the next() method upon resolution. This is how
Nevow handles Deferreds (in Nevow SVN head at
nevow.flat.twist.deferflatten).
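Roughly, the server side of that idea could look like the sketch below. `TinyDeferred` is a made-up, minimal stand-in so the example runs without Twisted; it is not Twisted's Deferred and not the Nevow code mentioned above. `pump`, `written` and `snapshot` are likewise only illustrative names.

```python
class TinyDeferred:
    """Hypothetical stand-in for a Deferred: callbacks fired with one result."""

    def __init__(self):
        self.callbacks = []

    def addCallback(self, fn):
        self.callbacks.append(fn)

    def callback(self, result):
        for fn in self.callbacks:
            fn(result)


def pump(iterator, write, finish):
    """Write strings until the application yields something not ready yet."""
    for value in iterator:
        if isinstance(value, str):
            write(value)
        else:
            def resume(result):
                write(result)
                pump(iterator, write, finish)  # pick the iteration back up

            value.addCallback(resume)
            return  # hand control back to the event loop instead of blocking
    finish()


pending = TinyDeferred()


def application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield "start "
    yield pending
    yield " end"


written = []
pump(iter(application({}, lambda status, headers: None)),
     written.append, lambda: written.append("<finished>"))
snapshot = list(written)      # what has been sent while the result is pending
pending.callback("middle")    # later, the event loop delivers the result
```

The point of the sketch is that `pump` returns as soon as it meets the pending object, so the main loop stays free until the result arrives.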
However, the WSGI spec says nothing about Deferred and indeed, Deferred
would be useless in the case of another asynchronous server such as
Medusa. I would suggest that WSGI include a simple Deferred
implementation, but WSGI is simply a spec which is not intended to have
any actual code. Thus, one solution would be for the WSGI spec to be
amended to state:
"""The application object must return an iterable yielding strings or
objects implementing the following interface:
    def addCallback(callable):
        '''Add 'callable' to the list of callables to be invoked when a string
        is available. Callable should take a single argument, which will be a
        string.'''
The application object must invoke the callable passed to addCallback,
passing a string which will be written to the request.
"""
This places additional burdens upon implementors of WSGI servers or
gateways. In the case of a threaded HTTP server which uses blocking
writes, implementing support for these promises would have to look
something like this:
import Queue

def handle_request(inSocket, outSocket):
    # ... read inSocket, parse the request and dispatch ...
    iterable = application(environ, start_response)
    try:
        while True:
            val = iterable.next()
            if isinstance(val, str):
                outSocket.write(val)
            else:
                # Not a string: block this thread until the promise fires.
                result = Queue.Queue()
                val.addCallback(result.put)
                outSocket.write(result.get())
    except StopIteration:
        outSocket.close()
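For the application side of the amendment proposed above, an object satisfying the `addCallback` interface could be as small as the following sketch. The class name `PendingString` and its `resolve` method are assumptions made for illustration; the proposal only names `addCallback` and does not say how the producer supplies the string. The sketch targets a current Python 3 interpreter.

```python
class PendingString:
    """Sketch of an object implementing the proposed addCallback interface."""

    _UNSET = object()  # sentinel: no string is available yet

    def __init__(self):
        self._callbacks = []
        self._value = self._UNSET

    def addCallback(self, callable_):
        if self._value is self._UNSET:
            self._callbacks.append(callable_)
        else:
            # The string is already available, so deliver it immediately.
            callable_(self._value)

    def resolve(self, string):
        # Hypothetical producer-side method; the proposal does not name one.
        self._value = string
        callbacks, self._callbacks = self._callbacks, []
        for callback in callbacks:
            callback(string)


received = []
promise = PendingString()
promise.addCallback(received.append)   # registered before the string exists
promise.resolve("body")
promise.addCallback(received.append)   # registered after: fires immediately
```

Handling a callback registered after the string is available is a design choice of this sketch; it keeps a server from hanging if the application happens to finish first.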
> It may be, however, that Mr. Preston means that applications which
> want to use 'write()' or a similar push-oriented approach to produce
> data cannot do so after the application returns. If so, we should
> discuss that use case further, preferably on the Web-SIG.
And now we come to my other half-baked proposal.
Instead of merely returning a write callable, start_response could
return a tuple of (write, finish) callables. The application would be
free to call write at any time until it calls finish, at which point
calling either callable becomes illegal. Again, the synchronous server
support for this would have to block on a lock in a fashion such as
this:
import threading

def handle_request(inSocket, outSocket):
    # ... read request, dispatch ...
    # Start at zero so acquire() blocks until the application calls finish.
    finished = threading.Semaphore(0)

    def start_response(status, headers):
        # ... write headers ...
        return outSocket.write, finished.release

    iterable = application(environ, start_response)
    if iterable is None:
        finished.acquire()
    # Once we get here, the application is done with the request.
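A runnable sketch of both sides of the (write, finish) idea follows, under some stated assumptions: it uses `threading.Event` in place of the semaphore, a plain list stands in for the output socket, and `threading.Timer` is only a stand-in for whatever later event would cause the application to push data. It targets a current Python 3 interpreter.

```python
import threading


def handle_request(application, output):
    """Toy synchronous gateway for the proposed (write, finish) pair."""
    finished = threading.Event()

    def start_response(status, headers):
        output.append("%s\n" % status)       # stand-in for writing headers
        return output.append, finished.set   # the (write, finish) pair

    iterable = application({}, start_response)
    if iterable is None:
        # The application will push data later; block until it calls finish.
        finished.wait()
    else:
        for block in iterable:
            output.append(block)
    # Once we get here, the application is done with the request.


def pushing_application(environ, start_response):
    write, finish = start_response("200 OK", [("Content-Type", "text/plain")])

    def produce():
        write("late data")
        finish()

    # Hypothetical deferred work: runs after this callable has returned.
    threading.Timer(0.01, produce).start()
    return None


output = []
handle_request(pushing_application, output)
```

Here the application callable returns before any body data exists, and the gateway thread simply waits for `finish`.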
Finally, we come to the task of implementing a server or gateway which
can asynchronously support either asynchronous or blocking
applications. Since there is no way for the server or gateway to know
whether the application object it is about to invoke will block,
starving the main loop and preventing network activity from being
serviced, it must invoke all applications in a new thread or process. A
solution to this would be to require application callables to provide
additional metadata, perhaps via function or object attributes, which
indicate whether they are capable of running in asynchronous, threaded,
or multiprocess environments. Since it's getting late and this message
is getting long I will leave this discussion for another day.
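One possible shape for that metadata, purely as a sketch: the attribute name `wsgi_capabilities`, the `declare` decorator and `choose_strategy` are all invented here for illustration and are not part of the WSGI spec or of any proposal in this thread.

```python
def declare(**capabilities):
    """Attach hypothetical capability metadata to an application callable."""
    def mark(application):
        application.wsgi_capabilities = capabilities
        return application
    return mark


@declare(asynchronous=True)
def nonblocking_application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return ["never blocks"]


def legacy_application(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return ["might block"]


def choose_strategy(application):
    # With no declaration the server has to assume the application may block.
    capabilities = getattr(application, "wsgi_capabilities", {})
    if capabilities.get("asynchronous"):
        return "main loop"
    return "worker"  # a new thread or process


strategies = [choose_strategy(nonblocking_application),
              choose_strategy(legacy_application)]
```

An application that declares nothing falls back to the conservative choice, which matches the situation described above where the server cannot know whether a call will block.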
dp