[Web-SIG] Proposed specification: waiting for file descriptor events

Christopher Stawarz cstawarz at csail.mit.edu
Thu May 22 18:30:47 CEST 2008


On May 21, 2008, at 1:34 PM, Manlio Perillo wrote:

>> Instead, the spec recommends that async servers pre-read the request
>> body before invoking the app (either by default or as a configurable
>> option).
>
> This is the best solution most of the time (but not all of the
> time), especially if the "server" can do some "pre-parsing" of the
> multipart/form-data request body.
>
> In fact I plan to write a custom function (in C for Nginx) that will  
> "reduce", as an example:
>
>   Content-Type: multipart/form-data; boundary=AaB03x
>
>   --AaB03x
>   Content-Disposition: form-data; name="submit-name"
>
>   Larry
>   --AaB03x
>   Content-Disposition: form-data; name="files"; filename="file1.txt"
>   Content-Type: text/plain
>
>   ... contents of file1.txt ...
>   --AaB03x--
>
> to (not properly escaped):
>
> Content-Type: application/x-www-form-urlencoded
>
> submit-name=Larry&files.filename=file1.txt&files.ctype=text/plain&files.path=xxx
>
>
> and the contents of file1.txt will be saved to a temporary file 'xxx'.

It seems like you're making this more complicated than it needs to  
be.  Why not just store the entire request body in a temporary file,  
and then pass an open handle to it as wsgi.input?  That way, the  
server doesn't have to rewrite the request, and the application  
doesn't need to know how to interpret the files.* parameters.
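
A minimal sketch of that idea, assuming the server has a file-like
object for the incoming body and a CONTENT_LENGTH header.  The helper
name spool_request_body is hypothetical, and the blocking read loop is
only for illustration; an async server would fill the file from its
event loop before invoking the app.

```python
import tempfile


def spool_request_body(environ, chunk_size=64 * 1024):
    """Copy the request body to a temporary file and install it as wsgi.input.

    Hypothetical helper, not part of any spec or server.
    """
    source = environ["wsgi.input"]
    remaining = int(environ.get("CONTENT_LENGTH") or 0)
    spooled = tempfile.TemporaryFile()
    while remaining > 0:
        chunk = source.read(min(chunk_size, remaining))
        if not chunk:
            break  # client sent less than CONTENT_LENGTH promised
        spooled.write(chunk)
        remaining -= len(chunk)
    spooled.seek(0)  # the application reads from the beginning
    environ["wsgi.input"] = spooled
    return environ
```

The application then parses multipart/form-data itself, exactly as it
would under a synchronous server, and never sees rewritten parameters.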

> 1) Why not add a more generic poll like interface?

Because such an interface would be more complicated than what I've  
proposed and harder for server authors to implement.  Also, I'm not  
sure that it gains you much.

Note that I'm not 100% sure about this, as I tried to indicate in the
"Open Issues" section of my proposal.  The approach I'd like to take
is to try writing apps with my interface for a while, and if real-world
usage shows that a poll-like interface would be very useful (or
necessary), then the spec could be extended to add one.  I think this
is a safe route, since the readable/writable functions could easily be
implemented in terms of a more generic poll-like interface, so
existing apps that use the fdevent extensions would continue to work.
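
To illustrate the layering only: the sketch below builds readable and
writable on top of a generic poll function.  The poll signature, the
READ/WRITE constants, and the boolean return values are assumptions
made for this example, not anything defined in the proposal, and the
blocking select() call stands in for whatever a real async server
would do to suspend the app.

```python
import select

READ, WRITE = 1, 2  # hypothetical event bitmask values


def poll(interests, timeout=None):
    """Hypothetical generic interface: wait on several descriptors at once.

    `interests` maps fd -> bitmask of READ/WRITE.  Returns a dict mapping
    each ready fd to the bitmask of events that fired (empty on timeout).
    """
    rlist = [fd for fd, mask in interests.items() if mask & READ]
    wlist = [fd for fd, mask in interests.items() if mask & WRITE]
    ready_r, ready_w, _ = select.select(rlist, wlist, [], timeout)
    ready = {}
    for fd in ready_r:
        ready[fd] = ready.get(fd, 0) | READ
    for fd in ready_w:
        ready[fd] = ready.get(fd, 0) | WRITE
    return ready


def readable(fd, timeout=None):
    """Single-descriptor wait, expressed as a one-entry poll."""
    return bool(poll({fd: READ}, timeout))


def writable(fd, timeout=None):
    """Single-descriptor wait, expressed as a one-entry poll."""
    return bool(poll({fd: WRITE}, timeout))
```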

>   Moreover, IMHO, storing a timeout variable in the environ to check
>   whether the previous call timed out is not the best solution.

I think it's a simple and effective solution.  Server authors don't  
need to implement any new functions or data types.  They just create  
and hold on to a mutable object instance (the simplest being a list  
instance) for each app instance and toggle its truth value as required.
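
A rough sketch of the server side, using a plain list as the mutable
flag.  The environ key name and the three function names here are
hypothetical, chosen only for this example.

```python
TIMEOUT_KEY = "example.fdevent.timeout"  # hypothetical environ key


def new_environ():
    """Create an environ whose timeout flag starts out false."""
    environ = {}
    environ[TIMEOUT_KEY] = []  # an empty list is falsy: no timeout yet
    return environ


def mark_timed_out(environ):
    """Server calls this when the previous wait expired."""
    flag = environ[TIMEOUT_KEY]
    if not flag:
        flag.append(True)  # a non-empty list is truthy


def clear_timeout(environ):
    """Server calls this when the previous wait completed normally."""
    del environ[TIMEOUT_KEY][:]  # mutate in place; same object stays in environ
```

Because the list is mutated in place, the object the app fetched from
environ at the start of the request keeps reflecting the latest state.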

>   In my implementation I return a function, but with generators in
>   Python 2.5 this can be done in a better way.

What advantage does this have over what I've proposed?

> 2) In Nginx it is not possible to simply handle "plain" file
>   descriptors, since these are wrapped in a connection structure.
>
>   This is the reason why I had to add a connection_wrapper function in
>   my WSGI module for Nginx.

But the connection structure just wraps an integer file descriptor,  
right?  So the readable/writable functions can create the required  
wrapper to register with nginx. There's no reason to make the  
application author do it.

> 3) If you read an example that implements a database connection pool:
> http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-postgres-async.py
>
>   you can see that there is a problem.
>
>   In fact the pool is not very flexible; the application can not
>   handle more than POOL_SIZE concurrent requests.
>
>   However it is possible to just have a new request to wait until a
>   previous connection is free (or a timeout occurs).
>
>   I have attached an example (it is not in the repository since there
>   are some problems).
>
>   The examples use a new extension:
>
>     - ctx = environ['ngx.request_context']()
>     - ctx.resume()
>
>   ctx.resume() "asynchronously" resumes the given request
>   (it will be resumed as soon as control returns to Nginx, when the
>    application yields something).
>
>
>   Note that the problem of resuming another request is easily solved
>   with greenlets, without the need for new extensions
>   (this is one of the reasons why I like greenlets).

Right, you want something like Queue.Queue, but for exchanging data  
between request handlers in the same thread.  Since this is a  
different problem from waiting on file descriptors, it's outside the  
scope of my proposal.  However, one way you might implement something  
like this using my proposal would be to run the connection-pool  
manager in a separate thread, and have request handlers talk to it  
over sockets.  Kind of ugly, but I think it would do the job.
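
For what it's worth, here is one way that could look.  Everything in
it is an assumption made for illustration: the one-byte wire protocol,
the function names, and the use of socket pairs created up front.  The
handler-side recv() blocks here; under the proposal the handler would
instead wait for its socket to become readable before calling it.

```python
import select
import socket
import threading
from collections import deque

ACQUIRE = b"\xff"  # hypothetical protocol: any other byte releases that slot


def pool_manager(size, manager_ends):
    """Run in its own thread; hands out slots 0..size-1 (size must be < 255)."""
    free = deque(range(size))
    waiting = deque()  # manager-side sockets whose handler wants a slot
    open_ends = list(manager_ends)
    while open_ends:
        ready, _, _ = select.select(open_ends, [], [])
        for sock in ready:
            msg = sock.recv(1)
            if not msg:  # handler closed its end
                open_ends.remove(sock)
                if sock in waiting:
                    waiting.remove(sock)
                sock.close()
            elif msg == ACQUIRE:
                waiting.append(sock)
            else:  # release: the byte value is the slot number
                free.append(msg[0])
        while free and waiting:
            waiting.popleft().send(bytes([free.popleft()]))


def acquire(sock):
    """Handler side: request a slot and wait for the manager's reply."""
    sock.send(ACQUIRE)
    return sock.recv(1)[0]


def release(sock, slot):
    """Handler side: give the slot back."""
    sock.send(bytes([slot]))
```

A handler that closes its socket while still holding a slot leaks that
slot in this sketch; a real manager would need to track ownership.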


Chris

