[Web-SIG] Proposal for asynchronous WSGI variant

Tue May 6 04:09:33 CEST 2008

2008/5/6 Christopher Stawarz <cstawarz at csail.mit.edu>:
> (I'm new to the list, so please forgive me for making my first post a
>  specification proposal :)
>
>  Browsing through the list archives, I see there have been some
>  inconclusive discussions on adding better support for asynchronous web
>  servers to the WSGI spec.  Since such support would be very useful for
>  some upcoming projects of mine, I decided to take a shot at speccing
>  out and implementing it.  I'd be grateful for any feedback you have.
>  If this seems like something worth pursuing, I would also welcome
>  collaborators to help develop the spec further.
>
>  The name for this proposed specification is the Asynchronous Web
>  Server Gateway Interface (AWSGI).  As the name suggests, the spec is
>  closely related to WSGI and is most easily described in terms of how
>  it differs from WSGI.  AWSGI eliminates the following parts of WSGI:
>
>   - the environment variables wsgi.version and wsgi.input
>
>   - the write() callable returned by start_response()
>
>  AWSGI adds the following environment variables:
>
>   - awsgi.version
>   - awsgi.input
>   - awsgi.readable
>   - awsgi.writable
>   - awsgi.timeout
>
>  In addition, AWSGI allows the application iterable to yield two types
>  of data:
>
>   - byte strings, handled as in WSGI
>
>   - the result of calling awsgi.readable or awsgi.writable, which
>     indicates that the application should be paused and restarted when
>     a specified file descriptor is ready for reading or writing
>
>  Because of AWSGI's similarity to WSGI, a simple wrapper can be used to
>  run AWSGI applications on WSGI servers without alteration.
>
>  The following example application demonstrates typical usage of AWSGI.
>  This application simply reads the request body and sends it back to
>  the client.  Each time it wants to receive data from the client, it
>  first tests awsgi.input for readability and then calls its recv()
>  method.  If awsgi.input is not readable after one second, the
>  application sends a "408 Request Timeout" response to the client and
>  terminates:
>
>
>   def echo_request_body(environ, start_response):
>       input = environ['awsgi.input']
>       readable = environ['awsgi.readable']
>
>       nbytes = int(environ.get('CONTENT_LENGTH') or 0)
>       output = ''
>       while nbytes:
>           yield readable(input, 1.0)  # Time out after 1 second
>
>           if environ['awsgi.timeout']:
>               msg = 'The request timed out.'
>               start_response('408 Request Timeout',
>                              [('Content-Type', 'text/plain'),
>                               ('Content-Length', str(len(msg)))])
>               yield msg
>               return
>
>           data = input.recv(nbytes)
>           if not data:
>               break
>           output += data
>           nbytes -= len(data)
>
>       start_response('200 OK', [('Content-Type', 'text/plain'),
>                                 ('Content-Length', str(len(output)))])
>       yield output
>
>
>  I have rough but functional implementations of a number of AWSGI
>  components available in a Bazaar branch at
>  http://pseudogreen.org/bzr/awsgiref/.  The package includes an
>  asyncore-based AWSGI server and an AWSGI-to-WSGI application wrapper.
>  In addition, the file spec.txt contains a more detailed description of
>  the specification (which is also appended below).
>
>  Again, I'd very much appreciate comments and criticism.
>
>
>  Thanks,
>  Chris
>
>
>
>
>  Detailed AWSGI Specification
>  ----------------------------
>
>  - Required AWSGI environ variables:
>
>   * All variables required by WSGI, except for wsgi.version and
>     wsgi.input, which must *not* be present
>
>   * awsgi.version => the tuple (1, 0)
>
>   * awsgi.input
>
>     This is an object with one method, recv(bufsize), which behaves
>     like the socket method of the same name (although it doesn't
>     support the optional flags parameter).  Before each call to
>     recv(), the application must test awsgi.input for readability via
>     awsgi.readable.  The result of calling recv() without doing so is
>     undefined.
>
>     (XXX: Should recv() handle EINTR for the application?)
>
>   * awsgi.readable
>   * awsgi.writable
>
>     These are callables with the signature f(fd, timeout=None).  fd is
>     either a file descriptor (i.e. int or long) or an object with a
>     fileno() method that returns a file descriptor.
>
>     timeout has the same semantics as the timeout parameter to
>     select.select().  If the operation times out, awsgi.timeout will
>     be true when the application resumes.
>
>     In addition to checking readiness for reading or writing, servers
>     should also monitor file descriptors for "exceptional" conditions
>     (e.g. out-of-band data) and restart the application if they occur.
>
>   * awsgi.timeout => boolean indicating whether the most recent read
>     or write wait timed out (false if there have been no waits)
>
>  - start_response() must *not* return a write() callable, as this
>   method of providing application output to the server is incompatible
>   with asynchronous execution.
>
>  - The server must accept awsgi.input as input to awsgi.readable,
>   either by providing an actual socket object or by special-case
>   handling (i.e. awsgi.input needn't have a fileno() method, as long
>   as the server handles it as if it did).
>
>  - Applications return iterators, which can yield:
>
>   * a string => sent to client, just as in standard WSGI
>
>   * the result of a call to awsgi.readable or awsgi.writable =>
>     application is resumed when either the file descriptor is ready
>     for reading/writing or the wait times out (in which case,
>     awsgi.timeout will be true)
>
>  - Although AWSGI applications will *not* be directly compatible with
>   WSGI servers, middleware will allow them to run as standard WSGI
>   apps (with all I/O waits returning immediately).
>
>  - AWSGI servers will not support unmodified WSGI applications.  There
>   are several reasons for this:
>
>   - If the app does blocking I/O, it will block the entire server.
>
>   - Calls to the read() method of wsgi.input may fail with
>     EWOULDBLOCK, which an app expecting synchronous I/O probably won't
>     be prepared to deal with.
>
>   - The readline(), readlines(), and __iter__() methods of wsgi.input
>     can require multiple network I/O operations, which is incompatible
>     with asynchronous execution.
>
>   - The write() callable returned by start_response() is inherently
>     incompatible with asynchronous execution.
>
>   Because of these issues, this specification aims for one-way
>   compatibility between AWSGI and WSGI (i.e. the ability to run AWSGI
>   apps on WSGI servers via middleware, but not vice versa).

No time to understand all this, but a few comments.

 If write() isn't to be returned by start_response(), then do away with
 start_response() if possible as per discussions for WSGI 2.0. See:

  http://www.wsgi.org/wsgi/WSGI_2.0

 In other words, it would perhaps be better to align it with the
 proposals for WSGI 2.0 rather than with WSGI 1.0.
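
 As far as I read the WSGI 2.0 draft, the application just returns a
 (status, headers, iterable) tuple instead of calling start_response().
 A rough sketch of that shape, and of how it can be adapted back to a
 1.0 server (the names wsgi2_to_wsgi1 and hello are made up for
 illustration, and none of this comes from a finished spec):

```python
def wsgi2_to_wsgi1(app2):
    """Adapt a draft-2.0-style app to the WSGI 1.0 calling convention."""
    def app1(environ, start_response):
        # Draft 2.0 style: the app hands back everything in one tuple.
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body
    return app1


def hello(environ):
    # Hypothetical example app written in the draft 2.0 style.
    body = b"hello\n"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]
```

 How the readable/writable waits would fit into that return value is a
 separate question which this sketch doesn't try to answer.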

 Also take note of:

  http://www.wsgi.org/wsgi/Amendments_1.0

 and think about how Python 3.0 would affect things.
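
 On the Python 3.0 point, the echo example builds its output as a str,
 which would have to become bytes. A rough sketch, assuming Python 3
 and a made-up run_on_wsgi() wrapper (with a made-up _WAIT token and
 _Input class) that lets the app run on a plain WSGI server with all
 waits returning immediately, as the proposal describes for its
 middleware. This is not the code from the awsgiref branch:

```python
_WAIT = object()  # hypothetical token standing in for a readable()/writable() result


class _Input:
    """Gives a WSGI wsgi.input stream the recv() method the proposal describes."""

    def __init__(self, stream):
        self._stream = stream

    def recv(self, bufsize):
        return self._stream.read(bufsize)


def run_on_wsgi(async_app):
    """Hypothetical wrapper: run an AWSGI-style app on a plain WSGI server."""
    def wsgi_app(environ, start_response):
        env = dict(environ)
        stream = env.pop('wsgi.input')
        env.pop('wsgi.version', None)  # must not be present per the proposal
        env['awsgi.version'] = (1, 0)
        env['awsgi.input'] = _Input(stream)
        # Waits return immediately, so nothing ever times out.
        env['awsgi.readable'] = lambda fd, timeout=None: _WAIT
        env['awsgi.writable'] = lambda fd, timeout=None: _WAIT
        env['awsgi.timeout'] = False
        for item in async_app(env, start_response):
            if item is not _WAIT:  # drop wait tokens, pass body data through
                yield item
    return wsgi_app


def echo_request_body(environ, start_response):
    input_ = environ['awsgi.input']
    readable = environ['awsgi.readable']

    nbytes = int(environ.get('CONTENT_LENGTH') or 0)
    chunks = []  # pieces of bytes, not str, under Python 3
    while nbytes:
        yield readable(input_, 1.0)  # Time out after 1 second

        if environ['awsgi.timeout']:
            msg = b'The request timed out.'
            start_response('408 Request Timeout',
                           [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(msg)))])
            yield msg
            return

        data = input_.recv(nbytes)
        if not data:
            break
        chunks.append(data)
        nbytes -= len(data)

    output = b''.join(chunks)
    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(output)))])
    yield output
```

 Header names and values stay as str here while the body is bytes;
 whether that split is right is exactly the sort of thing that needs
 thinking about.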

 I'd also rather it not be called AWSGI, as that is not sufficiently
 distinct from WSGI. If you want to pursue this asynchronous style, then
 be more explicit: call it ASYNC-WSGI and use an 'asyncwsgi' tag in environ.

 Graham