[Web-SIG] Proposal for asynchronous WSGI variant
Christopher Stawarz
cstawarz at csail.mit.edu
Tue May 6 03:30:27 CEST 2008
(I'm new to the list, so please forgive me for making my first post a
specification proposal :)
Browsing through the list archives, I see there have been some
inconclusive discussions on adding better support for asynchronous web
servers to the WSGI spec.  Since such support would be very useful for
some upcoming projects of mine, I decided to take a shot at specifying
and implementing it.  I'd be grateful for any feedback you have.
If this seems like something worth pursuing, I would also welcome
collaborators to help develop the spec further.
The name for this proposed specification is the Asynchronous Web
Server Gateway Interface (AWSGI). As the name suggests, the spec is
closely related to WSGI and is most easily described in terms of how
it differs from WSGI. AWSGI eliminates the following parts of WSGI:
- the environment variables wsgi.version and wsgi.input
- the write() callable returned by start_response()
AWSGI adds the following environment variables:
- awsgi.version
- awsgi.input
- awsgi.readable
- awsgi.writable
- awsgi.timeout
In addition, AWSGI allows the application iterable to yield two types
of data:
- byte strings, handled as in WSGI
- the result of calling awsgi.readable or awsgi.writable, which
  indicates that the application should be paused and then resumed
  when a specified file descriptor is ready for reading or writing
Because of AWSGI's similarity to WSGI, a simple wrapper can be used to
run AWSGI applications on WSGI servers without alteration.
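
A minimal sketch of such a wrapper might look like the following.  It
is written for Python 3, so response bodies are byte strings.  The
names awsgi_to_wsgi, _Wait and _Input are hypothetical and are not
taken from awsgiref.  The sketch assumes that every wait can simply be
skipped, since wsgi.input on a WSGI server blocks until data arrives.

```python
import io


class _Wait:
    """Hypothetical token standing in for the result of awsgi.readable/writable."""

    def __init__(self, fd, timeout):
        self.fd = fd
        self.timeout = timeout


class _Input:
    """Adapts a blocking wsgi.input stream to the recv(bufsize) interface."""

    def __init__(self, stream):
        self._stream = stream

    def recv(self, bufsize):
        return self._stream.read(bufsize)


def awsgi_to_wsgi(awsgi_app):
    """Return a WSGI application that runs awsgi_app with all waits skipped."""

    def wsgi_app(environ, start_response):
        env = dict(environ)                  # leave the caller's environ untouched
        stream = env.pop('wsgi.input')
        env.pop('wsgi.version', None)
        env['awsgi.version'] = (1, 0)
        env['awsgi.input'] = _Input(stream)
        make_wait = lambda fd, timeout=None: _Wait(fd, timeout)
        env['awsgi.readable'] = env['awsgi.writable'] = make_wait
        env['awsgi.timeout'] = False         # waits never time out here

        def awsgi_start_response(status, headers, exc_info=None):
            # Swallow the write() callable that WSGI's start_response returns.
            start_response(status, headers, exc_info)

        for item in awsgi_app(env, awsgi_start_response):
            if isinstance(item, _Wait):
                continue                     # "returns immediately"
            yield item

    return wsgi_app


# Usage: a tiny AWSGI app driven through the wrapper.
def hello(environ, start_response):
    source = environ['awsgi.input']
    yield environ['awsgi.readable'](source, 1.0)   # skipped by the wrapper
    body = source.recv(int(environ.get('CONTENT_LENGTH') or 0))
    start_response('200 OK', [('Content-Length', str(len(body)))])
    yield body


seen = {}

def record(status, headers, exc_info=None):
    seen['status'] = status

wsgi_environ = {'wsgi.version': (1, 0),
                'wsgi.input': io.BytesIO(b'hello'),
                'CONTENT_LENGTH': '5'}
body = b''.join(awsgi_to_wsgi(hello)(wsgi_environ, record))
```

Note that _Input.recv() here blocks until bufsize bytes or EOF, which
is stricter than socket recv() but still within its contract.
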
The following example application demonstrates typical usage of AWSGI.
This application simply reads the request body and sends it back to
the client. Each time it wants to receive data from the client, it
first tests awsgi.input for readability and then calls its recv()
method. If awsgi.input is not readable after one second, the
application sends a "408 Request Timeout" response to the client and
terminates:

def echo_request_body(environ, start_response):
    input = environ['awsgi.input']
    readable = environ['awsgi.readable']

    nbytes = int(environ.get('CONTENT_LENGTH') or 0)
    output = ''
    while nbytes:
        yield readable(input, 1.0)  # Time out after 1 second

        if environ['awsgi.timeout']:
            msg = 'The request timed out.'
            start_response('408 Request Timeout',
                           [('Content-Type', 'text/plain'),
                            ('Content-Length', str(len(msg)))])
            yield msg
            return

        data = input.recv(nbytes)
        if not data:
            break
        output += data
        nbytes -= len(data)

    start_response('200 OK', [('Content-Type', 'text/plain'),
                              ('Content-Length', str(len(output)))])
    yield output

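
To make the pause-and-resume behaviour concrete, here is a rough
sketch of how a server might drive such an application.  It is not the
awsgiref server: the Wait class and run_app() are hypothetical names,
the code targets Python 3 (byte strings), and for brevity it blocks in
select.select() for a single connection, whereas a real asynchronous
server would register the descriptor with its event loop and go on
serving other requests.

```python
import select
import socket


class Wait:
    """Hypothetical wait token: what awsgi.readable/writable might return."""

    def __init__(self, fd, timeout, for_write):
        self.fd, self.timeout, self.for_write = fd, timeout, for_write


def run_app(app, environ, conn):
    """Drive one AWSGI app to completion; return (status, headers, chunks)."""
    environ = dict(environ)
    environ['awsgi.version'] = (1, 0)
    environ['awsgi.input'] = conn            # a real socket, so fileno() works
    environ['awsgi.readable'] = lambda fd, timeout=None: Wait(fd, timeout, False)
    environ['awsgi.writable'] = lambda fd, timeout=None: Wait(fd, timeout, True)
    environ['awsgi.timeout'] = False         # false until a wait times out

    response = {}

    def start_response(status, headers, exc_info=None):
        response['status'], response['headers'] = status, headers
        # Returns None: AWSGI has no write() callable.

    chunks = []
    for item in app(environ, start_response):
        if isinstance(item, Wait):
            rlist = [] if item.for_write else [item.fd]
            wlist = [item.fd] if item.for_write else []
            # Also watch for "exceptional" conditions on the descriptor.
            ready = select.select(rlist, wlist, [item.fd], item.timeout)
            environ['awsgi.timeout'] = not any(ready)
        else:
            chunks.append(item)              # body data for the client
    return response.get('status'), response.get('headers'), chunks


def echo(environ, start_response):
    """Python 3 adaptation of the echo example above."""
    sock = environ['awsgi.input']
    nbytes = int(environ.get('CONTENT_LENGTH') or 0)
    output = b''
    while nbytes:
        yield environ['awsgi.readable'](sock, 1.0)
        if environ['awsgi.timeout']:
            start_response('408 Request Timeout',
                           [('Content-Type', 'text/plain')])
            yield b'The request timed out.'
            return
        data = sock.recv(nbytes)
        if not data:
            break
        output += data
        nbytes -= len(data)
    start_response('200 OK', [('Content-Type', 'text/plain')])
    yield output


# Usage: a socket pair stands in for the client connection.
client, server_side = socket.socketpair()
client.sendall(b'ping')
status, headers, chunks = run_app(echo, {'CONTENT_LENGTH': '4'}, server_side)
client.close()
server_side.close()
```
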
I have rough but functional implementations of a number of AWSGI
components available in a Bazaar branch at
http://pseudogreen.org/bzr/awsgiref/. The package includes an
asyncore-based AWSGI server and an AWSGI-to-WSGI application wrapper.
In addition, the file spec.txt contains a more detailed description of
the specification (which is also appended below).
Again, I'd very much appreciate comments and criticism.
Thanks,
Chris
Detailed AWSGI Specification
----------------------------
- Required AWSGI environ variables:

  * All variables required by WSGI, except for wsgi.version and
    wsgi.input, which must *not* be present

  * awsgi.version => the tuple (1, 0)

  * awsgi.input

    This is an object with one method, recv(bufsize), which behaves
    like the socket method of the same name (although it doesn't
    support the optional flags parameter).  Before each call to
    recv(), the application must test awsgi.input for readability via
    awsgi.readable.  The result of calling recv() without doing so is
    undefined.

    (XXX: Should recv() handle EINTR for the application?)

  * awsgi.readable
  * awsgi.writable

    These are callables with the signature f(fd, timeout=None).  fd is
    either a file descriptor (i.e. int or long) or an object with a
    fileno() method that returns a file descriptor.

    timeout has the same semantics as the timeout parameter to
    select.select().  If the operation times out, awsgi.timeout will
    be true when the application resumes.

    In addition to checking readiness for reading or writing, servers
    should also monitor file descriptors for "exceptional" conditions
    (e.g. out-of-band data) and resume the application if they occur.

  * awsgi.timeout => boolean indicating whether the most recent read
    or write wait timed out (false if there have been no waits)

- start_response() must *not* return a write() callable, as this
  method of providing application output to the server is incompatible
  with asynchronous execution.

- The server must accept awsgi.input as input to awsgi.readable,
  either by providing an actual socket object or by special-case
  handling (i.e. awsgi.input needn't have a fileno() method, as long
  as the server handles it as if it did).

- Applications return iterators, which can yield:

  * a string => sent to client, just as in standard WSGI

  * the result of a call to awsgi.readable or awsgi.writable =>
    application is resumed when either the file descriptor is ready
    for reading/writing or the wait times out (in which case,
    awsgi.timeout will be true)

- Although AWSGI applications will *not* be directly compatible with
  WSGI servers, middleware will allow them to run as standard WSGI
  apps (with all I/O waits returning immediately).

- AWSGI servers will not support unmodified WSGI applications.  There
  are several reasons for this:

  - If the app does blocking I/O, it will block the entire server.

  - Calls to the read() method of wsgi.input may fail with
    EWOULDBLOCK, which an app expecting synchronous I/O probably won't
    be prepared to deal with.

  - The readline(), readlines(), and __iter__() methods of wsgi.input
    can require multiple network I/O operations, which is incompatible
    with asynchronous execution.

  - The write() callable returned by start_response() is inherently
    incompatible with asynchronous execution.

  Because of these issues, this specification aims for one-way
  compatibility between AWSGI and WSGI (i.e. the ability to run AWSGI
  apps on WSGI servers via middleware, but not vice versa).
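
To illustrate the EWOULDBLOCK reason listed above, the short Python 3
snippet below reads from a non-blocking socket that has no data
waiting.  A raw socket stands in for a hypothetical wsgi.input backed
by a non-blocking connection; how a particular server would surface
the error through read() is an assumption, not something this spec
defines.

```python
import errno
import socket

# A connected pair of sockets; nothing is ever sent on `a`.
a, b = socket.socketpair()
b.setblocking(False)

try:
    b.recv(1024)            # no data available yet
    failed_errno = None
except BlockingIOError as exc:
    # Python 3 maps EAGAIN/EWOULDBLOCK to BlockingIOError.
    failed_errno = exc.errno
finally:
    a.close()
    b.close()
```

An application written for synchronous WSGI would normally have no
handler for this exception, which is why the read must be preceded by
a readiness wait.
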