[Web-SIG] PEP 444 / WSGI 2 Async
Alice Bevan–McGregor
alice at gothcandy.com
Thu Jan 6 05:01:47 CET 2011
[Apologies if this is a double- or triple-post; I seem to be having a
stupid number of connectivity problems today.]
Howdy!
Apologies for the delay in responding; it’s been a hectic start to the
new year. :)
On 2011-01-03, at 6:22 AM, Timothy Farrell wrote:
> You don't know me but I'm the author of the Rocket Web Server
> (http://pypi.python.org/pypi/rocket) and have, in the past, been
> involved in the web2py community. Like you, I'm interested in seeing
> web development come to Python3. I'm glad you're taking up WSGI2. I
> have a feature-request for it that perhaps we could work in.
Of course; in fact, I hope you don’t mind that I’ve re-posted this
response to the web-sig mailing list. Async needs significantly
broader discussion. I would appreciate it if you could reply to the
mailing list thread.
> I would like to see futures added as a server option. This way,
> controllers could dispatch emails (or run some other blocking or
> long-running task) that would not block the web-response. WSGI2
> Servers could provide a futures executor as environ['wsgi.executor']
> that the app could use to offload processes that need not complete
> before the web-request is served to the client.
E-mail dispatch is one of the things I solved a long time ago with
TurboMail; it uses a dedicated thread pool and can deliver > 100 unique
messages per second (more if you use BCC) in the default configuration,
so I don’t really see that particular use case as one that benefits
from the futures module. Updating TurboMail to use futures would be an
interesting exercise, though. ;)
I was thinking of exposing the executor as
environ['wsgi.async.executor'], with 'wsgi.async' being a boolean value
indicating support.
> What should the server do with the future instances?
The executor returns future instances when running executor.submit/map;
the application never generates its own Future instances. The
application may, however, use whatever executor it sees fit; it can,
for example, have one thread pool executor and one process pool, used
for different tasks.
The server itself can utilize any combination of single-threaded
IO-based async (see further on in this message), and multi-threaded or
multi-process management of WSGI requests. Resuming suspended
applications (ones pending future results) is an implementation detail
of the server.
> Should future.add_done_callback() be allowed? I'm not sure how
> practical/reliable this would be. (By the time the callback is called,
> the calling environment could be gone. Is this undefined behavior?)
If you wrap your callback in a partial(my_callback, environ) the
environ will survive the end of the request/response cycle (due to the
incremented reference count), and should be allowed to enable
intelligent behaviour in the callbacks. (Obviously the callbacks will
not be able to deliver a response to the client at the time they are
called; the body iterator can, however, wait for the future instance to
complete and/or timeout.)
A little bit later in this message I describe a better solution than
the application registering its own callbacks.
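The partial() pattern above looks roughly like this (a minimal,
self-contained sketch; my_callback and the environ contents are made
up for illustration):

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Binding environ into the done-callback with partial() keeps the dict
# alive (via its reference count) past the end of the request cycle.
results = []

def my_callback(environ, future):
    # Runs after the response may already have been sent; it can't
    # reach the client, but it can log, clean up, and so on.
    results.append((environ['PATH_INFO'], future.result()))

executor = ThreadPoolExecutor(max_workers=2)
environ = {'PATH_INFO': '/send-mail'}

future = executor.submit(lambda: 'delivered')
future.add_done_callback(partial(my_callback, environ))
executor.shutdown(wait=True)  # ensure the worker (and callback) finished

assert results == [('/send-mail', 'delivered')]
```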
> Do we need to also specify what type of executor is provided (threaded
> vs. separate process)?
I think that’s an application-specific configuration issue, not really
the concern of the PEP.
> Do you have any thoughts about this?
I believe that intelligent servers need some way to ‘pause’ a WSGI
worker rather than relying on the worker executing in a thread and
blocking while waiting for the return value of a future. Using
generator syntax (yield) with the following rules is my initial idea:
* The application may yield None. This is a polite way to have the
async reactor (in the WSGI server/gateway) reschedule the worker for
the next reactor cycle. Useful as a hint that “I’m about to do
something that may take a moment”, allowing other workers to get a
chance to perform work. (Cooperative multi-tasking on single-threaded
async servers.)
* The application must yield one 3-tuple WSGI response, and must not
yield additional data afterward. This is usually the last thing the
WSGI application would do, with possible cleanup code afterward
(before falling off the bottom / raising StopIteration / returning
None).
* The application may yield Future instances returned by
environ['wsgi.executor'].submit/map; the worker will then be paused
pending execution of the future; the return value of the future will be
returned from the yield statement. Exceptions raised by the future
will be re-raised from the yield statement and can thus be captured in
a natural way. E.g.:

    try:
        complex_value = yield environ['wsgi.executor'].submit(long_running)
    except Exception:
        pass  # handle exceptions generated from within long_running
Similar rules apply to the response body iterator: it yields
bytestrings, may yield unicode strings where native strings are unicode
strings, and may yield Future instances which will pause the body
iterator as per the application callable.
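To make the yield rules concrete, here is a toy single-threaded
trampoline driving such an application. This is only a sketch of how a
gateway *might* implement the rules above; a real reactor would never
block on future.result(), and all names are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor, Future

executor = ThreadPoolExecutor(max_workers=2)

def application(environ):
    yield None  # polite "reschedule me" hint to the reactor
    try:
        value = yield executor.submit(lambda: 21 * 2)
    except Exception:
        value = 'error'
    yield ('200 OK', [('Content-Type', 'text/plain')], [str(value).encode()])

def run(app, environ):
    gen = app(environ)
    send, throw = None, None
    while True:
        try:
            if throw is not None:
                yielded, throw = gen.throw(throw), None
            else:
                yielded = gen.send(send)
            send = None
        except StopIteration:
            raise RuntimeError('application ended without yielding a response')
        if yielded is None:
            continue  # rule 1: reschedule on the next reactor cycle
        if isinstance(yielded, Future):
            try:
                send = yielded.result()  # a real reactor would not block here
            except Exception as exc:
                throw = exc  # rule 3: re-raise inside the generator
            continue
        gen.close()
        return yielded  # rule 2: the 3-tuple response

status, headers, body = run(application, {})
assert status == '200 OK'
assert body == [b'42']
```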
Servers must:
* Allow configuration of the future implementation for options like
threading / processes.
* Allow developers to override the executor completely.
* Provide additional attributes on wsgi.input: async_ prefixed versions
of the read methods, which are factories returning server-specific
Future instances. (Allowing a single-threaded async server to handle
socket IO intelligently with select/epoll/etc.)
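The async_ factory idea might look like the following. This sketch
uses a thread pool as the fallback strategy; a single-threaded event
loop server would instead resolve the future from its select/epoll
loop. The class and key names are made up for illustration:

```python
from concurrent.futures import ThreadPoolExecutor
import io

class AsyncInput:
    """Hypothetical wsgi.input wrapper providing async_ read factories."""

    def __init__(self, stream, executor):
        self._stream = stream
        self._executor = executor

    def read(self, size=-1):
        # The plain, blocking read required of wsgi.input.
        return self._stream.read(size)

    def async_read(self, size=-1):
        # Factory returning a server-specific Future instead of blocking
        # the worker; here it simply offloads the read to a thread pool.
        return self._executor.submit(self._stream.read, size)

pool = ThreadPoolExecutor(max_workers=1)
wsgi_input = AsyncInput(io.BytesIO(b'hello world'), pool)

future = wsgi_input.async_read(5)
assert future.result() == b'hello'
```

An application generator could then `yield wsgi_input.async_read(n)`
and receive the bytes back from the yield expression.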
To the libraries you use, futures make async pretty much transparent.
For example, libraries (such as a DB layer) must not create their own
Future objects, but must instead utilize an executor passed to them
explicitly by the application.
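A minimal sketch of that injection rule, with a hypothetical UserStore
standing in for a DB layer (all names invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

class UserStore:
    """Toy DB layer: never constructs Futures itself, only submits
    work to whatever executor the application hands it."""

    def __init__(self, executor):
        self._executor = executor  # injected by the application

    def fetch(self, user_id):
        # Returns a Future produced by the application's executor.
        return self._executor.submit(self._query, user_id)

    def _query(self, user_id):
        # Stand-in for a real (blocking) database round trip.
        return {'id': user_id, 'name': 'alice'}

executor = ThreadPoolExecutor(max_workers=2)
store = UserStore(executor)
assert store.fetch(1).result() == {'id': 1, 'name': 'alice'}
```

An application generator could yield store.fetch(uid) directly and get
the row back from the yield expression, per the rules above.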
My ideas thus far,
— Alice.
P.s. a number of these ideas (wsgi.executor, wsgi.async, some of the
yield syntax described above) have been soundly argued against by a
co-conspirator over IRC. I’ll re-read my IRC logs and reply with those
considerations in mind (and transcribed logs) shortly.
P.p.s. my kernel panicked while I was translating my rewrite into ReST;
I'll re-do the conversion tonight or tomorrow morning and submit it
downstream ASAP.