[Web-SIG] Server-side async API implementation sketches
Alice Bevan–McGregor
alice at gothcandy.com
Sun Jan 9 12:36:14 CET 2011
On 2011-01-08 19:34:41 -0800, P.J. Eby said:
> At 04:40 AM 1/9/2011 +0200, Alex Grönholm wrote:
>> 09.01.2011 04:15, Alice Bevan–McGregor wrote:
>>> I hope that clearly identifies my idea on the subject. Since async
>>> servers will /already/ be implementing their own executors, I don't
>>> see this as too crazy.
>> -1 on this. Those executors are meant for executing code in a thread
>> pool. Mandating a magical socket operation filter here would
>> considerably complicate server implementation.
>
> Actually, the *reverse* is true. If you do it the way Alice proposes,
> my sketches don't get any more complex, because the filtering goes in
> the executor facade or submit function.
Indeed; the executor is what then adds the file descriptor to the
underlying server async reactor (select/epoll/kqueue/other). In the
case of the Marrow server, this would utilize a reactor callback (some
might say "deferred") to update the Future instance with the data,
setting completion status, executing callbacks, etc. One might even be
able to use a threading.Event (or whatever is the opposite of a lock)
to wake up blocking .result() calls, even if not multi-threaded
(greenthreads, etc.).
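As a minimal sketch of that flow (not Marrow's actual code), the following uses the stdlib selectors module as a stand-in for the server reactor and concurrent.futures.Future as the future type. The names submit_read and run_once are hypothetical, and the example assumes Python 3:

```python
import selectors
import socket
from concurrent.futures import Future

# Stand-in for the server's reactor (select/epoll/kqueue under the hood).
selector = selectors.DefaultSelector()


def submit_read(sock, nbytes):
    """Hypothetical submit: register the descriptor and return a Future."""
    future = Future()

    def on_readable():
        # Reactor callback: stop watching, then complete the Future.
        selector.unregister(sock)
        try:
            future.set_result(sock.recv(nbytes))
        except OSError as exc:
            future.set_exception(exc)

    selector.register(sock, selectors.EVENT_READ, on_readable)
    return future


def run_once(timeout=1.0):
    """One reactor pass: call the callback of every ready descriptor."""
    for key, _events in selector.select(timeout):
        key.data()


a, b = socket.socketpair()
future = submit_read(a, 4096)
b.sendall(b"hello")
run_once()                 # the reactor gets its turn and fills the Future
print(future.result())     # does not block: the result is already set
a.close()
b.close()
```

Calling future.result() before run_once() here would hang in a single-threaded process, since nothing else would ever drive the reactor.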
Of course, adding the file descriptor to a pure async reactor then
.result() blocking on it from your application would result in a
deadlock; the .result() would never complete as the reactor would never
get a chance to perform the pending request. (This is why Marrow
requires threading be enabled globally before adding an executor to the
environment; this requires rather explicit documentation.) This
problem is solved completely by yielding the future instance (pausing
the application) to let the reactor do its thing. (Yielding the future
becomes a replacement for the blocking behaviour of future.result().)
Effectively what I propose adds emulation of threading on top of async
by mutating an Executor. (The Executor would be a mixed
threading+async executor.)
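To illustrate the yield-instead-of-block idea, here is a rough trampoline sketch. It assumes a generator-style application that yields either futures or body chunks; run, pump, and fake_read are hypothetical names, and the "reactor" is faked with a plain list:

```python
from concurrent.futures import Future

pending = []  # (future, value) pairs the fake reactor will complete


def fake_read():
    """Hypothetical async read: returns a Future completed later by pump()."""
    future = Future()
    pending.append((future, b"payload"))
    return future


def pump():
    """One step of the fake reactor: complete the oldest pending Future."""
    future, value = pending.pop(0)
    future.set_result(value)


def run(app, pump):
    """Drive the app; a yielded Future pauses it until the reactor is done."""
    body = []
    gen = app()
    to_send = None
    while True:
        try:
            item = gen.send(to_send)
        except StopIteration:
            return body
        if isinstance(item, Future):
            while not item.done():
                pump()           # the reactor runs while the app is paused
            to_send = item       # hand the Future itself back, not its result
        else:
            body.append(item)
            to_send = None


def app():
    data = (yield fake_read()).result()
    yield data.upper()


body = run(app, pump)
print(body)  # [b'PAYLOAD']
```

Because the application never calls .result() on an unfinished future, the deadlock described above cannot occur in this arrangement.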
I suggest bubbling a future back up the yield stack instead of the
actual result to allow the application (or middleware, or whatever
happened to yield the future) to capture exceptions generated by the
future'd request. Bubbling the future instance avoids excessive
exception handling cruft in each middleware layer; and I see no real
issue with this. AFAIK, you can use a shorthand (possibly wrapped in a
try: block) if all you care about is the result:
    data = (yield my_future).result()
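A small sketch of the try: variant, assuming Python 3 and concurrent.futures.Future. The driver code at the bottom stands in for whatever server or middleware resumes the generator; the app function and the error message are made up for illustration:

```python
from concurrent.futures import Future


def app(my_future):
    try:
        # .result() re-raises any exception stored on the Future.
        data = (yield my_future).result()
    except IOError as exc:
        data = "recovered from: %s" % exc
    yield data


# A Future whose request failed somewhere below the application.
failed = Future()
failed.set_exception(IOError("client went away"))

gen = app(failed)
yielded = next(gen)          # the app bubbles the Future up
output = gen.send(yielded)   # the layer above hands the same Future back
print(output)                # recovered from: client went away
```

Only the layer that actually calls .result() has to deal with the exception; anything in between just passes the future along.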
> Truthfully, I don't really see the point of exposing the map() method
> (which is the only other executor method we'd expose), so it probably
> makes more sense to just offer a 'wsgi.submit' key... which can be a
> function as follows: [snip]
True; the executor itself could easily be hidden behind the filter. In
a multi-threaded environment, however, the map call poses no problem,
and can be quite useful. (E.g. with one of my use cases for inclusion
of an executor in the environment: image scaling.)
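For the multi-threaded case, a sketch of what map buys you, using ThreadPoolExecutor from concurrent.futures (Python 3 assumed). The scaled_size function is a hypothetical stand-in that only computes thumbnail dimensions; real image scaling would call an imaging library instead:

```python
from concurrent.futures import ThreadPoolExecutor


def scaled_size(size, max_edge=100):
    """Shrink (width, height) so the longest edge is at most max_edge."""
    width, height = size
    longest = max(width, height)
    if longest <= max_edge:
        return size
    return (width * max_edge // longest, height * max_edge // longest)


sizes = [(400, 200), (50, 80), (1000, 1000)]

with ThreadPoolExecutor(max_workers=2) as executor:
    # map() spreads the calls over the pool and keeps the input order.
    thumbnails = list(executor.map(scaled_size, sizes))

print(thumbnails)  # [(100, 50), (50, 80), (100, 100)]
```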
> Granted, this might be a rather long function. However, since it's
> essentially an optimization, a given server can decide how many
> functions can be shortcut in this way. The spec may wish to offer a
> guarantee or recommendation for specific methods of certain
> stdlib-provided types (sockets in particular) and wsgi.input.
+1
> Personally, I do think it might be *better* to offer extended
> operations on wsgi.input that could be used via yield, e.g. "yield
> input.nb_read()". But of course then the trampoline code has
> to recognize those values instead of futures.
Because wsgi.input is provided by the server, and the executor is
provided by the server, is there a reason why these extended functions
couldn't return... futures? :)
> Note, too, that this complexity also only affects servers that want to
> offer a truly async API. A synchronous server has no reason to pay
> particular attention to what's in a future, since it can't offer any
> performance improvement.
I feel a sync server and async server should provide the same API for
accessing the input. I.e. the application/middleware must be agnostic
to the server in this regard. This is why a little bit of magic goes a
long way. The following code would work on any WSGI2 stack that offers
an executor (sync, async, or provided by middleware):
    data = (yield env['wsgi.submit'](env['wsgi.input'].read, 4096)).result()
In a sync server, the blocking read would execute in another thread.
In an async one, appropriate actions would be taken to request a socket
read from the client. Both cases pause the application pending the
result. (If you don't immediately yield the future, the behaviour
across servers is still the same!)
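A sketch of the sync-server side of this, assuming Python 3 and that 'wsgi.submit' is simply the submit method of a thread pool. The env contents and the driver lines are illustrative only; io.BytesIO stands in for the real wsgi.input:

```python
import io
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=4)

# Hypothetical environ: only the two keys used in this example.
env = {
    'wsgi.input': io.BytesIO(b"request body"),
    'wsgi.submit': executor.submit,
}


def app(env):
    data = (yield env['wsgi.submit'](env['wsgi.input'].read, 4096)).result()
    yield data


gen = app(env)
future = next(gen)        # the app yields the Future and is paused
future.result()           # sync server: block until the worker thread is done
body = gen.send(future)   # resume the app with the completed Future
executor.shutdown()
print(body)               # b'request body'
```

An async server would replace the blocking future.result() in the driver with a return to its reactor loop; the application code stays the same.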
> I do think that this sort of API discussion, though, is the most
> dangerous part of trying to do an async spec. That is, I don't expect
> that everyone will spontaneously agree on the exact same API. Alice's
> proposal (simply submitting object methods) has the advantage of
> severely limiting the scope of API discussions. ;-)
Since each async server will either implement or utilize a specific
async framework, each will offer its own "async-supported" feature set.
What I mean is that all servers should make wsgi.input calls
async-able; some would go further and make all socket calls async. Some
might go even further than that and define an API for external
libraries (e.g. DBs) to be truly cooperatively async. I do believe my
solution is flexible enough for the majority of use cases, and where it
isn't (i.e. would block), "abusing" futures in this way will allow an
application to reasonably fake async by delegating blocking calls,
without killing the performance of async servers (which are internally
single-threaded).
I will have to experiment with determining the type of the class
instance a method is bound to from the bound method itself; this is the
crux of the implementation I suggest. If you can't get that, the idea
is pooched for anything but wsgi.input which the server would have a
direct reference to anyway.
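For what it's worth, a bound method exposes its instance as __self__ (Python 2.6+ and Python 3; older Python 2 code used im_self), and this also works for built-in methods such as socket recv. A minimal sketch of the check, with owner_of and is_socket_call as hypothetical helper names:

```python
import io
import socket


def owner_of(method):
    """Return the object a bound method is bound to, or None."""
    return getattr(method, '__self__', None)


def is_socket_call(method):
    """True if the callable is a method of a socket.socket instance."""
    return isinstance(owner_of(method), socket.socket)


a, b = socket.socketpair()
buffer = io.BytesIO(b"")

socket_hit = is_socket_call(a.recv)        # bound to a real socket
buffer_hit = is_socket_call(buffer.read)   # bound to a BytesIO instead
plain_hit = is_socket_call(len)            # not a method of a socket

print(socket_hit, buffer_hit, plain_hit)   # True False False
a.close()
b.close()
```

A submit function could use a check like this to decide between handing the call to the reactor and handing it to a thread pool.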
I hope the clarity of this post didn't degenerate too much over the few
hours I had it open while noodling around.
- Alice.