[Python-ideas] Async API: some code to review

Wed Oct 31 11:07:10 CET 2012

> -----Original Message-----
> From: gvanrossum at gmail.com [mailto:gvanrossum at gmail.com] On Behalf
> Of Guido van Rossum
> Sent: 30 October 2012 16:40
> To: Kristján Valur Jónsson
> Cc: Richard Oudkerk; python-ideas at python.org
> Subject: Re: [Python-ideas] Async API: some code to review
> 
> What kind of time savings are we talking about? I imagine that the
> accept() loop I put in tulip/echosvr.py is fast enough in terms of response
> time (latency) -- throughput would seem the more important measure (and I
> have no idea of this yet).
> http://code.google.com/p/tulip/source/browse/echosvr.py#37
> 
To be honest, it isn't serious for applications that serve few connections, but for things like web servers, it becomes important.
Looking at your code:

a) will always "block", causing the main thread (using the term loosely here) to go once through the event loop, possibly doing other housekeeping, even if a connection was already available.  I don't think there is any way to do completion-based IO selectively, i.e. to use immediate mode when possible.  You have to go for one or the other, on Windows at least.  With select-based mechanisms it would be possible to do a select here first and avoid that extra loop, but that might be confusing for the application.  It might be best to stick to one system.
b) will either switch to the new task immediately (possible in stackless) or cause the start of t to wait until the next round of the event loop.
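
As an aside on (a), the "do a select first" idea for select-based mechanisms could look roughly like the sketch below. This is not tulip code: try_accept and accept_or_wait are names made up for illustration, and the sketch assumes the listening socket has been put in non-blocking mode. A real event loop would yield back to the scheduler instead of calling select() directly.

```python
import select


def try_accept(listener):
    """Return (conn, addr) if a connection is already queued, else None.

    Assumes `listener` is a listening socket in non-blocking mode.
    """
    try:
        return listener.accept()
    except BlockingIOError:
        # Nothing pending right now: the caller has to wait.
        return None


def accept_or_wait(listener, timeout=None):
    """Immediate accept if possible, otherwise wait in select() first."""
    result = try_accept(listener)
    if result is not None:
        return result              # fast path: no trip through select()
    readable, _, _ = select.select([listener], [], [], timeout)
    if not readable:
        return None                # timed out, still no connection
    return try_accept(listener)    # may be None if another acceptor won the race
```

The fast path is the point: when a connection is already waiting, the caller gets it without going around the loop at all.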

In this case, t will not start executing until after going around the loop twice.  Only one new connection can be accepted per loop.  Imagine two http requests coming in simultaneously, at t=0.

The sequence of operations will then be this (assuming FIFO scheduling):
main loop runs
accept 1 returns.  task 1 created.  accept 2 scheduled
main loop runs, making task 1 and accept 2 runnable
task 1 runs, does processing, performs send, and blocks
accept 2 returns, task 2 created
main loop runs, making task 2 runnable
task 2 runs, does processing, performs send.

Contributing to latency in this scenario are all the "main loop" runs.  Note that I may misunderstand the way your architecture works; perhaps there is no main loop and everything is interleaved.

An alternative is something like this:
def loop():
    while True:
        conn, addr = yield from listener.accept()
        handler(conn, addr)  # handle the connection inline, no new task

# start a fixed pool of accepting tasks
for i in range(n_handlers):
    t = scheduling.Task(loop)
    t.start()

Here, the events will be different:
main loop runs, accept 1 and accept 2 runnable
accept 1 returns, starting handler, processing, and blocking on send
accept 2 returns, starting handler, processing, and blocking on send

As you see, there is only one initial housekeeping run needed to make both tasklets runnable and ready to run without interruption, giving the lowest possible total latency to the client.
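
The two event sequences above can be reproduced with a toy model. This is only a sketch under the same assumptions as the text (FIFO scheduling, every blocking call costs one trip through the main loop, a newly created task becomes runnable on the next loop run). ToyLoop, spawning_acceptor, inline_acceptor and simulate are invented names and have nothing to do with the real tulip scheduler. It records how many main loop runs have happened when each request starts being processed.

```python
from collections import deque


class ToyLoop:
    """Toy FIFO scheduler: each blocking call costs one trip through run_once()."""

    def __init__(self):
        self.runnable = deque()        # (task, value to send into it)
        self.waiting_accept = deque()  # tasks blocked in accept
        self.new_tasks = []            # spawned, runnable after the next loop run
        self.pending = deque()         # connections sitting in the listen queue
        self.loop_runs = 0
        self.started_at = {}           # connection -> loop_runs when processing began

    def spawn(self, task):
        self.new_tasks.append(task)

    def run_once(self):
        """One 'main loop run': move everything that is ready to the runnable queue."""
        self.loop_runs += 1
        for task in self.new_tasks:
            self.runnable.append((task, None))
        self.new_tasks = []
        while self.waiting_accept and self.pending:
            self.runnable.append((self.waiting_accept.popleft(), self.pending.popleft()))

    def run_tasks(self):
        """Run each runnable task until it blocks again."""
        while self.runnable:
            task, value = self.runnable.popleft()
            try:
                op = task.send(value)
            except StopIteration:
                continue
            if op == "accept":
                self.waiting_accept.append(task)
            # "send": the task stays blocked; send completion is not modelled


def handler(loop, conn):
    loop.started_at[conn] = loop.loop_runs  # processing of the request starts here
    yield "send"                            # write the reply and block on send


def spawning_acceptor(loop):
    """One accept loop that creates a new task per connection."""
    while True:
        conn = yield "accept"
        loop.spawn(handler(loop, conn))


def inline_acceptor(loop):
    """One of several accept loops that run the handler inline."""
    while True:
        conn = yield "accept"
        yield from handler(loop, conn)


def simulate(acceptor, n_acceptors, n_conns=2):
    loop = ToyLoop()
    for _ in range(n_acceptors):
        loop.runnable.append((acceptor(loop), None))
    loop.run_tasks()                            # acceptors are now blocked in accept
    loop.pending.extend(range(1, n_conns + 1))  # all requests arrive at t=0
    while len(loop.started_at) < n_conns:
        loop.run_once()
        loop.run_tasks()
    return loop.started_at


print(simulate(spawning_acceptor, 1))  # {1: 2, 2: 3}
print(simulate(inline_acceptor, 2))    # {1: 1, 2: 1}
```

In this model the single accept loop starts processing the two requests after 2 and 3 main loop runs, while the pool of accept loops starts both after 1.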

In my experience with RPC systems based on this kind of asynchronous python IO, lowering the response time between when user space is made aware of the request and when python actually starts _processing_ it is critical to responsiveness.

Cheers