[Python-ideas] The async API of the future

Fri Oct 19 18:05:05 CEST 2012

Work priorities don't allow me to spend another day replying in detail
to the various emails on this topic, but I am still keeping up
reading!

I have read Greg's response to my comparison between
Future+yield-based coroutines and his yield-from-based, Future-free
coroutines, and after having written a small prototype, I am now
pretty much convinced that Greg's way is superior. This doesn't mean
you can't use generators or yield-from for other purposes! It's just
that *if* you are writing a coroutine for use with a certain schedule,
you must use yield and yield-from in accordance to the scheduler's
rules. However, code you call can still use yield and yield-from for
iteration, and you can still use for-loops. In particular, if f is a
coroutine, it can still write "for x in g(): ..." where g is a
generator meant to be an iterator. However if g were instead a
coroutine, f should call it using "yield from g()", and f and g should
agree on the interface of their scheduler.

As to other topics, my current feeling is that we should try to
separately develop requirements and prototype implementations of the
I/O loop of the future, and to figure the loosest possible coupling
between that and a coroutine scheduler (or any other type of
scheduler). In particular, I think the I/O loop should not assume the
event handlers are implemented using coroutines -- but if someone
wants to write an awesome coroutine scheduler, they should be able to
delegate all their I/O waiting needs to the I/O loop with very little
trouble.

To me, this means that the I/O loop probably should use "plain"
callback functions (i.e., not Futures, Deferreds or coroutines). We
should also standardize the interface to the I/O loop so that 3rd
parties can plug in their own I/O loop -- I don't see an end to the
debate whether the best C library for event handling is libevent,
libev or libuv.

While the focus of the I/O loop should be on single-threaded event
handling, some standard interface should exist so that you can run
certain code in a separate thread and wait for its completion -- I've
found this handy when calling socket.getaddrinfo(), which may block.
(Apparently async DNS lookups are really hard -- I read some
complaints about libevent's DNS lookups, and IIUC many Firefox
lockups are due to this.) But there may be other uses for this too.

An issue in the design of the I/O loop is the strain between a
ready-based and completion-based design. The typical Unix design
(whether based on select or any of the poll variants) is usually
ready-based; but on Windows, the only way to get high performance is
to base it on IOCP, which is completion-based (i.e. you start a
specific async operation, like writing N bytes, and the I/O loop tells
you when it is done). I would like people to be able to write fast
event handling programs on Windows too, and ideally the only change
would be the implementation of the I/O loop. But I don't know how
tenable that is given the dramatically different style used by IOCP
and the need to use native Windows API for all async I/O -- it sounds
like we could only do this if the library providing the I/O loop
implementation also wrapped all I/O operations, andthat may be a bit
much.

Finally, there should also be some minimal interface so that multiple
I/O loops can interact -- at least in the case where one I/O loop
belongs to a GUI library. It seems this is a solved problem (as well
solved as you can hope for) to Twisted, so we should just adopt their
approach.

-- 
--Guido van Rossum (python.org/~guido)