[Python-ideas] The async API of the future

Sat Oct 20 10:33:07 CEST 2012

Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:

> Guido van Rossum wrote:
>> 
>> I would like people to be able to write fast
>> event handling programs on Windows too, ... But I don't know how
>> tenable that is given the dramatically different style used by IOCP
>> and the need to use native Windows API for all async I/O -- it sounds
>> like we could only do this if the library providing the I/O loop
>> implementation also wrapped all I/O operations, and that may be a bit
>> much.
> 
> That's been bothering me, too. It seems like an interface accommodating the completion-based style will have to be *extremely* fat.

No, not really.  Quite the opposite, in fact.  The way to make the interface thin is to abstract out all the details related to the particulars of the multiplexing I/O underneath everything and the transport functions necessary to read data out of it.

The main interfaces you need are here:

<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.ITransport.html>
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IProtocol.html>
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IConsumer.html>
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.IProducer.html>

which have maybe a dozen methods between them, and could be cleaned up for a standardized version.

The interface required for unifying over completion-based and socket-based is actually much thinner than the interface you get if you start exposing sockets all over the place.
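
To make the "thinner" claim a bit more concrete, here is a minimal sketch of one protocol object being fed by two differently shaped drivers. It is not Twisted code: Collector, drive_by_readiness, drive_by_completion and demo are names made up for this illustration, only the dataReceived/connectionLost method names are borrowed from IProtocol, and the completion side is merely simulated with a list of already-filled buffers rather than real IOCP calls.

```python
import selectors
import socket


class Collector:
    """A protocol in the IProtocol spirit: it sees events, never a socket."""

    def __init__(self):
        self.chunks = []
        self.lost = False

    def dataReceived(self, data):
        self.chunks.append(data)

    def connectionLost(self, reason):
        self.lost = True


def drive_by_readiness(sock, protocol):
    """Readiness style: wait until readable, then do the read ourselves."""
    with selectors.DefaultSelector() as selector:
        selector.register(sock, selectors.EVENT_READ)
        while True:
            selector.select()
            data = sock.recv(4096)
            if not data:
                protocol.connectionLost("end of stream")
                return
            protocol.dataReceived(data)


def drive_by_completion(completed_reads, protocol):
    """Completion style, simulated (an assumption of this sketch): each item
    is a buffer that an earlier read request already filled in; an empty
    buffer means the peer closed the connection."""
    for data in completed_reads:
        if not data:
            protocol.connectionLost("end of stream")
            return
        protocol.dataReceived(data)


def demo():
    # Real sockets for the readiness driver.
    left, right = socket.socketpair()
    left.sendall(b"hello")
    left.close()
    via_readiness = Collector()
    drive_by_readiness(right, via_readiness)
    right.close()

    # Pre-completed buffers for the simulated completion driver.
    via_completion = Collector()
    drive_by_completion([b"hel", b"lo", b""], via_completion)
    return via_readiness, via_completion
```

The protocol class is identical in both cases; only the driver knows whether the bytes came from "you may read now" or "your read has finished".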

But focusing on I/O completion versus readiness notification is, like the triggering-modes discussion, missing the forest for the trees.  IOCP's triggering modes are themselves an interesting example of a pattern, but on their own they are a bit of a red herring.  Another thing you want to abstract over is pipes versus sockets versus files versus UNIX sockets versus UNIX sockets with CMSG extensions versus TLS over TCP versus SCTP versus Bluetooth.  99% of applications do not care: a stream of bytes is a stream of bytes, and you have to turn it into a stream of events in some other, higher-layer protocol.
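
As a rough illustration of "turn a stream of bytes into higher-layer events", here is a small sketch of a line framer. LineFramer and its on_line callback are hypothetical names for this example, not an existing Twisted class; the point is only that nothing in it knows or cares which kind of transport delivered the bytes, or how they were chunked.

```python
class LineFramer:
    """Turns arbitrary byte chunks into one callback per complete line."""

    def __init__(self, on_line, delimiter=b"\r\n"):
        self._on_line = on_line
        self._delimiter = delimiter
        self._buffer = b""

    def dataReceived(self, data):
        # Chunk boundaries are arbitrary, so buffer until a delimiter shows up.
        self._buffer += data
        while True:
            line, found, rest = self._buffer.partition(self._delimiter)
            if not found:
                break
            self._buffer = rest
            self._on_line(line)


def demo():
    lines = []
    framer = LineFramer(lines.append)
    # The delimiter is deliberately split across two chunks.
    for chunk in (b"HELO exa", b"mple\r", b"\nQUI", b"T\r\n", b"partial"):
        framer.dataReceived(chunk)
    return lines
```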

I would really, really encourage everyone interested in this area of design to go read all of twisted.internet.interfaces, familiarize yourselves with the contents there, and make specific comments about those existing interfaces rather than some hypothetical ideal.  Also, the Twisted chapter <http://www.aosabook.org/en/twisted.html> in "The Architecture of Open Source Applications" explains some of Twisted's architectural decisions.  If you're going to re-invent the wheel, it behooves you to at least check whether the existing ones are round.  I'm happy to answer questions about specifics of how things are implemented or whether the Twisted APIs have certain limitations, and to fill in gaps in the documentation.  There is certainly an embarrassing profusion of those, especially in these decade-old, core APIs that haven't changed since we started requiring docstrings; if you find any, please file bugs and I will try to do what I can to get them fixed.  But I'd rather not have to keep re-describing the basics.

> That's not just a burden for anyone implementing the interface, it's a problem for any library wanting to *wrap* it as well.

I really have no idea what you mean by this.  Writing and wrapping ITransport and IProtocol is pretty straightforward.  With the enhanced interfaces I'm working on in <http://tm.tl/1956>, it's almost automatic.

<http://twistedmatrix.com/trac/browser/trunk/twisted/protocols/tls.py>, for example, is a complete bi-directional proxying of all interfaces related to transports (even TCP transport specific APIs, not just the core interfaces above), in addition to implementing all the glue necessary for TLS, with thorough docstrings and comments, all in just over 600 lines.  This also easily deals with the fact that, for example, sometimes in order to issue a read-ready notification, TLS needs to write some bytes; and in order to issue a write-ready notification, TLS sometimes needs to read some bytes.
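
For a feel of what "wrapping" means structurally, here is a deliberately tiny sketch, with the loud caveat that it is not TLS and not the code in tls.py. XorWrapper, MemoryTransport and Recorder are invented for this example, and the XOR step is just a stand-in for a real transformation. The wrapper looks like a protocol to the transport below it and like a transport to the protocol above it.

```python
class MemoryTransport:
    """Lowest layer for the demo: records bytes instead of sending them."""

    def __init__(self):
        self.written = []
        self.disconnecting = False

    def write(self, data):
        self.written.append(data)

    def loseConnection(self):
        self.disconnecting = True


class Recorder:
    """Application protocol: remembers its transport and what it received."""

    def __init__(self):
        self.transport = None
        self.received = []

    def makeConnection(self, transport):
        self.transport = transport

    def dataReceived(self, data):
        self.received.append(data)

    def connectionLost(self, reason):
        self.transport = None


class XorWrapper:
    """A protocol from below, a transport from above (toy stand-in, NOT TLS)."""

    def __init__(self, wrapped_protocol, key=0x5A):
        self._wrapped = wrapped_protocol
        self._key = key
        self.transport = None

    def _xor(self, data):
        return bytes(byte ^ self._key for byte in data)

    # Protocol side: called by the lower transport's driver.
    def makeConnection(self, transport):
        self.transport = transport
        self._wrapped.makeConnection(self)

    def dataReceived(self, data):
        self._wrapped.dataReceived(self._xor(data))

    def connectionLost(self, reason):
        self._wrapped.connectionLost(reason)

    # Transport side: called by the wrapped protocol.
    def write(self, data):
        self.transport.write(self._xor(data))

    def loseConnection(self):
        self.transport.loseConnection()
```

The application protocol never learns that anything sits between it and the wire, which is why the same trick stacks.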

> For example, to maintain separation between the async layer and the generator layer, we will probably want to have an AsyncSocket object in the async layer, and a separate GeneratorSocket in the generator layer that wraps an AsyncSocket.

Yes, generator scheduling and async I/O are really different layers, as I explained in a previous email.  This is a good thing as it provides a common basis for developing things in different styles as appropriate to different problem domains.  If you smash them together you're just conflating responsibilities and requiring abstraction inversions, not making it easier to implement anything.

> If the AsyncSocket needs to provide methods for all the possible I/O operations that one might want to perform on a socket, then GeneratorSocket needs to provide its own versions of all those methods as well.

GeneratorSocket does not even need to exist in the first implementation of this kind of a spec, let alone provide all possible operations.  Python managed to survive without "all the possible I/O operations that one might want to perform on a socket" for well over a decade; sendmsg and recvmsg didn't arrive until 3.3: <http://bugs.python.org/issue6560>.

Plus, GeneratorSocket isn't that hard to write.  You just need a method for each I/O operation that returns a Future (by which I mean Deferred, of course :)) and then fires that Future from the relevant I/O operation's callback.
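
Something like the following sketch is what I have in mind. Everything here is hypothetical: FakeAsyncSocket stands in for whatever callback-style socket the async layer ends up exposing, its deliver() method is just a test hook that pretends the event loop completed a read, and concurrent.futures.Future is used only because it is in the standard library (a Deferred would do the same job).

```python
from concurrent.futures import Future


class FakeAsyncSocket:
    """Hypothetical callback-style socket; no real I/O happens here."""

    def __init__(self):
        self._read_callbacks = []
        self.sent = []

    def recv(self, callback):
        self._read_callbacks.append(callback)

    def send(self, data, callback):
        self.sent.append(data)
        callback(len(data))  # pretend the write completed immediately

    def deliver(self, data):
        """Test hook: pretend the event loop finished the oldest read."""
        self._read_callbacks.pop(0)(data)


class GeneratorSocket:
    """One method per I/O operation; each returns a Future that the
    underlying operation's callback fires."""

    def __init__(self, async_socket):
        self._sock = async_socket

    def recv(self):
        future = Future()
        self._sock.recv(future.set_result)
        return future

    def send(self, data):
        future = Future()
        self._sock.send(data, future.set_result)
        return future


def echo_once(gsock):
    """Generator-style user code: read once, write the uppercased reply."""
    data = yield gsock.recv()
    sent = yield gsock.send(data.upper())
    return sent


def run(generator):
    """Tiny trampoline: resume the generator whenever its Future fires."""
    finished = Future()

    def step(value=None):
        try:
            future = generator.send(value)
        except StopIteration as stop:
            finished.set_result(stop.value)
            return
        future.add_done_callback(lambda done: step(done.result()))

    step()
    return finished
```

Note that GeneratorSocket contains no I/O logic at all; it only translates "call me back" into "here is a Future".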

> Multiply that by the number of different kinds of I/O objects (files, sockets, message queues, etc. -- there seem to be quite a lot of them on Windows) and that's a *lot* of stuff to be wrapped.

The common operations here are by far the most important.  But, yes, if you want to have support for all the wacky things that Windows provides, you have to write wrappers for all the wacky things you need to call.

>> Finally, there should also be some minimal interface so that multiple I/O loops can interact -- at least in the case where one I/O loop belongs to a GUI library.
> 
> That's another thing that worries me. With a ready-based event loop, this is fairly straightforward. If you can get hold of the file descriptor or handle that the GUI is ultimately reading its input from, all you need to do is add it as an event source to your main loop, and when it's ready, tell the GUI event loop to run itself once.

No.  That is how X windows and ncurses work, not how GUIs in general work.

On Windows, the GUI is a message pump on a thread (and possibly a collection thereof); there's no discrete event which represents it and no completion port or event object that gets its I/O, but at the low level, you're still expected to write your own loop and call something that blocks waiting for GUI input.  (This actually causes some problems, see below.)

On Mac OS X, the GUI is an event loop of its own; you have to integrate with CFRunLoop via CFRunLoopRun (or something that eventually calls it, like NSApplicationMain), not write your own loop that calls a blocking function.  You don't get to invent your own thing with kqueue or select() and then explicitly observe "the GUI" as some individual discrete event; there's nothing to read, the GUI just calls directly into your application.  Underneath there's some mach messages and stuff, but I honestly couldn't tell you how that all works; it's not necessary to understand.  (And in fact "the GUI" is not actually just the GUI, but a whole host of notifications from other processes, the display, the sound device, and so on, that you can register for.  The documentation for NSNotificationCenter is illuminating.)

(I don't know anything about Android.  Can anyone comment authoritatively about that?)

This really doesn't have anything to do with the readiness-based-ness of the API; rather, there are more things in heaven and earth (and kernel interrupt handlers) than are dreamt of in your philosophy (and file descriptor dispatch functions).

Once again: the important thing is to separate out these fiddly low layers for each platform and expose the high layer that most Python programmers care about ("incoming connection", "here are some bytes", "your connection was dropped") in such a way that an implementation built on any one of these low-level things can be plugged in underneath it.

> But you can't do that with a completion-based main loop, because the actual reading of the input needs to be done in a different way, and that's usually buried somewhere deep in the GUI library where you can't easily change it.

Indeed not, but this phrasing makes it sound like "completion-based" main loops are some weird obscure thing.  This is not an edge-case problem you can sweep under the rug with the assumption that somebody will be able to wrestle a file descriptor out of the GUI somehow or emulate it eventually.  The GUI platforms that basically everyone in the world uses don't observe file descriptors for their input events.

>> It seems this is a solved problem (as well solved as you can hope for) to Twisted, so we should just adopt their approach.
> 
> Do they actually do it for an IOCP-based main loop on Windows?

No, but it's hypothetically possible.

For GUIs, we have win32eventreactor, which can't support as many sockets, but runs the message pump, which causes the GUI to run (for most GUI toolkits).  Several low-level Windows applications have used this to good effect.  (Although I don't know of any that are open source, unfortunately.)

There's also the fact that most people writing Python GUIs want to use a cross-platform library, so most of the demand for GUI sockets on Windows has been for integrating with Wx, Qt, or GTK, and we have support for all of those separately from the IOCP stuff.  It's usually possible to call the wrapped socket functions in those libraries, but more difficult to reach below the GUI library and dispatch to it one Windows message-pump message at a time.

> If so, I'd be interested to know how.

It's definitely possible to get a GUI to cooperate nicely with IOCP, but it's a bit challenging to figure out how.  I had a very long, unpleasant conversation with the IOCP reactor's maintainer in which we pooled our various experiences with the frankly sadistic IOCP API until we remembered enough about the way IOCP actually works to be able to explain it, so I hope you enjoy this :-).

Right now Twisted's IOCP reactor uses the mode of IOCP where it passes NULL for both the lpCompletionRoutine parameter and the lpOverlapped->hEvent member of every overlapped call (e.g. WSARecv, WSASend, AcceptEx).  Later, the reactor thread blocks on GetQueuedCompletionStatus, which waits only on the associated completion port's scheduled I/O and therefore cannot notice when messages arrive from the message pump.

As I mentioned above, the message pump is a discrete source of events and can't be waited upon as a C runtime "file descriptor", WSA socket, IOCP completion or thread event.  Also, you can't translate it into one of those sources, because the message pump is associated with a particular thread; you can't call a function in a different thread to call PostQueuedCompletionStatus.

There are two ways to fix this; there already is a lengthy and confusing digression in comments in the implementation explaining parts of this.

The first, and probably easiest, option is simply to create an event with CreateEvent(bManualReset=False), fill out the hEvent member of all queued Event objects with that same event, and pass that event handle to MsgWaitForMultipleObjectsEx.  Then, if the message queue wakes up the thread, you dispatch messages the standard way (doing what win32eventreactor already does: see win32gui.PumpWaitingMessages).  If, instead, the event is signaled, you call GetQueuedCompletionStatus as the IOCP reactor already does, and it will always return immediately.
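
The control flow of this first option can be sketched in portable Python, with the heavy caveat that this is a simulation and makes no Win32 calls at all. FakeThreadState, its wait() method and run_once are invented for the illustration; wait() merely plays the role of MsgWaitForMultipleObjectsEx, and draining every queued completion after one wakeup is an assumption of this sketch about how a shared auto-reset event would be handled.

```python
from collections import deque

MESSAGE = "message"
IO_EVENT = "io-event"


class FakeThreadState:
    """Stand-in for what the OS tracks on behalf of the reactor thread."""

    def __init__(self):
        self.messages = deque()      # pending window messages
        self.completions = deque()   # completion port queue
        self.event_signaled = False  # the one shared auto-reset event

    def post_message(self, message):
        self.messages.append(message)

    def complete_io(self, result):
        self.completions.append(result)
        self.event_signaled = True

    def wait(self):
        """Plays the role of the combined wait: report why we woke up."""
        if self.event_signaled:
            self.event_signaled = False  # auto-reset semantics
            return IO_EVENT
        if self.messages:
            return MESSAGE
        return None  # a real wait would block here instead


def run_once(state, on_message, on_completion):
    """One iteration of the merged GUI-plus-I/O loop."""
    reason = state.wait()
    if reason == MESSAGE:
        while state.messages:
            on_message(state.messages.popleft())
    elif reason == IO_EVENT:
        # Assumption: one signal may cover several completions, so drain them.
        while state.completions:
            on_completion(state.completions.popleft())
    return reason
```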

The second (and probably higher-performance) option is to fill out the lpCompletionRoutine parameter to all I/O functions, and effectively have the reactor's "loop" integrated into the implicit asynchronous procedure dispatch of any alertable function.  In the reactor's core, the alertable function would have to be MsgWaitForMultipleObjectsEx in order to also wait on events added with addEvent().  Apart from those external events, the core could just call WaitForSingleObjectEx() and it would be roughly the same, as long as the thread is put into an alertable state.  This option is likely higher performance because it removes the function-call and iteration overhead: you effectively go straight from the kernel to the I/O handling function.  In addition to being slightly trickier, though, there's also the risk that someone else might put the thread into an alertable state, so that the I/O completion runs in a surprising stack frame.
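
The "surprising stack frame" hazard is easier to see in a toy model. Again, this is a pure-Python simulation with invented names (FakeThread, io_completed, alertable_wait); it only mimics the rule that queued completion routines run whenever the owning thread enters an alertable wait, whoever asked for that wait.

```python
from collections import deque


class FakeThread:
    """Stand-in for a thread's queue of pending completion routines."""

    def __init__(self):
        self._pending = deque()

    def io_completed(self, routine, *args):
        # The "kernel" queues the routine; it does not run it yet.
        self._pending.append((routine, args))

    def alertable_wait(self):
        """Plays the role of an alertable wait: run everything queued."""
        count = 0
        while self._pending:
            routine, args = self._pending.popleft()
            routine(*args)
            count += 1
        return count


def demo():
    thread = FakeThread()
    log = []

    def on_read(data):
        log.append(("read", data))

    def third_party_helper():
        # Unrelated code that happens to do its own alertable wait.
        log.append("helper: start")
        thread.alertable_wait()
        log.append("helper: end")

    thread.io_completed(on_read, b"abc")
    log.append("queued")  # nothing has run yet
    third_party_helper()  # the handler fires inside the helper
    ran_in_reactor = thread.alertable_wait()  # nothing left for the reactor
    return log, ran_in_reactor
```

In the demo the read handler runs in the middle of third_party_helper, and the reactor's own wait finds nothing left to dispatch.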

If you want to integrate this with a modern .NET application (i.e. Windows platform-specific stuff), I think this is the relevant document: <http://msdn.microsoft.com/en-us/library/aa348549.aspx>; I am not sure how you'd integrate it with Wx/Tk/Qt/GTK+.

-glyph
