[Python-ideas] The async API of the future: Reactors
Ben Darnell
ben at bendarnell.com
Mon Oct 15 05:20:33 CEST 2012
On Sun, Oct 14, 2012 at 10:15 AM, Guido van Rossum <guido at python.org> wrote:
>> While it's convenient to have higher-level constructors for various
>> specialized types, I'd like to emphasize that having the low-level
>> interface is important for interoperability. Tornado doesn't know
>> whether the file descriptors are listening sockets, connected sockets,
>> or pipes, so we'd just have to pass in a file descriptor with no other
>> information.
>
> Yeah, the IO object will still need to have a fileno() method.
They also need to be constructible given nothing but a fileno (more on
this later).
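
As a rough illustration of "constructible given nothing but a fileno",
here is a minimal sketch. FDObject and wait_readable are hypothetical
names, not part of Tornado or of any proposed stdlib API, and the sketch
assumes a posix system where select() accepts any kind of descriptor.

```python
import os
import select


class FDObject:
    """Hypothetical IO object built from nothing but a file descriptor."""

    def __init__(self, fd):
        # No knowledge of whether fd is a pipe, a listening socket, etc.
        self._fd = fd

    def fileno(self):
        # select() accepts any object that exposes fileno().
        return self._fd

    def read(self, n):
        return os.read(self._fd, n)

    def write(self, data):
        return os.write(self._fd, data)

    def close(self):
        os.close(self._fd)


def wait_readable(obj, timeout=1.0):
    """Return True if obj becomes readable within timeout seconds."""
    readable, _, _ = select.select([obj], [], [], timeout)
    return bool(readable)
```

The point is only that the wrapper needs no type information: a pipe end
from os.pipe() or a descriptor handed over by another library can be
wrapped the same way.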
>
>>> - In systems like App Engine that don't support async I/O on file
>>> descriptors at all, the constructors for creating I/O objects for disk
>>> files and connection sockets would comply with the interface but fake
>>> out almost everything (just like today, using httplib or httplib2 on
>>> App Engine works by adapting them to a "urlfetch" RPC request).
>>
>> Why would you be allowed to make IO objects for sockets that don't
>> work? I would expect that to just raise an exception. On app engine
>> RPCs would be the only supported async I/O objects (and timers, if
>> those are implemented as magic I/O objects), and they're not
>> implemented in terms of sockets or files.
>
> Here's my use case. Suppose in general one can use async I/O for disk
> files, and it is integrated with the standard (abstract) event loop.
> So someone writes a handy templating library that wants to play nice
> with async apps, so it uses the async I/O idiom to read e.g. the
> template source code. Suppose I want to use that library on App
> Engine. It would be a pain if I had to modify that template-reading
> code to not use the async API. But (given the right async API!) it
> would be pretty simple for the App Engine API to provide a mock
> implementation of the async file reading API that was synchronous
> under the hood. Yes, it would block while waiting for disk, but App
> Engine uses threads anyway so it wouldn't be a problem.
>
> Another, current-day, use case is the httplib interface in the stdlib
> (a fairly fancy HTTP/1.1 client, although it has its flaws). That's
> based on sockets, which App Engine doesn't have; we have a "urlfetch"
> RPC that you give a URL (and more optional stuff) and returns a record
> containing the contents and headers. But again, many useful 3rd party
> libraries use httplib, and they won't work unless we somehow support
> httplib. So we have had to go out of our way to cover most uses of
> httplib. While the app believes it is opening the connection and
> sending the request, we are actually just buffering everything; and
> when the app starts reading from the connection, we make the urlfetch
> RPC and buffer the response, which we then feed back to the app as it
> believes it is reading from the socket. As long as the app doesn't try
> to get the socket's file descriptor and call select() it will work
> fine.
>
> But some libraries *do* call select(), and here our emulation breaks
> down. It would be nicer if the standard way to do async stuff was
> higher level than select(), so that we could offer the emulation at a
> level that would integrate with the event loop -- that way, ideally
> when we have to send the urlfetch RPC we could actually return a
> Future (or whatever), and the task would correctly be suspended, just
> *thinking* it was waiting for the response on a socket, but actually
> waiting for the RPC.
Understood.
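
To make the buffering trick described above concrete, here is a small
sketch of the idea. It is not App Engine's actual code:
BufferedFakeConnection, fake_urlfetch, and the (host, raw_request)
signature are all invented for illustration.

```python
class BufferedFakeConnection:
    """Hypothetical socket-like object backed by a single RPC call."""

    def __init__(self, host, fetch):
        self.host = host
        self._fetch = fetch        # assumed signature: fetch(host, raw_request) -> bytes
        self._request = []         # outgoing bytes are only buffered
        self._response = None      # filled in lazily on the first recv()
        self._pos = 0

    def send(self, data):
        # Nothing goes over the wire yet; the app just thinks it does.
        self._request.append(data)
        return len(data)

    def recv(self, n):
        if self._response is None:
            # First read: issue the RPC and buffer the whole response.
            self._response = self._fetch(self.host, b"".join(self._request))
        chunk = self._response[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk

    def fileno(self):
        # This is where the emulation breaks down: there is nothing to select() on.
        raise OSError("no file descriptor behind this connection")


def fake_urlfetch(host, raw_request):
    """Stand-in for the RPC; returns a canned response."""
    return b"HTTP/1.1 200 OK\r\n\r\nhello"
```

If recv() could return a Future tied to the event loop instead of
blocking inside the fetch call, the missing fileno() would no longer
matter to well-behaved callers.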
>
> Hopefully SSL provides another use case.
In posix-land, SSL isn't that different from regular sockets (using
ssl.wrap_socket from the 2.6+ stdlib). The connection process is a
little more complicated, and it gets hairy if you want to support
renegotiation, but once a connection is established you can select()
on its file descriptor and generally use it just like a regular
socket. On IOCP it's another story, though.
I've finally gotten around to reading up on IOCP and seeing how it's so
different from everything I'm used to (a lot of Twisted's design
decisions at the reactor level make a lot more sense now). Earlier
you had mentioned platform-specific constructors for IOObjects, but
they actually need to be event-loop-specific: on Windows you can use
select() or IOCP, and the IOObjects would be completely different for
each of them (and I do think you need to support both: select() is
kind of a second-class citizen on Windows but is useful due to its
ubiquity).
This means that the event loop needs to be involved in the creation of
these objects, which is why Twisted has connectTCP, listenTCP,
listenUDP, connectSSL, etc. methods on the reactor interface. I think
that in order to handle both IOCP and select-style event loops you'll
need a very broad interface (roughly the union of Twisted's
IReactor{Core, Time, Thread, TCP, UDP, SSL} as a minimum, with
IReactorFDSet and maybe IReactorSocket on posix for compatibility with
existing posixy practices). Basically, an event loop that supports
IOCP (or hopes to support it in the future) will end up looking a lot
like the bottom couple of layers of Twisted (and assuming IOCP is a
requirement I wouldn't want to stray too far from Twisted's designs
here).
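
A very rough sketch of what "the event loop creates the objects" might
look like for the select-style case. SelectLoop and its method names
are made up for illustration (loosely echoing listenTCP and
IReactorFDSet); this is not Twisted's or Tornado's API.

```python
import selectors
import socket


class SelectLoop:
    """Hypothetical select-style loop that constructs its own listeners."""

    def __init__(self):
        self._selector = selectors.DefaultSelector()

    def add_reader(self, fileobj, callback):
        # Low-level, fd-based hook: the posix-only part of the interface.
        self._selector.register(fileobj, selectors.EVENT_READ, callback)

    def remove_reader(self, fileobj):
        self._selector.unregister(fileobj)

    def listen_tcp(self, host, port, on_connection):
        # High-level constructor: the loop decides how the socket is made.
        listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        listener.bind((host, port))
        listener.listen()
        listener.setblocking(False)

        def accept_ready():
            conn, addr = listener.accept()
            on_connection(conn, addr)

        self.add_reader(listener, accept_ready)
        return listener

    def run_once(self, timeout=None):
        # Dispatch the callback stored with each ready file object.
        for key, _events in self._selector.select(timeout):
            key.data()

    def close(self):
        self._selector.close()
```

An IOCP-based loop would presumably offer the same listen_tcp() but
implement it with overlapped accepts, and would have no meaningful
add_reader() at all, which is why callers need to go through the loop
rather than build sockets themselves.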
-Ben