[Python-ideas] async: feedback on EventLoop API

Mon Dec 17 18:47:22 CET 2012

On Mon, Dec 17, 2012 at 3:08 AM, Geert Jansen <geertj at gmail.com> wrote:
> below is some feedback on the EventLoop API as implemented in tulip.

Great feedback! I hope you will focus on PEP 3156
(http://www.python.org/dev/peps/pep-3156/) and Tulip v2 next; Tulip v2
isn't written but is quickly taking shape in the 'tulip' subdirectory
of the Tulip project.

> I am interested in this for an (alternate) dbus interface that I've written
> for Python that supports evented IO. I'm hoping tulip's EventLoop could be an
> abstraction as well as a default implementation that allows me to support
> just one event interface.

Nice. The more interop this event loop offers the better. I don't know
much about dbus, though, so occasionally my responses may not make any
sense -- please be gentle and educate me when my ignorance gets in the
way of understanding.

> I looked at it from two angles:
>
>  1. Does EventLoop provide everything that is needed from a library writer
>     point of view?
>  2. Can EventLoop efficiently expose a subset of the functionality of
>     some of the main event loop implementations out there today
>     (i looked at libuv, libev and Qt).
>
> First some code pointers...
>
>  * https://github.com/geertj/looping - Here i've implemented the EventLoop
>    interface for libuv, libev and Qt. It includes a slightly modified version of
>    tulip's "polling.py" where I've implemented some of the suggestions below.
>    It also adds support for Python 2.6/2.7 as the Python Qt interface (PySide)
>    doesn't support Python 3 yet.

Cool. For me, right now, Python 2 compatibility is a distraction, but
I am not against others adding it. I'll be happy to consider small
tweaks to the PEP to make this easier. Exception: I'm not about to
give up on 'yield from'; but that doesn't seem to be your focus anyway.

>  * https://github.com/geertj/python-dbusx - A Python interface for libdbus that
>    supports evented IO using an EventLoop interface. This module also
>    tests all the different loops from "looping" by doing D-BUS tests with them
>    (looping itself doesn't have tests yet).

I'm actually glad to see there are so many event loop implementations
around. This suggests to me that there's a real demand for this type
of functionality, and I'd be real happy if PEP 3156 and Tulip came to
improve the interop situation (especially for Python 3.3 and beyond).

> My main points of feedback are below:
>
> * It would be nice to have repeatable timers. Repeatable timers are expected
>   for example by libdbus when integrating it with an event loop.
>
>   Without repeatable timers, I could emulate a repeatable timer by using
>   call_later() and adding a new timer every time the timer fires. This would
>   be an inefficient interface though for event loops that natively support
>   repeatable timers.
>
>   This could possibly be done by adding a "repeat" argument to call_later().

I've not used repeatable timers myself but I see them in several other
interfaces. I do think they deserve a different method call to set
them up, even if the implementation will just be to add a repeat field
to the DelayedCall. When I start a timer with a 2 second repeat, does
it run now and then 2, 4, 6, ... seconds after, or should the first
run be in 2 seconds? Or are these separate parameters? Strawman
proposal: it runs in 2 seconds and then every 2 seconds. The API would
be event_loop.call_repeatedly(interval, callback, *args), returning a
DelayedCall with an interval attribute set to the interval value.
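
To make the strawman concrete, here is a rough sketch of how
call_repeatedly() could sit next to call_later() on a timer heap. It is
not Tulip code: ToyTimerLoop, advance() and the DelayedCall fields are
made up for illustration, and time is advanced by hand so the behaviour
is easy to follow. It assumes interval > 0.

```python
import heapq
import itertools


class DelayedCall:
    """Hypothetical handle; only the fields this sketch needs."""

    def __init__(self, when, callback, args, interval=None):
        self.when = when
        self.callback = callback
        self.args = args
        self.interval = interval  # None means a one-shot timer
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class ToyTimerLoop:
    """Timer heap only; the clock is fake and moved by advance()."""

    def __init__(self):
        self.now = 0.0
        self._heap = []
        self._counter = itertools.count()  # tie-breaker for equal deadlines

    def _schedule(self, dcall):
        heapq.heappush(self._heap, (dcall.when, next(self._counter), dcall))

    def call_later(self, delay, callback, *args):
        dcall = DelayedCall(self.now + delay, callback, args)
        self._schedule(dcall)
        return dcall

    def call_repeatedly(self, interval, callback, *args):
        # Strawman semantics: first run after `interval`, then every `interval`.
        dcall = DelayedCall(self.now + interval, callback, args, interval)
        self._schedule(dcall)
        return dcall

    def advance(self, seconds):
        """Move the fake clock forward and run every timer that came due."""
        self.now += seconds
        while self._heap and self._heap[0][0] <= self.now:
            _, _, dcall = heapq.heappop(self._heap)
            if dcall.cancelled:
                continue
            dcall.callback(*dcall.args)
            if dcall.interval is not None and not dcall.cancelled:
                # Re-arm from the previous deadline so the period does not drift.
                dcall.when += dcall.interval
                self._schedule(dcall)
```

Re-arming from the previous deadline rather than from "now" is one
possible choice; a loop with native repeating timers may behave
differently.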

(BTW, can someone *please* come up with a better name for DelayedCall?
It's tedious and doesn't abbreviate well. But I don't want to name the
class 'Callback' since I already use 'callback' for function objects
that are used as callbacks.)

> * It would be nice to have a way to call a callback once per loop iteration.
>   An example here is dispatching in libdbus. The easiest way to do this is
>   to call dbus_connection_dispatch() every iteration of the loop (a more
>   complicated way exists to get notifications when the dispatch status
>   changes, but it is edge triggered and difficult to get right).
>
>   This could possibly be implemented by adding a "repeat" argument to
>   call_soon().

Again, I'd rather introduce a new method. What should the semantics
be? Is this called just before or after we potentially go to sleep, or
at some other point, or at the very top or bottom of run_once()?

> * A useful semantic for run_once() would be to run the callbacks for
>   readers and writers in the same iteration as when the FD got ready.

Good catch, I've struggled with this. I ended up not needing to call
run_once(), so I've left it out of the PEP. I agree if there's a
strong enough use case for it (what's yours?) it should probably be
redesigned. Another thing I don't like about it is that a callback
that calls call_soon() with itself will starve I/O completely. OTOH
that's perhaps no worse than a callback containing an infinite loop;
and there's something to say for the semantics that if a callback just
schedules another callback as an immediate 'continuation', it's
reasonable to run that before even attempting to poll for I/O.

>   This allows for the idiom below when expecting a single event to happen
>   on a file descriptor from outside the event loop:
>
>     # handle_read() sets the "ready" flag
>     loop.add_reader(fd, handle_read)
>     while not ready:
>         loop.run_once()
>
>   I use this idiom for example in a blocking method_call() method that calls
>   into a D-BUS method.
>
>   Currently, the handle_read() callback would be called in the iteration
>   *after* the FD became readable. So this would not work, unless some more
>   IO becomes available.
>
>   As far as I can see libev, libuv and Qt all work like this.

Hm, okay, it seems reasonable to support that. (My original intent
with run_once() was to allow mixing multiple event loops -- you'd just
call each event loop's run_once() equivalent in a round-robin
fashion.)

How about the following semantics for run_once():

1. compute deadline as the smallest of:
    - the time until the first event in the timer heap, if non empty
    - 0 if the ready queue is non empty
    - Infinity(*)

2. poll for I/O with the computed deadline, adding anything that is
ready to the ready queue

3. run items from the ready queue until it is empty

(*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
infinity, with the idea that if somehow a race condition added
something to the ready queue just as we went to sleep, and there's no
I/O at all, the system will recover eventually. But I've also heard
people worried about power conservation on mobile devices (or laptops)
complain about servers that wake up regularly even when there is no
work to do. Thoughts? I think I'll leave this out of the PEP, but what
should Tulip do?
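
The three steps above could be sketched roughly as follows. This is a
toy built on the stdlib selectors module, not the Tulip implementation;
ToyLoop and its internals are made up for illustration. One detail is
my own assumption and not spelled out in the steps: timers that have
come due are moved to the ready queue right after polling. "Infinity"
is passed as timeout=None here.

```python
import collections
import heapq
import selectors
import time


class ToyLoop:
    def __init__(self):
        self._selector = selectors.DefaultSelector()
        self._ready = collections.deque()  # (callback, args) pairs
        self._timers = []                  # heap of (when, seq, callback, args)
        self._seq = 0

    def call_soon(self, callback, *args):
        self._ready.append((callback, args))

    def call_later(self, delay, callback, *args):
        self._seq += 1
        when = time.monotonic() + delay
        heapq.heappush(self._timers, (when, self._seq, callback, args))

    def add_reader(self, fd, callback, *args):
        self._selector.register(fd, selectors.EVENT_READ, (callback, args))

    def remove_reader(self, fd):
        self._selector.unregister(fd)

    def run_once(self):
        # 1. Compute the deadline.
        if self._ready:
            timeout = 0
        elif self._timers:
            timeout = max(0, self._timers[0][0] - time.monotonic())
        else:
            timeout = None  # block until there is I/O

        # 2. Poll for I/O; anything that is ready goes on the ready queue.
        for key, _events in self._selector.select(timeout):
            self._ready.append(key.data)

        # Assumption: timers that are now due also join the ready queue.
        now = time.monotonic()
        while self._timers and self._timers[0][0] <= now:
            _, _, callback, args = heapq.heappop(self._timers)
            self._ready.append((callback, args))

        # 3. Run items from the ready queue until it is empty.
        while self._ready:
            callback, args = self._ready.popleft()
            callback(*args)
```

With this shape, a reader callback runs in the same run_once() call in
which its FD became readable. Step 3 also runs callbacks that were
scheduled with call_soon() during that same pass, which is the
starvation trade-off mentioned earlier.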

> * If remove_reader() / remove_writer() would accept the DelayedCall instance
>   returned by their add_xxx() cousins, then that would allow for multiple
>   callbacks per FD. Not all event loops support this (libuv doesn't, libev
>   and Qt do), but the ones that do could have their functionality
>   exposed like this. For event loops that don't support this, an exception
>   could be raised when adding multiple callbacks per FD.

Hm. The PEP currently states that you can call cancel() on the
DelayedCall returned by e.g. add_reader() and it will act as if you
called remove_reader(). (Though I haven't implemented this yet --
either there would have to be a cancel callback on the DelayedCall or
the effect would be delayed.)
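
For the first option, a cancel callback on the DelayedCall might look
roughly like this. The on_cancel hook, ToyReaders and _forget() are
hypothetical names for this sketch only; neither the PEP nor Tulip
defines them.

```python
class DelayedCall:
    """Hypothetical handle with an optional cancel hook."""

    def __init__(self, callback, args, on_cancel=None):
        self.callback = callback
        self.args = args
        self.cancelled = False
        self._on_cancel = on_cancel

    def cancel(self):
        if self.cancelled:
            return  # cancelling twice is a no-op
        self.cancelled = True
        if self._on_cancel is not None:
            self._on_cancel()


class ToyReaders:
    """Just the reader table of a loop; no polling."""

    def __init__(self):
        self.readers = {}  # fd -> DelayedCall

    def add_reader(self, fd, callback, *args):
        # The hook unregisters the FD right away instead of at the next poll.
        dcall = DelayedCall(callback, args,
                            on_cancel=lambda: self._forget(fd, dcall))
        self.readers[fd] = dcall
        return dcall

    def remove_reader(self, fd):
        dcall = self.readers.pop(fd, None)
        if dcall is not None:
            dcall.cancelled = True

    def _forget(self, fd, dcall):
        # Only drop the entry if it still belongs to this handle.
        if self.readers.get(fd) is dcall:
            del self.readers[fd]
```

The identity check in _forget() keeps a stale handle from removing a
callback that replaced it on the same FD.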

But multiple callbacks per FD seems a different issue -- currently
add_reader() just replaces the previous callback if one is already
set. Since not every event loop can support this, I'm not sure it
ought to be in the PEP, and making it optional sounds like a recipe
for trouble (a library that depends on this may break subtly or only
under pressure). Also, what's the use case? If you really need this
you are free to implement a mechanism on top of the standard in user
code that dispatches to multiple callbacks -- that sounds like a small
amount of work if you really need it, but it sounds like an attractive
nuisance to put this in the spec.
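
A user-level dispatcher of that kind could be as small as the sketch
below. ReaderFanout is a made-up name; it assumes only that the loop
has add_reader(fd, callback, *args) and remove_reader(fd) and keeps a
single callback per FD.

```python
class ReaderFanout:
    """Hypothetical helper: several callbacks per FD on top of a loop
    that only keeps one."""

    def __init__(self, loop):
        self._loop = loop
        self._callbacks = {}  # fd -> list of (callback, args)

    def add(self, fd, callback, *args):
        entry = (callback, args)
        if fd not in self._callbacks:
            self._callbacks[fd] = []
            # Register one dispatcher with the real loop for this FD.
            self._loop.add_reader(fd, self._dispatch, fd)
        self._callbacks[fd].append(entry)
        return entry  # pass this back to remove()

    def remove(self, fd, entry):
        entries = self._callbacks[fd]
        entries.remove(entry)
        if not entries:
            # Last callback gone: stop watching the FD altogether.
            del self._callbacks[fd]
            self._loop.remove_reader(fd)

    def _dispatch(self, fd):
        # Iterate over a copy so callbacks may add or remove entries.
        for callback, args in list(self._callbacks.get(fd, ())):
            callback(*args)
```

Callbacks run in the order they were added; what should happen when
one of them raises is left open here.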

>   Support for multiple callbacks per FD could be advertised as a capability.

I'm not keen on having optional functionality as I explained above.
(In fact, I probably will change the PEP to make those APIs that are
currently marked as optional required -- it will just depend on the
platform which paradigm performs better, but using the
transport/protocol abstraction will automatically select the best
paradigm).

> * After a DelayedCall is cancelled, it would also be very useful to have a
>   second method to enable it again. Having that functionality is more
>   efficient than creating a new event. For example, the D-BUS event loop
>   integration API has specific methods for toggling events on and off that
>   you need to provide.

Really? Doesn't this functionality imply that something (besides user
code) is holding on to the DelayedCall after it is cancelled? It seems
iffy to have to bend over backwards to support this alternate way of
doing something that we can already do, just because (on some
platform?) it might shave a microsecond off callback registration.

> * (Nitpick) Multiplexing absolute and relative timeouts for the "when"
>   argument in call_later() is a little too smart in my view and can lead
>   to bugs.

Agreed; that's why I left it out of the PEP. The v2 implementation
will use time.monotonic().

> With some input, I'd be happy to produce patches.

I hope I've given you enough input; it's probably better to discuss
the specs first before starting to code. But please do review the
tulip v2 code in the tulip subdirectory; if you want to help, I'll
be happy to give you commit privileges to that repo, or I'll take
patches if you send them.

-- 
--Guido van Rossum (python.org/~guido)