[Python-ideas] async: feedback on EventLoop API
Geert Jansen
geertj at gmail.com
Mon Dec 17 23:57:51 CET 2012
On Mon, Dec 17, 2012 at 6:47 PM, Guido van Rossum <guido at python.org> wrote:
> Cool. For me, right now, Python 2 compatibility is a distraction, but
> I am not against others adding it. I'll be happy to consider small
> tweaks to the PEP to make this easier. Exception: I'm not about to
> give up on 'yield from'; but that doesn't seem your focus anyway.
Correct - my focus right now is on the event loop only. I intend to
have a deeper look at the coroutine scheduler as well later (right now
I'm using greenlets for that).
> I've not used repeatable timers myself but I see them in several other
> interfaces. I do think they deserve a different method call to set
> them up, even if the implementation will just be to add a repeat field
> to the DelayedCall. When I start a timer with a 2 second repeat, does
> it run now and then 2, 4, 6, ... seconds after, or should the first
> run be in 2 seconds? Or are these separate parameters? Strawman
> proposal: it runs in 2 seconds and then every 2 seconds. The API would
> be event_loop.call_repeatedly(interval, callback, *args), returning a
> DelayedCall with an interval attribute set to the interval value.
That would work (in 2 secs, then 4, 6, ...). This is the Qt QTimer model.
Both libev and libuv have a slightly more general timer that takes a
timeout and a repeat value. When the timeout reaches zero, the timer
will fire, and if repeat != 0, it will re-seed the timeout to that
value.
I haven't seen any real need for such a timer where interval !=
repeat, and in any case it can pretty cheaply be emulated by adding a
new timer on the first expiration only. So your call_repeatedly() call
above should be fine.
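As a rough illustration, the libev-style re-seeding behaviour can be emulated on top of a one-shot scheduler. `schedule` below stands in for a hypothetical call_later-style API, not tulip's actual one:

```python
def call_repeatedly(schedule, interval, callback, *args):
    """Sketch: emulate a repeating timer on top of a one-shot scheduler.
    `schedule(delay, fn)` is assumed to be a hypothetical call_later-style
    API.  The callback re-arms itself after each run, so it fires at
    interval, 2*interval, 3*interval, ... after registration.
    """
    state = {'cancelled': False}

    def run():
        if state['cancelled']:
            return
        callback(*args)
        schedule(interval, run)   # re-seed the timeout, libev-style

    schedule(interval, run)       # first run after one full interval

    def cancel():
        state['cancelled'] = True
    return cancel
```

A real DelayedCall would carry the cancel flag and the interval as attributes; the closure above just keeps the sketch self-contained.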
> (BTW, can someone *please* come up with a better name for DelayedCall?
> It's tedious and doesn't abbreviate well. But I don't want to name the
> class 'Callback' since I already use 'callback' for function objects
> that are used as callbacks.)
libev uses the generic term "Watcher", libuv uses "Handle". But their
APIs are structured a bit differently from tulip, so I'm not sure
those names would make sense. They support many different types of
events (including more esoteric events like process watches, on-fork
handlers, and wall-clock timer events). Each event has its own class,
named after the event type, that inherits from "Watcher" or
"Handle". When an event is created, you pass it a reference to its
loop. You manage the event fully through the event instance (e.g.
starting it, setting its callback and other parameters, stopping it).
The loop has only a few methods, notably "run" and "run_once".
So for example, you'd say:
    loop = Loop()
    timer = Timer(loop)
    timer.start(2.0, callback)
    loop.run()
The advantages of this approach are that naming is easier, and that
you have a natural place to put methods that update the event after
you have created it. For example, you might want to temporarily
suspend a timer or change its interval.
I quite liked the fresh approach taken by tulip, so that's why I tried
to stay within its design. However, the disadvantage is that modifying
events after you've created them is difficult (unless you create one
DelayedCall subtype per event type, in which case you're probably
better off creating those events through their constructors in the
first place).
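To make the trade-off concrete, here is a sketch of what such a handle-style timer with per-event mutator methods might look like; all names are illustrative, modelled loosely on libev/libuv, and none of this is tulip's actual API:

```python
class Timer:
    """Sketch of a libev/libuv-style timer handle (illustrative names).
    The handle owns its parameters, so they can be changed after
    creation without making a new object.
    """

    def __init__(self, loop):
        self.loop = loop
        self.interval = None
        self.callback = None
        self.running = False

    def start(self, interval, callback):
        self.interval = interval
        self.callback = callback
        self.running = True
        # a real implementation would register with self.loop here

    def stop(self):
        # temporarily suspend; start() can re-enable it later
        self.running = False

    def set_interval(self, interval):
        # takes effect on the next expiration in a real implementation
        self.interval = interval
```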
>> * It would be nice to have a way to call a callback once per loop iteration.
>> An example here is dispatching in libdbus. The easiest way to do this is
>> to call dbus_connection_dispatch() every iteration of the loop (a more
>> complicated way exists to get notifications when the dispatch status
>> changes, but it is edge triggered and difficult to get right).
>>
>> This could possibly be implemented by adding a "repeat" argument to
>> call_soon().
>
> Again, I'd rather introduce a new method. What should the semantics
> be? Is this called just before or after we potentially go to sleep, or
> at some other point, or at the very top or bottom of run_once()?
That is a good question. Libuv and libev both offer both options. The
one that is called before we go to sleep is called a "Prepare"
handler; the one called after we come back from sleep is a "Check"
handler. The libev documentation has some words on check and prepare
handlers here:
http://pod.tst.eu/http://cvs.schmorp.de/libev/ev.pod#code_ev_prepare_code_and_code_ev_che
I am not sure both are needed, but I can't foresee all the consequences.
> How about the following semantics for run_once():
>
> 1. compute deadline as the smallest of:
> - the time until the first event in the timer heap, if non empty
> - 0 if the ready queue is non empty
> - Infinity(*)
>
> 2. poll for I/O with the computed deadline, adding anything that is
> ready to the ready queue
>
> 3. run items from the ready queue until it is empty
I think doing this would work, but again I can't fully foresee all the
consequences. Let me play with this a little.
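For what it's worth, my reading of those three steps in illustrative (non-tulip) Python; `timer_heap`, `ready` and `poll` are stand-ins, not real internals:

```python
import heapq
import time

def run_once(timer_heap, ready, poll):
    """Sketch of the proposed run_once() semantics.  `timer_heap` is a
    heap of (when, callback) pairs, `ready` a list of zero-argument
    callables, and `poll(timeout)` returns callbacks for ready I/O.
    All names are illustrative, not tulip's actual internals.
    """
    # 1. Compute the deadline as the smallest of the candidates.
    if ready:
        timeout = 0                  # work is pending: don't sleep
    elif timer_heap:
        timeout = max(0, timer_heap[0][0] - time.monotonic())
    else:
        timeout = None               # "infinity": block until I/O

    # 2. Poll for I/O, adding anything ready to the ready queue.
    ready.extend(poll(timeout))

    # Move expired timers onto the ready queue as well.
    now = time.monotonic()
    while timer_heap and timer_heap[0][0] <= now:
        when, callback = heapq.heappop(timer_heap)
        ready.append(callback)

    # 3. Run items from the ready queue until it is empty.
    while ready:
        ready.pop(0)()
```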
> (*) Most event loops I've seen use e.g. 30 seconds or 1 hour as
> infinity, with the idea that if somehow a race condition added
> something to the ready queue just as we went to sleep, and there's no
> I/O at all, the system will recover eventually. But I've also heard
> people worried about power conservation on mobile devices (or laptops)
> complain about servers that wake up regularly even when there is no
> work to do. Thoughts? I think I'll leave this out of the PEP, but what
> should Tulip do?
I had a look at libuv and libev. They take two different approaches:
* libev uses a ~60 second timeout by default. The reason is subtle:
libev supports a wall-clock timer event that fires when a certain
wall-clock time has passed. A finite timeout allows it to pick up
changes to the system time (e.g. by NTP), which would change when the
wall-clock timer needs to run.
* libuv does not have a wall-clock timer and uses an infinite timeout.
In my view it would be best for tulip to use an infinite timeout
unless at some point a wall-clock timer is added. That will help
with power management. Regarding race conditions, I think they should
be solved in other ways (e.g. by having a special method that can post
callbacks to the loop in a thread-safe way, possibly writing to a
self-pipe to wake up the poller).
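A minimal sketch of that self-pipe idea (a hypothetical helper, not an existing tulip API): writing a byte to the pipe wakes the loop even if it is blocked in poll() with an infinite timeout, so no periodic wake-up is needed.

```python
import os
import threading

class ThreadSafePoster:
    """Sketch of the self-pipe trick (hypothetical, not tulip's API):
    lets other threads post callbacks to the loop.  The loop registers
    self.read_fd as an ordinary reader and calls process() when it
    becomes readable.
    """

    def __init__(self):
        self.read_fd, self._write_fd = os.pipe()
        self._lock = threading.Lock()
        self._pending = []

    def call_soon_threadsafe(self, callback, *args):
        with self._lock:
            self._pending.append((callback, args))
        os.write(self._write_fd, b'\0')   # wake the poller

    def process(self):
        """Run by the loop when self.read_fd becomes readable."""
        os.read(self.read_fd, 4096)       # drain the wake-up bytes
        with self._lock:
            pending, self._pending = self._pending, []
        for callback, args in pending:
            callback(*args)
```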
> Hm. The PEP currently states that you can call cancel() on the
> DelayedCall returned by e.g. add_reader() and it will act as if you
> called remove_reader(). (Though I haven't implemented this yet --
> either there would have to be a cancel callback on the DelayedCall or
> the effect would be delayed.)
Right now I think that cancelling such a DelayedCall is not safe: it
could busy-loop if the fd is ready.
> But multiple callbacks per FD seems a different issue -- currently
> add_reader() just replaces the previous callback if one is already
> set. Since not every event loop can support this, I'm not sure it
> ought to be in the PEP, and making it optional sounds like a recipe
> for trouble (a library that depends on this may break subtly or only
> under pressure). Also, what's the use case? If you really need this
> you are free to implement a mechanism on top of the standard in user
> code that dispatches to multiple callbacks -- that sounds like a small
> amount of work if you really need it, but it sounds like an attractive
> nuisance to put this in the spec.
A not-so-good use case is libraries like libdbus that don't document
their assumptions in this regard. For example, I have to provide an
"add watch" function that creates a new watch (a watch is just a
generic term for an FD event that can be read, write or read|write).
I have observed that libdbus only ever sets one read and one write
watch per FD.
If we go for one reader/writer per FD, then it's probably fine, but it
would be nice if code that installs multiple readers/writers per FD
got an exception rather than silently replacing the callback.
The requirement could be that you need to remove the event before you
can add a new event for the same FD.
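For illustration, the stricter behaviour could look like this; the class and its methods are hypothetical names, not tulip's actual add_reader():

```python
class StrictReaderRegistry:
    """Sketch of the stricter semantics suggested above: one reader per
    FD, and an explicit error instead of silently replacing the
    callback.  Hypothetical helper, not tulip's actual API.
    """

    def __init__(self):
        self._readers = {}

    def add_reader(self, fd, callback, *args):
        if fd in self._readers:
            raise ValueError(
                'a reader is already registered for fd %d; '
                'remove it before adding a new one' % fd)
        self._readers[fd] = (callback, args)

    def remove_reader(self, fd):
        # returns True if a reader was registered, False otherwise
        return self._readers.pop(fd, None) is not None
```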
>> * After a DelayedCall is cancelled, it would also be very useful to have a
>> second method to enable it again. Having that functionality is more
>> efficient than creating a new event. For example, the D-BUS event loop
>> integration API has specific methods for toggling events on and off that
>> you need to provide.
>
> Really? Doesn't this functionality imply that something (besides user
> code) is holding on to the DelayedCall after it is cancelled?
Not that I can see. At least not for libuv and libev.
> It seems
> iffy to have to bend over backwards to support this alternate way of
> doing something that we can already do, just because (on some
> platform?) it might shave a microsecond off callback registration.
According to the libdbus documentation there is a separate function to
toggle an event on/off because that could be implemented without
allocating memory.
But actually there's one kind-of idiomatic use for this that I've seen
quite a few times in libraries. Assume you have a library that defines
a connection. Often, you create two events for that connection in the
constructor: a "write_event" and a "read_event". The read_event is
normally enabled, but gets temporarily disabled when you need to
throttle input. The write_event is normally disabled except when you
get a short write on output.
Just enabling/disabling these events is a bit friendlier to the
programmer, IMHO, than having to cancel and recreate them when needed.
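A sketch of that connection idiom, with `Event` standing in for a hypothetical re-enableable DelayedCall (only enable()/disable() are modelled; none of these names come from tulip):

```python
class Event:
    """Minimal stand-in for a hypothetical re-enableable event."""

    def __init__(self, enabled):
        self.enabled = enabled

    def enable(self):
        self.enabled = True

    def disable(self):
        self.enabled = False


class Connection:
    """Sketch of the idiom described above: both events are created
    once in the constructor and toggled afterwards, instead of being
    cancelled and recreated.
    """

    def __init__(self):
        self.read_event = Event(enabled=True)     # disabled to throttle
        self.write_event = Event(enabled=False)   # enabled on short write

    def throttle(self):
        self.read_event.disable()

    def unthrottle(self):
        self.read_event.enable()

    def on_short_write(self):
        self.write_event.enable()

    def on_output_drained(self):
        self.write_event.disable()
```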
>> * (Nitpick) Multiplexing absolute and relative timeouts for the "when"
>> argument in call_later() is a little too smart in my view and can lead
>> to bugs.
>
> Agreed; that's why I left it out of the PEP. The v2 implementation
> will use time.monotonic().
>
>> With some input, I'd be happy to produce patches.
>
> I hope I've given you enough input; it's probably better to discuss
> the specs first before starting to code. But please do review the
> tulip v2 code in the tulip subdirectory; if you want to help you I'll
> be happy to give you commit privileges to that repo, or I'll take
> patches if you send them.
OK great. Let me work on this over the next couple of days and
hopefully come up with something.
Regards,
Geert