PEP 3156 EventLoop: hide details of iterations and idleness?

While working on a proof-of-concept tornado/tulip integration (https://gist.github.com/4582282), I found a few methods that could not easily be implemented on top of the tornado IOLoop because they rely on details that Tornado does not expose. While it wouldn't be hard to add support for these methods to Tornado, I would argue that they are unnecessary and expose implementation details, and so they are good candidates for removal from this already very broad interface.

First, run_once and call_every_iteration both expose the event loop's underlying iterations to the application. The trouble is that the duration of one iteration varies so widely that it's not a very useful concept (and when implementing the EventLoop interface on top of some existing event loop these methods may not be available). When is it better to use run_once instead of just using call_later to schedule a stop after a short timeout, or call_every_iteration instead of call_repeatedly?

Second, while run_until_idle is convenient (especially for tests), it's fragile and exposes you to implementation details in the libraries you use. If anyone uses call_repeatedly, run_until_idle won't work unless that callback is cancelled. As an example, I once had to introduce Tornado's equivalent of call_repeatedly in a library to work around a bug in libcurl. If I had been using run_until_idle in my tests, they'd have all broken. I think we should either remove run_until_idle or add a "daemon" flag to call_repeatedly (and call_later, and possibly others).

-Ben
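
A minimal sketch of the "call_later to schedule a stop" alternative to run_once mentioned above. It assumes today's standard-library asyncio (the eventual descendant of tulip) as a stand-in, so the method names are the modern ones rather than the draft PEP's; run_for and the timeout values are hypothetical, chosen only for illustration.

```python
import asyncio

def run_for(loop, seconds):
    """Run the loop for roughly `seconds`, then stop it with a timer.

    Hypothetical helper: it replaces "run one iteration" with
    "run for a bounded amount of time", which any loop can offer.
    """
    handle = loop.call_later(seconds, loop.stop)
    try:
        loop.run_forever()
    finally:
        handle.cancel()  # harmless if the timer already fired

loop = asyncio.new_event_loop()
fired = []

# Some work that becomes ready shortly after the loop starts.
loop.call_later(0.01, fired.append, "timer")

run_for(loop, 0.05)
loop.close()
```

The caller states a time budget instead of depending on how long one iteration happens to take.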

On Mon, Jan 21, 2013 at 11:13 PM, Ben Darnell <ben@bendarnell.com> wrote:
- run_once() vs call_later(0) is probably the same thing and just a matter of API design. If Tornado has call_later() it might be able to emulate call_once() as call_later(0), depending on how call_once works. In Guido's latest code, for example, call_once() callbacks added inside a callback will run in the *next* iteration. This makes call_soon() and call_later(0) the same.

- call_every_iteration() vs call_repeatedly(): you really need both. I did a small proof of concept to integrate libdbus with the tulip event loop. I use call_every_iteration() to dispatch events every time after IO has happened. The idea is that events will always originate from IO, and therefore having a callback on every iteration is a convenient way to check for events that need to be dispatched. Using call_repeatedly() here is not right, because there may be times when there are hundreds of events per second, and times when there are none. There is no sensible fixed polling frequency.

If Tornado doesn't have infrastructure for call_every_iteration() you could emulate it with a function that reschedules itself using call_soon() just before calling the callback. (See my first point about when call_soon() callbacks are scheduled.)

If you want to see what event loop adapters for libev and libuv look like, you can check out my project here: https://github.com/geertj/looping

Regards, Geert

On Tue, Jan 22, 2013 at 3:04 AM, Geert Jansen <geertj@gmail.com> wrote:
I don't understand what you mean by "events will always originate from IO" (I don't know anything about libdbus). If the events are coming from IO that causes an event loop iteration, it must be from some tulip callback. Why can't that callback be responsible for scheduling any further dispatching that may be needed?
No, because call_soon (and call_later(0)) cause the event loop to use a timeout of zero on its next poll call, so a function that reschedules itself with call_soon will be a busy loop. There is no good way to emulate call_every_iteration from the other methods; you'll either busy loop with call_soon or use a fixed timeout. If you need it, it's an easy thing to offer, but since neither tornado nor twisted has such a method I'm questioning the need.

run_once() will run for an unpredictable amount of time (until the next IO or timeout); run_forever() with call_soon(stop) will handle events that are ready at that moment and then stop.

-Ben
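
The busy-loop effect described above can be observed with a short sketch. It assumes modern asyncio as a stand-in for tulip; every_iteration is a hypothetical emulation of call_every_iteration(), and the 50 ms window is arbitrary.

```python
import asyncio

loop = asyncio.new_event_loop()
iterations = 0

def every_iteration():
    # Hypothetical emulation of call_every_iteration():
    # re-arm with call_soon(), then do the per-iteration work.
    global iterations
    loop.call_soon(every_iteration)
    iterations += 1

loop.call_soon(every_iteration)

# Nothing else is scheduled for 50 ms, so an idle loop would sleep in
# its poll call for the whole period. Because a callback is always
# ready, the loop polls with a zero timeout and spins instead.
loop.call_later(0.05, loop.stop)
loop.run_forever()
loop.close()

print("iterations while 'idle':", iterations)
```

A loop that was actually idle would wake once, for the stop timer; the self-rescheduling callback instead runs on every pass for the whole window.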

On Tue, Jan 22, 2013 at 4:31 PM, Ben Darnell <ben@bendarnell.com> wrote:
I don't understand what you mean by "events will always originate from IO" (I don't know anything about libdbus).
What I meant is that if there is something to dispatch, then this is due to an inbound IO (or a timeout for that matter). Due to either event, the loop will advance by one tick, and hit my call_every_iteration() handler where I dispatch.
Well, your original question was why not call_repeatedly() instead of call_every_iteration(). I tried to answer that for my use case. Indeed, call_soon() could be used to schedule a dispatch every time IO is received. However, I preferred to have a fixed callback that I do not need to allocate and register every time, for efficiency.
Yes, you're right. I was confusing things with libuv and libev. I may have actually implemented call_soon() the wrong way there :) Maybe I am abusing call_every_iteration() when I use it for dispatching. If you look at the libuv and libev documentation, then they say that their call_every_iteration() equivalents (Prepare and Check) are for integrating with external event loops. So maybe that is the use case. However, I've not looked into this in any detail. If Tornado and Twisted cannot implement call_every_iteration(), then I think that is a good reason to remove it. Regards, Geert

On Tue, Jan 22, 2013 at 8:16 AM, Geert Jansen <geertj@gmail.com> wrote:
Ok, I'll kill call_every_iteration(). I'll wait for more discussion on run_once() and run()'s until-idle behavior. -- --Guido van Rossum (python.org/~guido)

On Tue, Jan 22, 2013 at 2:19 PM, Guido van Rossum <guido@python.org> wrote:
Ok, I'll kill call_every_iteration(). I'll wait for more discussion on run_once() and run()'s until-idle behavior.
One of the things that's been difficult for some time in Twisted is writing clients in such a way that they reliably finish. It's easy for a simple client, but when the client involves several levels of libraries doing mysterious, asynchronous things, it can be hard to know when everything's really done. Add error conditions in, and you end up spending a lot of time thinking about something that, in a synchronous program, is pretty simple.

One option, recently introduced to Twisted, is "react" - http://twistedmatrix.com/documents/12.3.0/api/twisted.internet.task.html#rea... The idea is to encapsulate the lifetime of a client in a single asynchronous operation; the synchronous parallel is libc calling `exit` for you when `main` returns. If all of your library code cooperates and reliably indicates when it's done with any background operations, then this is a good choice.

In cases where your libraries are less than perfect (perhaps they sync to the cloud "in the background"), the run-until-idle behavior is useful. The client calls a function that triggers a cascade of events. When that cascade has exhausted itself, the process exits. Synchronous, threaded programs do this with non-daemon threads. I think that this option should be supported, if only for the parallelism with synchronous code.

As for run-until-idle - I've used this sort of behavior occasionally in tests, where I want to carefully control the sequence of operations. For example, I may want to reliably test handling of race conditions:

    op = start_operation()
    while not in_critical_section():
        run_once()
    generate_conflict()
    while in_critical_section():
        run_once()
    assert something()

Such a case would rely heavily on the details of the event loop. Depending on how closely I want to tie my tests to that implementation, that may or may not be OK.
If a particular event loop implementation doesn't even *have* this model (as, it appears, Tornado does not), then I think it would be fine to simply not implement this operation. So perhaps run_once() should be described as optional in the PEP? Dustin
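
A stepping test of this kind can be sketched without run_once(), using the "run_forever() with call_soon(stop)" idiom mentioned earlier in the thread. The sketch assumes modern asyncio as a stand-in for tulip; step, enter, leave and the state dictionary are hypothetical names standing in for a real operation with a critical section.

```python
import asyncio

loop = asyncio.new_event_loop()
state = {"in_critical": False, "conflict_inside": False, "done": False}

def step(loop):
    """Run the callbacks that are ready right now, then return."""
    loop.call_soon(loop.stop)
    loop.run_forever()

def enter():
    # Hypothetical operation: enter the critical section, leave it later.
    state["in_critical"] = True
    loop.call_soon(leave)

def leave():
    state["in_critical"] = False
    state["done"] = True

loop.call_soon(enter)                  # start_operation()
while not state["in_critical"]:
    step(loop)
# generate_conflict(): record that we are acting inside the section.
state["conflict_inside"] = state["in_critical"]
while state["in_critical"]:
    step(loop)
loop.close()
```

The test still depends on the order in which this particular loop runs ready callbacks, so the caveat about being tied to the implementation applies here as well.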

On 24/01/13 14:57, Dustin J. Mitchell wrote:
One of the things that's been difficult for some time in Twisted is writing clients in such a way that they reliably finish.
I think I'm going to wait and see what the coroutine-level features of tulip turn out to be like before saying much more. It seems to me that many of the problems we're arguing about here simply don't exist in coroutine-land.

For example, if you can write something like

    yield from create_http(yield from create_tcp(host, port))

and creation of the transport fails and raises an exception, then create_http never gets called, so you won't waste any effort creating an unused protocol object.

Likewise, if the main loop of your protocol consists of a Task that reads asynchronously from the transport, then (as long as you haven't done anything blatantly stupid) you know it will eventually return when the connection gets closed.

If I were designing all this, I think I would have made coroutines the default way of dealing with everything above the event loop layer, and provide callback wrappers for those that like to do things that way. Building an entire callback-based protocol stack seems like going about it the hard way.

-- Greg
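
The nested-coroutine point can be checked with a small sketch. It assumes modern asyncio and the async/await spelling that later replaced "yield from"; create_tcp and create_http here are hypothetical stand-ins that only record whether they were called, not real tulip APIs.

```python
import asyncio

calls = []

async def create_tcp(host, port):
    # Hypothetical transport factory that always fails.
    calls.append("create_tcp")
    raise ConnectionRefusedError(f"cannot reach {host}:{port}")

async def create_http(transport):
    # Hypothetical protocol factory; should never run in this sketch.
    calls.append("create_http")
    return ("http", transport)

async def main():
    try:
        # The inner await is evaluated first; when it raises,
        # create_http is never called at all.
        await create_http(await create_tcp("example.invalid", 80))
    except ConnectionRefusedError:
        calls.append("failed")

asyncio.run(main())
print(calls)
```

The exception travels up through ordinary control flow, so no unused protocol object is created and there is nothing to clean up.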

On Wed, Jan 23, 2013 at 9:14 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I think I'm going to wait and see what the coroutine-level features of tulip turn out to be like before saying much more.
I think this is pretty smart, actually. Deferreds, futures, promises, etc. give the programmer a lot of rope. They don't require classical models of control flow, in particular. That's cool, but tends to lead to code with subtle bugs. Coroutines re-introduce just enough structure to put programmers back in comfortable territory for verifying correctness. This ends up looking a bit like threads, but with less concern for synchronization primitives, and virtually-free cloning. Dustin

On Wed, Jan 23, 2013 at 5:57 PM, Dustin J. Mitchell <dustin@v.igoro.us> wrote:
Despite some earlier moves in that direction I am not actually a fan of having optional parts in a spec. That way it's too easy for an app to claim compliance without actually running anywhere except on its "home" framework.

I think that run_until_idle() can be safely replaced by run_until_complete(some_future). For run_once(), I expect that I will be able to concoct alternatives just fine as well.

And, to Greg (who somehow replied in a separate thread), I am certainly not planning to write the entire stack with only callbacks! Much of the code will have Futures on the outside and coroutines on the inside.

-- --Guido van Rossum (python.org/~guido)
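
A minimal sketch of run_until_complete(some_future) standing in for run_until_idle(), assuming modern asyncio as the descendant of tulip. background_tick is a hypothetical library callback that keeps rescheduling itself, the situation in which an until-idle run would never return; the delays are arbitrary.

```python
import asyncio

loop = asyncio.new_event_loop()
done = loop.create_future()
log = []

def background_tick():
    # Hypothetical periodic library callback: always re-arms itself,
    # so the loop is never idle while it is registered.
    log.append("tick")
    loop.call_later(0.01, background_tick)

def finish():
    # The application decides explicitly when its work is complete.
    done.set_result("finished")

loop.call_soon(background_tick)
loop.call_later(0.03, finish)

# Returns as soon as `done` has a result, regardless of pending timers.
result = loop.run_until_complete(done)
loop.close()
```

Completion is tied to a future the caller owns, so periodic callbacks inside libraries no longer decide when the run ends.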

participants (5): Ben Darnell, Dustin J. Mitchell, Geert Jansen, Greg Ewing, Guido van Rossum