Late to the async party (PEP 3156)

Hi python-ideas,

I've been somewhat living under a rock for the past few months, and consequently I missed the ideal window of opportunity to weigh in on the async discussions this fall that culminated in PEP 3156. I've been reading through those discussions in the archives. I've not finished digesting it all, and I'm somewhat torn: I feel I should shut up until I've read everything to date so as not to decrease the SNR, but on the other hand, knowing myself, I strongly suspect this would result in my never speaking up. And so, at the risk of lowering the SNR ...

First let me say that PEP 3156 makes me very, very happy. Over the past few years I've been exploring these very ideas with a little-used library called Kaa. I'm not offering it up as a paragon of proper async library design, but I wanted to share some of my experiences in case they could be useful to the PEP.

https://github.com/freevo/kaa-base/
http://api.freevo.org/kaa-base/

It does seem like many similar design choices were made. In particular, I'm happy that an explicit yield will be used rather than the greenlet style of implicit suspension/reentry. Even after I've been using them for years, coroutines often feel like a form of magic, and an explicit yield is more aligned with the principle of least surprise.

With Kaa, our future-style object is called an InProgress (so forgive the differing terminology in the remainder of this post):

http://api.freevo.org/kaa-base/async/inprogress.html

A couple of properties of InProgress objects that I've found have practical value:

* they can be aborted, which raises a special InProgressAborted inside the coroutine function so it can perform cleanup actions
  o what makes this tricky is the question of what to do with any currently yielded tasks. If A yields B and A is aborted, should B be aborted? What if the same B task is being yielded by C? Should C also be aborted, even if it's considered a sibling of A? (For example, suppose B is a task that is refreshing some common cache that both A and C want to make sure is up-to-date before they move on.)
  o if the decision is that B should be aborted, then within A, 'yield B' will raise an exception because A is aborted, but 'yield B' within C will raise because B was aborted. So there needs to be some mechanism to distinguish between these cases. (My approach was to have an origin attribute on the exception.)
  o if A yields B, it may want to prevent B from being aborted if A is aborted. (My approach was to have a noabort() method on InProgress objects to return a new, unabortable InProgress object that A can then yield -- see the sketch below.)
  o alternatively, the saner implementation may be to do nothing to B when A is aborted and require A to catch InProgressAborted and explicitly abort B if that's the desired behaviour
  o discussion in the PEP on cancellation has some TBDs, so perhaps the above will be food for thought
* they have a timeout() method, which returns a new InProgress object representing the task that will abort when the timeout elapses if the task doesn't finish
  o it's noteworthy that timeout() returns a /new/ InProgress and the original task continues on even if the timeout occurs -- by default, that is, unless you do timeout(abort=True)
  o I didn't see much discussion in the PEP on timeouts, but I think this is an important feature that should be standardized

Coroutines in Kaa use "yield" rather than "yield from", but the general approach looks very similar to what's been proposed:

http://api.freevo.org/kaa-base/async/coroutines.html

The @coroutine decorator causes the decorated function to return an InProgress. Coroutines can of course yield other coroutines but, more fundamentally, anything else that returns an InProgress object, which could be a @threaded function, or even an ordinary function that explicitly creates and returns an InProgress object.

There are some features of Kaa's implementation that could be worth considering:

* it is possible to yield a special object (called NotFinished) that allows a coroutine to "time slice" as a form of cooperative multitasking
* coroutines can have certain policies that control invocation behaviour. The most obvious ones to describe are POLICY_SYNCHRONIZED, which ensures that multiple invocations of the same coroutine are serialized, and POLICY_SINGLETON, which effectively ignores subsequent invocations if it's already running
* it is possible to have a special progress object passed into the coroutine function so that the coroutine's progress can be communicated to an outside observer

Once you've standardized on a way to manage the lifecycle of an in-progress asynchronous task, threads are a natural extension:

http://api.freevo.org/kaa-base/async/threads.html

The important element here is that @threaded decorated functions can be yielded by coroutines. This means that truly blocking tasks can be wrapped in a thread, but invocation from a coroutine is identical to any other coroutine. Consequently, a threaded task could later be implemented as a coroutine (or more generally via event loop hooks) without any API changes.

I think I'll stop here. There's plenty more definition, discussion, and examples in the links above. Hopefully some ideas can be salvaged for PEP 3156, but even if that's not the case, I'll be happy to know they were considered and rejected rather than not considered at all.

Cheers, Jason.
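[Editorial sketch, for the noabort() bullet above: a minimal illustration of the caller-side API as described in this message and the linked Kaa docs. refresh_cache() and do_cleanup() are hypothetical names invented for the example.]

    import kaa

    @kaa.coroutine()
    def a():
        try:
            # Shield the shared cache refresh: even if a() is aborted,
            # the yielded task keeps running for other interested waiters.
            yield refresh_cache().noabort()
        except kaa.InProgressAborted as exc:
            # exc.origin identifies which InProgress the abort came from,
            # distinguishing "a() was aborted" from "the yielded task died".
            do_cleanup()  # hypothetical
            raise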

Hi Jason,

I don't think you've missed anything. I had actually planned to keep PEP 3156 unpublished for a bit longer, since I'm not done writing the reference implementation -- I'm sure that many of the issues currently marked open or TBD will be resolved that way. There hasn't been any public discussion since the last threads on python-ideas some weeks ago -- however I've met in person with some Twisted folks and exchanged private emails with some other interested parties.

You've also correctly noticed that the PEP is weakest in the area of cancellation (and timeouts aren't even mentioned in the current draft). I'm glad you have some experience in this area, and I'll try to study your solutions and suggestions in more detail soon.

For integration with threads, I'm thinking that the PEP currently has the minimum needed with wrap_future() and run_in_executor() -- but I'll read your link on threads and see what I may be missing.

(More later, but I don't want you to think you posted into a black hole!)

--Guido
-- --Guido van Rossum (python.org/~guido)

On Sat, 15 Dec 2012 21:37:15 -0800 Guido van Rossum <guido@python.org> wrote:
For the record, have you looked at the pyuv API? It's rather nicely orthogonal, although it lacks a way to stop the event loop. https://pyuv.readthedocs.org/en Regards Antoine.

Antoine Pitrou <solipsis@...> writes:
That link gives a 404, but you can use https://pyuv.readthedocs.org/en/latest/ Regards, Vinay Sajip

I have to ask someone who has experience with libuv to comment on my PEP -- those docs are very low level and don't explain how things work together or why features are needed. I also have to explain my goals and motivations. But not now. --Guido
-- --Guido van Rossum (on iPad)

On Dec 16, 2012, at 11:16 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
For now the only way to stop the event loop is either to stop all events, or to trigger its execution in a loop:

    while True:
        if loop.run_once():
            …
            continue

If you have any questions about it I can help. I plan to use it in my own lib and already use it in gaffer [1]. One of the advantages of libuv is its multi-platform support: on Windows it uses IOCP; on Unix, plain socket APIs, etc.

- benoît

[1] http://github.com/benoitc/gaffer

On Sat, Dec 15, 2012 at 5:36 PM, Jason Tackaberry <tack@urandom.ca> wrote:
- You can't cancel a coroutine; however you can cancel a Task, which is a Future wrapping a stack of coroutines linked via yield-from.

- Cancellation only takes effect when a task is suspended.

- When you cancel a Task, the most deeply nested coroutine (the one that caused it to be suspended) receives a special exception (I propose to reuse concurrent.futures.CancelledError from PEP 3148). If it doesn't catch this, it bubbles all the way up to the Task, and then out from there.

- However, when a coroutine in one Task uses yield-from to wait for another Task, the latter does not automatically get cancelled. So this is a difference between "yield from foo()" and "yield from Task(foo())", which otherwise behave pretty similarly. Of course the first Task could catch the exception and cancel the second task -- that is its responsibility though, and not the default behavior.

- PEP 3156 has a par() helper which lets you block for multiple tasks/coroutines in parallel. It takes arguments which are either coroutines, Tasks, or other Futures; it wraps the coroutines in Tasks to run them independently and just waits for the other arguments. Proposal: when the Task containing the par() call is cancelled, the par() call intercepts the cancellation and by default cancels those coroutines that were passed in "bare" but not the arguments that were passed in as Tasks or Futures. Some keyword argument to par() may be used to change this behavior to "cancel none" or "cancel all" (exact API spec TBD).

In my current implementation (which will be the basis for PEP 3156) the Task() constructor has an optional timeout argument. It works by scheduling a callback at the given time in the future, and the callback simply cancels the task (which is a no-op if the task has already completed). It works okay, except it generates tracebacks that are sometimes logged and sometimes not properly caught -- though some of that may be my messy test code. The exception raised by a timeout is the same CancelledError, which is somewhat confusing. I wonder if Task.cancel() shouldn't take an exception with which to cancel the task. (TimeoutError in PEP 3148 has a different role: it is raised when the timeout on a specific wait expires, so e.g. fut.result(timeout=2) waits up to 2 seconds for fut to complete, and if not, the call raises TimeoutError, but the code running in the executor is unaffected.)
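[Editorial sketch of the bare-coroutine/Task asymmetry described above, written against the draft names from this thread (tulip.coroutine, Task, concurrent.futures.CancelledError). The reference implementation wasn't published at this point, so treat the exact spellings as provisional; child() and cleanup() are invented for the example.]

    import tulip  # the draft reference implementation (unpublished)
    from concurrent.futures import CancelledError

    @tulip.coroutine
    def parent():
        try:
            # Bare coroutine: if parent()'s Task is cancelled, the
            # CancelledError is thrown in here and child() dies with us.
            result = yield from child()
        except CancelledError:
            cleanup()  # hypothetical
            raise      # let the cancellation bubble out to the Task

    @tulip.coroutine
    def parent2():
        # Independent Task: by default it survives our cancellation.
        task = tulip.Task(child())
        try:
            result = yield from task
        except CancelledError:
            task.cancel()  # our responsibility, if that's what we want
            raise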
We've had long discussions about yield vs. yield-from. The latter is way more efficient and that's enough for me to push it through. When using yield, each yield causes you to bounce to the scheduler, which has to do a lot of work to decide what to do next, even if that is just resuming the suspended generator; and the scheduler is responsible for keeping track of the stack of generators. When using yield-from, calling another coroutine as a subroutine is almost free and doesn't involve the scheduler at all; thus it's much cheaper, and the scheduler can be simpler (doesn't need to keep track of the stack). Also stack traces and debugging are better.
These seem pretty esoteric and can probably be implemented in user code if needed.
As I said, I think wait_for_future() and run_in_executor() in the PEP give you all you need. The @threaded decorator you propose is just sugar; if a user wants to take an existing API and convert it from a coroutine to a threaded implementation without requiring changes to the caller, they can just introduce a helper that is run in a thread with run_in_executor().
Thanks for your very useful contribution! Kaa looks like an interesting system. Is it ported to Python 3 yet? Maybe you could look into integrating with the PEP 3156 event loop and/or scheduler. -- --Guido van Rossum (python.org/~guido)

On 12-12-16 11:27 AM, Guido van Rossum wrote:
I'll just underline your statement that "you can't cancel a coroutine" here, since I'm referencing it later.

This distinction between "bare" coroutines, Futures, and Tasks is a bit foreign to me, since in Kaa all coroutines return (a subclass of) InProgress objects. The Tasks section in the PEP says that a bare coroutine (is this the same as the previously defined "coroutine object"?) has much less overhead than a Task, but it's not clear to me why that would be, as both would ultimately need to be managed by the scheduler, wouldn't they? I could imagine that a coroutine object is implemented as a C object for performance, and a Task is a Python class, and maybe that explains the difference. But then why differentiate between Future and Task (particularly because they have the same interface, so I can't draw an analogy with jQuery's Deferreds and Promises, where Promises are a restricted form of Deferreds for public consumption to attach callbacks)?
* Cancellation only takes effect when a task is suspended.
Yes, this is intuitive.
So if the most deeply nested coroutine catches the CancelledError and doesn't reraise, it can prevent its cancellation?

I took a similar approach, except that coroutines can't abort their own cancellation, and whether or not the nested coroutines actually get cancelled depends on whether something else was interested in their result. Consider a coroutine chain where A yields B yields C yields D, and we do B.abort():

* if only C was interested in D's result, then D will get an InProgressAborted raised inside it (at whatever point it's currently suspended). If something other than C was also waiting on D, D will not be affected
* similarly, if only B was interested in C's result, then C will get an InProgressAborted raised inside it (at yield D)
* B will get an InProgressAborted raised inside it (at yield C)
* for B, C, and D, the coroutines will not be reentered, and they are not allowed to yield a value that suggests they expect reentry. There's nothing a coroutine can do to prevent its own demise.
* A will get an InProgressAborted raised inside it (at yield B)
* in all the above cases, the InProgressAborted instance has an origin attribute that is B's InProgress object
* although B, C, and D are now aborted, A isn't aborted. It's allowed to yield again.
* with Kaa, coroutines are abortable by default (so they are always like Tasks). But in this example, B can prevent C from being aborted by yielding C().noabort()

There are quite a few scenarios to consider: A yields B and B is cancelled or raises; A yields B and A is cancelled or raises; A yields B, C yields B, and A is cancelled or raises; A yields B, C yields B, and A or C is cancelled or raises; A yields par(B, C, D) and B is cancelled or raises; etc., etc. In my experience, there's no one-size-fits-all behaviour, and the best we can do is have sensible default behaviour with some API (different functions, kwargs, etc.) to control the cancellation propagation logic.
Ok, so nested bare coroutines will get cancelled implicitly, but nested Tasks won't? I'm having a bit of difficulty with this one. You said that coroutines can't be cancelled, but Tasks can be. But here, if they are being yielded, the opposite behaviour applies: yielded coroutines /are/ cancelled if a Task is cancelled, but yielded tasks /aren't/. Or have I misunderstood?
Here again, par() would cancel a bare coroutine but not Tasks. It's consistent with your previous bullet but seems to contradict your first bullet that you can't cancel a coroutine. I guess the distinction is you can't explicitly cancel a coroutine, but coroutines can be implicitly cancelled?

As I discussed previously, one of those tasks might be yielded by some other active coroutine, and so cancelling it may not be the right thing to do. Being able to control this behaviour is important, whether that's a par() kwarg, or a special method like noabort() that constructs an unabortable Task instance.

Kaa has similar constructs to allow yielding a collection of InProgress objects (whatever they might represent: coroutines, threaded functions, etc.). In particular, it allows you to yield multiple tasks and resume when ALL of them complete (InProgressAll), or when ANY of them complete (InProgressAny). For example:

    @kaa.coroutine()
    def is_any_host_up(*hosts):
        try:
            # ping() is a coroutine
            yield kaa.InProgressAny(ping(host) for host in hosts).timeout(5, abort=True)
        except kaa.TimeoutException:
            yield False
        else:
            yield True

More details here:

http://api.freevo.org/kaa-base/async/inprogress.html#inprogress-collections

From what I understand of the proposed par(), it would require ALL of the supplied futures to complete, but there are many use-cases for the ANY variant as well.
FWIW, the equivalent in Kaa, InProgress.abort(), does take an optional exception, which must subclass InProgressAborted. If None, a new InProgressAborted is created. InProgress.timeout(t) will start a timer that invokes InProgress.abort(TimeoutException()) (TimeoutException subclasses InProgressAborted).

It sounds like your proposed implementation works like:

    @tulip.coroutine()
    def foo():
        try:
            result = yield from Task(othercoroutine()).result(timeout=2)
        except TimeoutError:
            # ... othercoroutine() still lives on

I think Kaa's syntax is cleaner, but it seems functionally the same:

    @kaa.coroutine()
    def foo():
        try:
            result = yield othercoroutine().timeout(2)
        except kaa.TimeoutException:
            # ... othercoroutine() still lives on

It's also possible to conveniently ensure that othercoroutine() is aborted if the timeout elapses:

    try:
        result = yield othercoroutine().timeout(2, abort=True)
    except kaa.TimeoutException:
        # ... othercoroutine() is aborted
But this sounds like a consequence of a particular implementation, doesn't it? A @kaa.coroutine() decorated function is entered right away when invoked, and the decorator logic does as much as it can until the underlying generator yields an unfinished InProgress that it needs to wait for (or kaa.NotFinished). Once it yields, /then/ the decorator sets up the necessary hooks with the scheduler / event loop. This means you can nest a stack of coroutines without involving the scheduler until something truly asynchronous needs to take place. Have I misunderstood?
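[Editorial sketch of that eager-start pattern, self-contained and runnable; the InProgress stand-in and the scheduler hook are simplified inventions for illustration, not Kaa's actual internals.]

    class InProgress:
        # Bare-bones stand-in for a future-style object.
        def __init__(self):
            self.finished = False

    def _resume_later(gen, pending, result):
        # Stub: a real event loop would resume `gen` (and eventually
        # finish `result`) once `pending` completes.
        pass

    def coroutine(func):
        def wrapper(*args, **kwargs):
            gen = func(*args, **kwargs)
            result = InProgress()
            # Drive the generator synchronously, without the scheduler...
            while True:
                try:
                    value = next(gen)
                except StopIteration:
                    # Ran to completion eagerly; the loop never got involved.
                    result.finished = True
                    return result
                if isinstance(value, InProgress) and not value.finished:
                    # ...until the first genuinely unfinished dependency:
                    # only now hand things over to the event loop.
                    _resume_later(gen, value, result)
                    return result
        return wrapper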
I'm fine with that, provided the flexibility is there to allow for it.
Also works for me. :)
Kaa does work with Python 3, yes, although it still lacks very much needed unit tests, so I'm not completely confident it has the same functional coverage as Python 2. I'm definitely interested in having it conform to whatever shakes out of PEP 3156, which is why I'm speaking up now. :)

I've a couple other subjects I should bring up:

Tasks/Futures as "signals": it's often necessary to be able to resume a coroutine based on some condition other than e.g. any IO tasks it's waiting on. For example, in one application, I have a (POLICY_SINGLETON) coroutine that works off a download queue. If there's nothing in the queue, it's suspended at a yield. It's the coroutine equivalent of a dedicated thread. [1] It must be possible to "wake" the queue manager when I enqueue a job for it.

Kaa has this notion of "signals", which is similar to the gtk+ style of signals in that you can attach callbacks to them and emit them. Signals can be represented as InProgress objects, which means they can be yielded from coroutines and used in InProgressAny/All objects. So my download manager coroutine can yield an InProgressAny of all the active download coroutines /and/ the "new job enqueued" signal, and execution will resume as soon as any of those conditions are met. Is there anything in your current proposal that would allow for this use-case?

[1] https://github.com/jtackaberry/stagehand/blob/master/src/manager.py#L390

Another pain point for me has been this notion of unhandled asynchronous exceptions. Asynchronous tasks are represented as an InProgress object, and if a task fails, accessing InProgress.result will raise the exception, at which point it's considered handled. This attribute access could happen at any time during the lifetime of the InProgress object, outside the task's call stack. The desirable behaviour is that when the InProgress object is destroyed, if there's an exception attached to it from a failed task that hasn't been accessed, we should output the stack as an unhandled exception. In Kaa, I do this with a weakref destroy callback, but this isn't ideal because, with GC, the InProgress might not be destroyed until well after the exception is relevant. I make every effort to remove reference cycles and generally get the InProgress object destroyed as early as possible, but this changes subtly between Python versions. How will unhandled asynchronous exceptions be handled with tulip?

Thanks! Jason.

On Sun, Dec 16, 2012 at 11:11 AM, Jason Tackaberry <tack@urandom.ca> wrote:
Task is a subclass of Future; a Future may wrap some I/O or some other system call, but a Task wraps a coroutine. Bare coroutines are introduced by PEP 380, so it's no surprise you have to get used to them. But trust me, they are useful. I have a graphical representation in my head; drawing with a computer is not my strong point, but here's some ASCII art:

    [Task: coroutine -> coroutine -> ... -> coroutine)

The -> arrow represents a yield from, and each coroutine has its own stack frame (the frame's back pointer points left). The leftmost coroutine is the one you pass to Task(); the rightmost one is the one whose code is currently running. When it blocks for I/O, the entire stack is suspended; the Task object is given to the scheduler for resumption when the I/O completes. I'm drawing a '[' to the left of the Task because it is a definite end point; I'm drawing a ')' to the right of the last coroutine because whenever the coroutine uses yield from, another one gets added to the right.

When a coroutine blocks for a Future, it looks like this:

    [Task: coroutine -> coroutine -> ... -> coroutine -> Future]

(I'm using ']' here to suggest that the Future is also an end point.) When it blocks for a Task, it ends up looking like this:

    [Task 1: coroutine -> ... -> coroutine -> [Task 2: coroutine -> ... -> coroutine)]
The Tasks section in the PEP says that a bare coroutine (is this the same as the previously defined "coroutine object"?)
Yes.
No. This takes a lot of time to wrap your head around, but it is important to get this. This is because "yield from" is built into the language, and because of the way it is defined to behave. Suppose you have this:

    def inner():
        yield 1
        yield 2

    def outer():
        yield 'A'
        yield from inner()
        yield 'B'

    def main():
        # Not a generator -- no yield in sight!
        for x in outer():
            print(x)

The output of calling main() is as follows:

    A
    1
    2
    B

There is no scheduler in sight; this is basic Python 3. The Python 2 equivalent would have the middle line of outer() replaced by

    for x in inner():
        yield x

(It's more complicated when 'yield from' is used as an expression and when sending values or throwing exceptions into the outer generator, but that doesn't matter for this part of the explanation.)

Given that a coroutine function (despite being marked with @tulip.coroutine) is just a generator, when one coroutine invokes another via 'yield from', the scheduler doesn't find out about this at all. However, if a coroutine uses 'yield' instead of 'yield from', the scheduler *does* hear about it. The mechanism for this is best understood by looking at the Python 2 equivalent: each arrow in my diagrams stands for 'yield from', which you can replace by a for loop yielding each value, and thus the value yielded by the innermost coroutine ends up being yielded by the outermost one to the scheduler. The trick is that 'yield from' is implemented more efficiently than the equivalent for loop.

Another thing to keep in mind is that when you use yield from with a Future (or a Task, which is a subclass of Future), the Future has an __iter__() method that uses 'yield' (*not* 'yield from') to signal the scheduler that it is waiting for some I/O. (There's debate about whether you tell the scheduler what kind of I/O it should perform before invoking 'yield' or as a parameter to 'yield', but that's immaterial for understanding this part of the explanation.)
I could imagine that a coroutine object is implemented as a C object for performance,
Kind of -- the transfer is built into the Python interpreter.
You'll have to ask a Twisted person about Deferreds; they have all kinds of extra useful functionality related to error handling and chaining. (Apparently Deferreds became popular in the JS world after they were introduced in Twisted.) I find Deferreds elusive, and my PEP won't have them. (Coroutines take their place as the preferred way to write user code.) AFAICT a Promise is more like a Future, which is a much simpler thing.

Another difference between bare coroutines and Tasks: a bare coroutine *only* runs when another coroutine that is running is waiting for it using 'yield from'. But a coroutine wrapped in a Task will be run by the scheduler even when nobody is waiting for it. (In Kaa's world, which is similar to Twisted's @inlineCallbacks, Monocle, and Google App Engine's NDB, every coroutine is wrapped in something like a task.) This is the reason why the par() operation needs to wrap bare coroutine arguments in Tasks.
Yes. That's probably something you shouldn't be doing though. Also, cancel() sets a flag on the Task that remains set, and when the coroutine suspends itself in response to the CancelledError, the scheduler will just throw the exception into it again. Or perhaps it should throw something that's harder to catch? There's some similarity with the close() method on generators introduced by PEP 342; this causes GeneratorExit to be thrown into the generator (if it's not terminated), and if the generator chooses to catch and ignore this, the generator is declared dead anyway.
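[Editorial note: the PEP 342 close() behaviour mentioned here is easy to demonstrate with plain generators; this snippet runs as-is in Python 3.]

    def well_behaved():
        try:
            yield 1
        except GeneratorExit:
            print("cleaning up")
            raise  # or simply return; either way the generator ends

    def stubborn():
        try:
            yield 1
        except GeneratorExit:
            yield 2  # refuses to die

    g = well_behaved()
    next(g)
    g.close()              # prints "cleaning up" and returns quietly

    g = stubborn()
    next(g)
    try:
        g.close()
    except RuntimeError as e:
        print(e)           # "generator ignored GeneratorExit"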
Yeah, since you have a Task/Future at every level you are forced to do it that way.
Yeah, I think that the default behavior I sketched in my previous message is fine, and the user can implement other behaviors through a combination of Task wrappers, catching exceptions, and explicitly cancelling tasks.
Correct. If you have a simple stack-like usage pattern there's no need to introduce a Task; Tasks are useful if you want to decouple the stacks, e.g. have two other places both wait for the same Task (or for some other Future, for that matter).
I hope my explanation above of the relationship between Tasks and bare coroutines helps. I can see how it gets confusing if you are used to thinking in terms of a system where there is always a Task involved when one coroutine waits for another.
Right.
I think we're in violent agreement. :-)
Good point. I'd forgotten about this while writing the PEP, but Tulip v1 has this. The way to spell it is a little awkward and I could use some fresh ideas though. In Tulip v1 you can write

    ready_tasks = yield from wait_any(set_of_tasks)

The result ready_tasks is a set of tasks that are done; it has at least one element. This is a generalization of

    ready_tasks = yield from wait_for(N, set_of_tasks)

which returns a set of at least N done tasks; set N to the length of the input to implement waiting for all. But the semantics of always returning a set (even when N == 1) are somewhat awkward, and ideally you probably want something that you can call in a loop until all tasks are done, e.g.

    todo = <initial list of tasks>
    while todo:
        result = yield from wait_one(todo)
        <use result>

Here wait_one(todo) blocks until at least one task in todo is done, then removes it from todo, and returns its result (or raises its exception).
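[Editorial sketch: one conceivable shape for wait_one(), built on a PEP 3148-style Future extended with the __iter__() trick described earlier in this thread. The names and details are provisional, not the actual Tulip code.]

    def wait_one(todo):
        # Resolved by whichever task finishes first.
        waiter = Future()  # assumed: a yield-from-able tulip-style Future

        def on_done(task):
            if not waiter.done():
                waiter.set_result(task)

        for task in todo:
            task.add_done_callback(on_done)

        task = yield from waiter   # suspend until some task completes
        todo.remove(task)
        return task.result()       # re-raises the task's exception, if any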
Actually in Tulip you never combine result() with 'yield from' and you never use timeout=N with result(); this line would be written as follows: result = yield from Task(othercoroutine(), timeout=2)
When do you use that?
These semantics are built into the language as of Python 3.3; sure, it's a quality-of-implementation issue to make it as fast as possible, but the stack trace semantics are not optional, and if CPython can make it efficient then other implementations will try to compete by making it even more efficient. :-) There are still some optimizations possible beyond what Python 3.3 currently does; maybe 3.3.1 or 3.4 will speed it up even more. (In particular, in the ideal implementation, a yield at a deeply nested coroutine should reach the caller of the outermost coroutine in O(1) time rather than O(N), where N is the stack depth. I think it is currently O(N) with a rather small constant factor.)
That's a good optimization if your semantics require a Task at every level. But IIRC (from implementing something like this myself for NDB) it is quite subtle to get it right in all edge cases. And you still have at least two Python function invocations for every level of coroutine invocation.
Misunderstood what? You are describing Kaa here. :-)
I'm sorry I don't have a reference implementation available yet. I hope to finish one before Christmas.
(Aside: I can never get used to that terminology; I am too used to the UNIX meaning of "signal". It sounds like a publish-subscribe mechanism.)
That example is a little beyond my comprehension. I'm guessing though that you could probably cobble something like this together from the wait_one() primitive I described above. Or perhaps we need a set of synchronization primitives similar to those provided by threading.py: Lock, Condition, Semaphore, Event, Barrier, and some variations.
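[Editorial sketch: for the wake-a-waiting-coroutine use case, an Event-style primitive could conceivably be built on Futures along these lines. Names are provisional and this assumes the yield-from-able Future discussed above.]

    class Event:
        def __init__(self):
            self._waiters = []
            self._set = False

        def set(self):
            # Wake everyone currently waiting.
            self._set = True
            for fut in self._waiters:
                if not fut.done():
                    fut.set_result(None)
            self._waiters = []

        def clear(self):
            self._set = False

        def wait(self):
            # Coroutine: returns immediately if already set,
            # otherwise suspends until set() is called.
            if self._set:
                return
            fut = Future()
            self._waiters.append(fut)
            yield from fut

The download-queue manager could then wait on either a finished download or the "job enqueued" event, e.g. with something like wait_one() over the download tasks plus Task(jobs_added.wait()).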
That's actually a clever idea: log the exception when the Task object is destroyed if it hasn't been raised (from result()) or inspected (using exception()) at least once. I know these have been haunting me in NDB -- it logs all, some or none of the exceptions depending on the log settings, but that's not right, and your approach is much better. So it may come down to implementation cleverness to try and GC Task objects sooner rather than later -- which will also depend on the Python implementation. In the end, debugging convenience cannot help but depend on the implementation. -- --Guido van Rossum (python.org/~guido)
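[Editorial sketch: the "log on destruction if never inspected" mechanics can be expressed in a few lines; this is illustrative only and independent of Tulip's eventual design.]

    import logging

    class LoggingFuture:
        def __init__(self):
            self._exception = None
            self._retrieved = False

        def set_exception(self, exc):
            self._exception = exc

        def exception(self):
            self._retrieved = True
            return self._exception

        def result(self):
            self._retrieved = True
            if self._exception is not None:
                raise self._exception
            # ... return the stored result otherwise (elided)

        def __del__(self):
            # Only complain if the failure was never surfaced anywhere.
            if self._exception is not None and not self._retrieved:
                logging.error("Future exception was never retrieved: %r",
                              self._exception)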

participants (5)

- Antoine Pitrou
- Benoit Chesneau
- Guido van Rossum
- Jason Tackaberry
- Vinay Sajip