[Python-ideas] Late to the async party (PEP 3156)

Guido van Rossum guido at python.org
Sun Dec 16 23:23:51 CET 2012


On Sun, Dec 16, 2012 at 11:11 AM, Jason Tackaberry <tack at urandom.ca> wrote:

>  On 12-12-16 11:27 AM, Guido van Rossum wrote:
>
> The PEP is definitely weak. Here are some thoughts/proposals though:
>
>    - You can't cancel a coroutine; however you can cancel a Task, which
>    is a Future wrapping a stack of coroutines linked via yield-from.
>
>
> I'll just underline your statement that "you can't cancel a coroutine"
> here, since I'm referencing it later.
>
> This distinction between "bare" coroutines, Futures, and Tasks is a bit
> foreign to me, since in Kaa all coroutines return (a subclass of)
> InProgress objects.
>

Task is a subclass of Future; a Future may wrap some I/O or some other
system call, but a Task wraps a coroutine. Bare coroutines were introduced
by PEP 380, so it's no surprise you still have to get used to them. But
trust me, they are useful.

I have a graphical representation in my head; drawing with a computer is
not my strong point, but here's some ASCII art:

  [Task: coroutine -> coroutine -> ... -> coroutine)

The -> arrow represents a yield from, and each coroutine has its own stack
frame (the frame's back pointer points left). The leftmost coroutine is the
one you pass to Task(); the rightmost one is the one whose code is
currently running. When it blocks for I/O, the entire stack is suspended;
the Task object is given to the scheduler for resumption when the I/O
completes.

I'm drawing a '[' to the left of the Task because it is a definite end
point; I'm drawing a ')' to the right of the last coroutine because
whenever the coroutine uses yield from, another one gets added to the right.

When a coroutine blocks for a Future, it looks like this:

  [Task: coroutine -> coroutine -> ... -> coroutine -> Future]

(I'm using ']' here to suggest that the Future is also an end point.)

When it blocks for a Task, it ends up looking like this:

  [Task 1: coroutine -> ... -> coroutine -> [Task 2: coroutine -> ... ->
coroutine)]


>  The Tasks section in the PEP says that a bare coroutine (is this the same
> as the previously defined "coroutine object"?)
>

Yes.


>  has much less overhead than a Task but it's not clear to me why that
> would be, as both would ultimately need to be managed by the scheduler,
> wouldn't they?
>

No.

This takes a lot of time to wrap your head around, but it is important to
get this. It works because "yield from" is built into the language, and
because of the way it is defined to behave. Suppose you have this:

  def inner():
    yield 1
    yield 2

  def outer():
    yield 'A'
    yield from inner()
    yield 'B'

  def main():  # Not a generator -- no yield in sight!
    for x in outer():
      print(x)

The output of calling main() is as follows:

A
1
2
B

There is no scheduler in sight; this is basic Python 3. The Python 2
equivalent would have the middle line of outer() replaced by

    for x in inner():
      yield x

(It's more complicated when 'yield from' is used as an expression and when
sending values or throwing exceptions into the outer generator, but that
doesn't matter for this part of the explanation.)

Given that a coroutine function (despite being marked with
@tulip.coroutine) is just a generator, when one coroutine invokes another
via 'yield from', the scheduler doesn't find out about this at all.

However, if a coroutine uses 'yield' instead of 'yield from', the scheduler
*does* hear about it. The mechanism for this is best understood by looking
at the Python 2 equivalent: each arrow in my diagrams stands for 'yield
from', which you can replace by a for loop yielding each value, and thus
the value yielded by the innermost coroutine ends up being yielded by the
outermost one to the scheduler.

The trick is that 'yield from' is implemented more efficiently than the
equivalent for loop.
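To make this concrete, here is a toy driver (plain Python 3, no tulip; all names are made up for illustration) showing that a plain 'yield' deep inside a 'yield from' chain surfaces directly at the outermost caller, while the intermediate frames stay invisible to the driver:

```python
def innermost():
    result = yield 'need-io'   # plain 'yield': this is what the driver sees
    return result

def middle():
    value = yield from innermost()  # transparent: the driver never sees this frame
    return value * 2

def outermost():
    value = yield from middle()
    return value + 1

def drive(gen):
    # Stands in for the scheduler: it only ever talks to the outermost
    # generator, yet receives what the innermost one yielded.
    request = next(gen)
    assert request == 'need-io'
    try:
        gen.send(10)  # pretend the I/O completed with result 10
    except StopIteration as exc:
        return exc.value

print(drive(outermost()))  # -> 21
```

The send(10) resumes the innermost frame directly; each 'return' then propagates back up through the yield-from chain as a StopIteration value.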

Another thing to keep in mind is that when you use yield from with a Future
(or a Task, which is a subclass of Future), the Future has an __iter__()
method that uses 'yield' (*not* 'yield from') to signal the scheduler that
it is waiting for some I/O. (There's debate about whether you tell the
scheduler what kind of I/O it should perform before invoking 'yield' or as
a parameter to 'yield', but that's immaterial for understanding this part
of the explanation.)
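A sketch of that trick (this is not the actual tulip source; the class is stripped down to the one mechanism being described):

```python
class Future:
    def __init__(self):
        self._result = None
        self.done = False

    def set_result(self, value):
        self._result = value
        self.done = True

    def __iter__(self):
        if not self.done:
            yield self          # plain 'yield': surfaces at the scheduler
        return self._result     # becomes the value of 'yield from future'

def coro(fut):
    value = yield from fut
    return value.upper()

fut = Future()
task = coro(fut)
pending = next(task)        # the scheduler receives the Future itself
assert pending is fut
fut.set_result('done')      # pretend the I/O completed
try:
    task.send(None)
except StopIteration as exc:
    print(exc.value)        # -> DONE
```

Because __iter__ uses a bare 'yield', the entire stack of coroutines linked by 'yield from' suspends in one step, and the scheduler gets the Future object to attach its completion callback to.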


> I could imagine that a coroutine object is implemented as a C object for
> performance,
>

Kind of -- the transfer is built into the Python interpreter.


> and a Task is a Python class, and maybe that explains the difference.  But
> then why differentiate between Future and Task (particularly because they
> have the same interface, so I can't draw an analogy with jQuery's Deferreds
> and Promises, where Promises are a restricted form of Deferreds for public
> consumption to attach callbacks).
>

You'll have to ask a Twisted person about Deferreds; they have all kinds of
extra useful functionality related to error handling and chaining.
(Apparently Deferreds became popular in the JS world after they were
introduced in Twisted.) I find Deferreds elusive, and my PEP won't have
them. (Coroutines take their place as the preferred way to write user
code.) AFAICT a Promise is more like a Future, which is a much simpler
thing.

Another difference between bare coroutines and Tasks: a bare coroutine
*only* runs when another coroutine that is running is waiting for it using
'yield from'. But a coroutine wrapped in a Task will be run by the
scheduler even when nobody is waiting for it. (In Kaa's world, which is
similar to Twisted's @inlineCallbacks, Monocle, and Google App Engine's
NDB, every coroutine is wrapped in something like a task.)

This is the reason why the par() operation needs to wrap bare coroutine
arguments in Tasks.
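The wrapping step could look like this (a sketch with a stub Task class; not the proposed API, and the real par() would also accept plain Futures and then wait for everything):

```python
class Task:
    """Stub Task for illustration; the real one would also schedule
    the wrapped coroutine to run on the event loop."""
    def __init__(self, coro):
        self.coro = coro

def par(*args):
    # Only the classification step: bare coroutines get wrapped in
    # Tasks so the scheduler runs them independently; Tasks (and
    # Futures) are awaited as-is.
    return [arg if isinstance(arg, Task) else Task(arg) for arg in args]

def ping():
    yield  # placeholder coroutine body

wrapped = par(ping(), Task(ping()))
assert all(isinstance(w, Task) for w in wrapped)
```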


>
>    - Cancellation only takes effect when a task is suspended.
>
>
> Yes, this is intuitive.
>
>
>
>
>    - When you cancel a Task, the most deeply nested coroutine (the one
>    that caused it to be suspended) receives a special exception (I propose to
>    reuse concurrent.futures.CancelledError from PEP 3148). If it doesn't catch
>    this it bubbles all the way to the Task, and then out from there.
>
>
> So if the most deeply nested coroutine catches the CancelledError and
> doesn't reraise, it can prevent its cancellation?
>

Yes. That's probably something you shouldn't be doing though. Also,
cancel() sets a flag on the Task that remains set, and when the coroutine
suspends itself in response to the CancelledError, the scheduler will just
throw the exception into it again.

Or perhaps it should throw something that's harder to catch? There's some
similarity with the close() method on generators introduced by PEP 342;
this causes GeneratorExit to be thrown into the generator (if it's not
terminated), and if the generator chooses to catch and ignore this, the
generator is declared dead anyway.
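The close() behavior is easy to demonstrate in plain Python 3:

```python
def stubborn():
    try:
        yield 1
    except GeneratorExit:
        yield 2   # refusing to exit after GeneratorExit

gen = stubborn()
next(gen)        # advance to the first yield
try:
    gen.close()  # throws GeneratorExit in; generator yields anyway
except RuntimeError as exc:
    print('generator refused to exit:', exc)
```

So even though the generator swallowed GeneratorExit, close() declares it dead by raising RuntimeError in the caller.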


> I took a similar approach, except that coroutines can't abort their own
> cancellation, and whether or not the nested coroutines actually get
> cancelled depends on whether something else was interested in their result.
>

Yeah, since you have a Task/Future at every level you are forced to do it
that way.


>  Consider a coroutine chain where A yields B yields C yields D, and we do
> B.abort()
>
>    - if only C was interested in D's result, then D will get an
>    InProgressAborted raised inside it (at whatever point it's currently
>    suspended).  If something other than C was also waiting on D, D will not be
>    affected
>     - similarly, if only B was interested in C's result, then C will get
>    an InProgressAborted raised inside it (at yield D).
>     - B will get InProgressAborted raised inside it (at yield C)
>     - for B, C and D, the coroutines will not be reentered and they are
>    not allowed to yield a value that suggests they expect reentry.  There's
>    nothing a coroutine can do to prevent its own demise.
>    - A will get an InProgressAborted raised inside it (at yield B)
>    - In all the above cases, the InProgressAborted instance has an origin
>    attribute that is B's InProgress object
>    - Although B, C, and D are now aborted, A isn't aborted.  It's allowed
>    to yield again.
>    - with Kaa, coroutines are abortable by default (so they are like
>    Tasks always).  But in this example, B can prevent C from being aborted by
>    yielding C().noabort()
>
>
> There are quite a few scenarios to consider: A yields B and B is cancelled
> or raises; A yields B and A is cancelled or raises; A yields B, C yields B,
> and A is cancelled or raises; A yields B, C yields B, and A or C is
> cancelled or raises; A yields par(B,C,D) and B is cancelled or raises; etc,
> etc.
>
> In my experience, there's no one-size-fits-all behaviour, and the best we
> can do is have sensible default behaviour with some API (different
> functions, kwargs, etc.) to control the cancellation propagation logic.
>

Yeah, I think that the default behavior I sketched in my previous message
is fine, and the user can implement other behaviors through a combination
of Task wrappers, catching exceptions, and explicitly cancelling tasks.


>
>    - However when a coroutine in one Task uses yield-from to wait for
>    another Task, the latter does not automatically get cancelled. So this is a
>    difference between "yield from foo()" and "yield from Task(foo())", which
>    otherwise behave pretty similarly. Of course the first Task could catch the
>    exception and cancel the second task -- that is its responsibility though
>    and not the default behavior.
>
>
> Ok, so nested bare coroutines will get cancelled implicitly, but nested
> Tasks won't?
>

Correct. If you have a simple stack-like usage pattern there's no need to
introduce a Task; Tasks are useful if you want to decouple the stacks, e.g.
have two other places both wait for the same Task (or for some other
Future, for that matter).
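One way to see why decoupling requires a Task or Future: a bare generator has a single frame, so only one consumer can usefully drive it, and two coroutines that both 'yield from' the same bare coroutine step on each other. A toy illustration (no scheduler involved):

```python
def work():
    yield
    return 42

shared = work()

def waiter(gen):
    value = yield from gen
    return value

a = waiter(shared)
b = waiter(shared)
next(a)               # a starts driving 'shared' (suspended at its yield)
try:
    next(b)           # b resumes the *same* generator and steals the result
except StopIteration as exc:
    print(exc.value)  # -> 42: b got it; a is left stranded
```

A Task (or Future) fixes this because each waiter yields from the Future's own iterator, and the single result is handed to every waiter via callbacks.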


> I'm having a bit of difficulty with this one.  You said that coroutines
> can't be cancelled, but Tasks can be.  But here, if they are being yielded,
> the opposite behaviour applies: yielded coroutines *are* cancelled if a
> Task is cancelled, but yielded tasks *aren't*.
>
> Or have I misunderstood?
>

I hope my explanation above of the relationship between Tasks and bare
coroutines helps. I can see how it gets confusing if you are used to
thinking in terms of a system where there is always a Task involved when
one coroutine waits for another.


>    - PEP 3156 has a par() helper which lets you block for multiple
>    tasks/coroutines in parallel. It takes arguments which are either
>    coroutines, Tasks, or other Futures; it wraps the coroutines in Tasks to
>    run them independently and just waits for the other arguments. Proposal:
>    when the Task containing the par() call is cancelled, the par() call
>    intercepts the cancellation and by default cancels those coroutines that
>    were passed in "bare" but not the arguments that were passed in as Tasks or
>    Futures. Some keyword argument to par() may be used to change this behavior
>    to "cancel none" or "cancel all" (exact API spec TBD).
>
>
> Here again, par() would cancel a bare coroutine but not Tasks.  It's
> consistent with your previous bullet but seems to contradict your first
> bullet that you can't cancel a coroutine.
>
> I guess the distinction is you can't explicitly cancel a coroutine, but
> coroutines can be implicitly cancelled?
>

Right.


> As I discussed previously, one of those tasks might be yielded by some
> other active coroutine, and so cancelling it may not be the right thing to
> do.  Being able to control this behaviour is important, whether that's a
> par() kwarg, or special method like noabort() that constructs an
> unabortable Task instance.
>

I think we're in violent agreement. :-)


> Kaa has similar constructs to allow yielding a collection of InProgress
> objects (whatever they might represent: coroutines, threaded functions,
> etc.).  In particular, it allows you to yield multiple tasks and resume
> when ALL of them complete (InProgressAll), or when ANY of them complete
> (InProgressAny).  For example:
>
>     @kaa.coroutine()
>     def is_any_host_up(*hosts):
>         try:
>             # ping() is a coroutine
>             yield kaa.InProgressAny(ping(host) for host in hosts).timeout(5, abort=True)
>         except kaa.TimeoutException:
>             yield False
>         else:
>             yield True
>
>
> More details here:
>
>
> http://api.freevo.org/kaa-base/async/inprogress.html#inprogress-collections
>
> From what I understand of the proposed par() it would require ALL of
> the supplied futures to complete, but there are many use-cases for the ANY
> variant as well.
>

Good point. I'd forgotten about this while writing the PEP, but Tulip v1
has this. The way to spell it is a little awkward and I could use some
fresh ideas though. In Tulip v1 you can write

  ready_tasks = yield from wait_any(set_of_tasks)

The result ready_tasks is a set of tasks that are done; it has at least one
element. This is a generalization of

  ready_tasks = yield from wait_for(N, set_of_tasks)

which returns a set of at least N done tasks; set N to the length of
the input to implement waiting for all. But the semantics of always
returning a set (even when N == 1) are somewhat awkward, and ideally you
probably want something that you can call in a loop until all tasks are
done, e.g.

  todo = <initial list of tasks>
  while todo:
    result = yield from wait_one(todo)
    <use result>

Here wait_one(todo) blocks until at least one task in todo is done, then
removes it from todo, and returns its result (or raises its exception).
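To make the loop concrete, here is a toy, fully synchronous stand-in (StubTask and this wait_one are invented for illustration; the real wait_one would block in the event loop until one of the tasks actually completes):

```python
class StubTask:
    """Stands in for a completed Task; real ones finish via the event loop."""
    def __init__(self, result):
        self._result = result
    def result(self):
        return self._result

def wait_one(todo):
    # The real version would suspend until at least one task in 'todo'
    # is done; here every stub is finished already, so just pop one.
    task = todo.pop()
    return task.result()

todo = {StubTask(1), StubTask(2), StubTask(3)}
results = []
while todo:
    results.append(wait_one(todo))
print(sorted(results))  # -> [1, 2, 3]
```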


>  Interesting. In Tulip v1 (the experimental version I wrote before PEP
> 3156) the Task() constructor has an optional timeout argument. It works by
> scheduling a callback at the given time in the future, and the callback
> simply cancel the task (which is a no-op if the task has already
> completed). It works okay, except it generates tracebacks that are
> sometimes logged and sometimes not properly caught -- though some of that
> may be my messy test code. The exception raised by a timeout is the same
> CancelledError, which is somewhat confusing. I wonder if Task.cancel()
> shouldn't take an exception with which to cancel the task.
> (TimeoutError in PEP 3148 has a different role, it is when the timeout on a
> specific wait expires, so e.g. fut.result(timeout=2) waits up to 2 seconds
> for fut to complete, and if not, the call raises TimeoutError, but the code
> running in the executor is unaffected.)
>
>
> FWIW, the equivalent in Kaa which is InProgress.abort() does take an
> optional exception, which must subclass InProgressAborted.  If None, a new
> InProgressAborted is created.  InProgress.timeout(t) will start a timer
> that invokes InProgress.abort(TimeoutException()) (TimeoutException
> subclasses InProgressAborted).
>
> It sounds like your proposed implementation works like:
>
>    @tulip.coroutine()
>    def foo():
>       try:
>          result = yield from Task(othercoroutine()).result(timeout=2)
>
>
Actually in Tulip you never combine result() with 'yield from' and you
never use timeout=N with result(); this line would be written as follows:

    result = yield from Task(othercoroutine(), timeout=2)


>       except TimeoutError:
>          # ... othercoroutine() still lives on
>
>
> I think Kaa's syntax is cleaner but it seems functionally the same:
>
>    @kaa.coroutine()
>    def foo():
>       try:
>          result = yield othercoroutine().timeout(2)
>       except kaa.TimeoutException:
>          # ... othercoroutine() still lives on
>
>
> It's also possible to conveniently ensure that othercoroutine() is aborted
> if the timeout elapses:
>
>    try:
>       result = yield othercoroutine().timeout(2, abort=True)
>    except kaa.TimeoutException:
>       # ... othercoroutine() is aborted
>
>
When do you use that?

> We've had long discussions about yield vs. yield-from. The latter is way
> more efficient and that's enough for me to push it through. When using
> yield, each yield causes you to bounce to the scheduler, which has to do a
> lot of work to decide what to do next, even if that is just resuming the
> suspended generator; and the scheduler is responsible for keeping track of
> the stack of generators. When using yield-from, calling another coroutine
> as a subroutine is almost free and doesn't involve the scheduler at all;
> thus it's much cheaper, and the scheduler can be simpler (doesn't need to
> keep track of the stack). Also stack traces and debugging are better.
>
>
> But this sounds like a consequence of a particular implementation, isn't
> it?
>

These semantics are built into the language as of Python 3.3; sure, it's a
quality of implementation issue to make it as fast as possible, but the
stack trace semantics are not optional, and if CPython can make it
efficient then other implementations will try to compete by making it even
more efficient. :-)

There are still some optimizations possible beyond what Python 3.3
currently does; maybe 3.3.1 or 3.4 will speed it up even more. (In
particular, in the ideal implementation, a yield at a deeply nested
coroutine should reach the caller of the outermost coroutine in O(1) time
rather than O(N), where N is the stack depth. I think it is currently O(N)
with a rather small constant factor.)


> A @kaa.coroutine() decorated function is entered right away when invoked,
> and the decorator logic does as much as it can until the underlying
> generator yields an unfinished InProgress that needs to wait for (or
> kaa.NotFinished).  Once it yields, *then* the decorator sets up the
> necessary hooks with the scheduler / event loop.
>

That's a good optimization if your semantics require a Task at every level.
But IIRC (from implementing something like this myself for NDB) it is quite
subtle to get it right in all edge cases. And you still have at least two
Python function invocations for every level of coroutine invocation.


> This means you can nest a stack of coroutines without involving the
> scheduler until something truly asynchronous needs to take place.
>
> Have I misunderstood?
>

Misunderstood what? You are describing Kaa here. :-)


>
>>    - coroutines can have certain policies that control invocation
>>    behaviour.  The most obvious ones to describe are POLICY_SYNCHRONIZED which
>>    ensures that multiple invocations of the same coroutine are serialized, and
>>    POLICY_SINGLETON which effectively ignores subsequent invocations if it's
>>    already running
>>    - it is possible to have a special progress object passed into the
>>    coroutine function so that the coroutine's progress can be communicated to
>>    an outside observer
>>
>>
>  These seem pretty esoteric and can probably be implemented in user code if
> needed.
>
>
> I'm fine with that, provided the flexibility is there to allow for it.
>
>
>
>  As I said, I think wait_for_future() and run_in_executor() in the PEP
> give you all you need. The @threaded decorator you propose is just sugar;
> if a user wants to take an existing API and convert it from a coroutine to
> threaded without requiring changes to the caller, they can just introduce a
> helper that is run in a thread with run_in_executor().
>
>
> Also works for me. :)
>
>
>
>  Thanks for your very useful contribution! Kaa looks like an interesting
> system. Is it ported to Python 3 yet? Maybe you could look into integrating
> with the PEP 3156 event loop and/or scheduler.
>
>
> Kaa does work with Python 3, yes, although it still lacks very much needed
> unit tests so I'm not completely confident it has the same functional
> coverage as Python 2.
>
> I'm definitely interested in having it conform to whatever shakes out of
> PEP 3156, which is why I'm speaking up now. :)
>

I'm sorry I don't have a reference implementation available yet. I hope to
finish one before Christmas.


> I've a couple other subjects I should bring up:
>
> Tasks/Futures as "signals": it's often necessary to be able to resume a
> coroutine based on some condition other than e.g. any IO tasks it's waiting
> on.  For example, in one application, I have a (POLICY_SINGLETON) coroutine
> that works off a download queue.  If there's nothing in the queue, it's
> suspended at a yield.  It's the coroutine equivalent of a dedicated thread.
> [1]
>
> It must be possible to "wake" the queue manager when I enqueue a job for
> it.  Kaa has this notion of "signals" which is similar to the gtk+ style of
> signals in that you can attach callbacks to them and emit them.  Signals
> can be represented as InProgress objects, which means they can be yielded
> from coroutines and used in InProgressAny/All objects.
>

(Aside: I can never get used to that terminology; I am too used to the UNIX
meaning of "signal". It sounds like a publish-subscribe mechanism.)


>
> So my download manager coroutine can yield an InProgressAny of all the
> active download coroutines *and* the "new job enqueued" signal, and
> execution will resume as long as any of those conditions are met.
>
> Is there anything in your current proposal that would allow for this
> use-case?
>
> [1]
> https://github.com/jtackaberry/stagehand/blob/master/src/manager.py#L390
>

That example is a little beyond my comprehension. I'm guessing though that
you could probably cobble something like this together from the wait_one()
primitive I described above. Or perhaps we need a set of synchronization
primitives similar to those provided by threading.py: Lock, Condition,
Semaphore, Event, Barrier, and some variations.
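Something like this could indeed be cobbled together. A minimal sketch, with all names invented, of the "wake the consumer when a job is enqueued" pattern, using a stored callback where the real thing would use a Future the consumer yields from:

```python
import collections

class SimpleQueue:
    def __init__(self):
        self._items = collections.deque()
        self._waiter = None            # slot for a suspended consumer

    def get_later(self, callback):
        # Consumer side: deliver an item as soon as one is available.
        if self._items:
            callback(self._items.popleft())
        else:
            self._waiter = callback    # "suspend" until put() fires it

    def put(self, item):
        # Producer side: wake the waiting consumer, or buffer the item.
        if self._waiter is not None:
            callback, self._waiter = self._waiter, None
            callback(item)
        else:
            self._items.append(item)

q = SimpleQueue()
received = []
q.get_later(received.append)   # the "coroutine" suspends: queue is empty
q.put('job-1')                 # the producer wakes it
print(received)  # -> ['job-1']
```

In the coroutine version, get_later would instead return a Future, so a download-manager coroutine could wait on it together with its active download tasks.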



> Another pain point for me has been this notion of unhandled asynchronous
> exceptions.  Asynchronous tasks are represented as an InProgress object,
> and if a task fails, accessing InProgress.result will raise the exception
> at which point it's considered handled.  This attribute access could happen
> at any time during the lifetime of the InProgress object, outside the
> task's call stack.
>
> The desirable behaviour is that when the InProgress object is destroyed,
> if there's an exception attached to it from a failed task that hasn't been
> accessed, we should output the stack as an unhandled exception.  In Kaa, I
> do this with a weakref destroy callback, but this isn't ideal because with
> GC, the InProgress might not be destroyed until well after the exception is
> relevant.
>
> I make every effort to remove reference cycles and generally get the
> InProgress object destroyed as early as possible, but this changes subtly
> between Python versions.
>
> How will unhandled asynchronous exceptions be handled with tulip?
>

That's actually a clever idea: log the exception when the Task object is
destroyed if it hasn't been raised (from result()) or inspected (using
exception()) at least once. I know these have been haunting me in NDB -- it
logs all, some or none of the exceptions depending on the log settings, but
that's not right, and your approach is much better.

So it may come down to implementation cleverness to try and GC Task objects
sooner rather than later -- which will also depend on the Python
implementation. In the end, debugging convenience cannot help but depend on
the implementation.

-- 
--Guido van Rossum (python.org/~guido)