<html>
<head>
<meta content="text/html; charset=ISO-8859-1"
http-equiv="Content-Type">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<div class="moz-cite-prefix">On 12-12-16 11:27 AM, Guido van Rossum
wrote:<br>
</div>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>The PEP is definitely weak. Here are some thoughts/proposals
though:</div>
<div>
<ul>
<li>You can't cancel a coroutine; however you can cancel a
Task, which is a Future wrapping a stack of coroutines
linked via yield-from.</li>
</ul>
</div>
</blockquote>
<br>
I'll just underline your statement that "you can't cancel a
coroutine" here, since I'm referencing it later.<br>
<br>
This distinction between "bare" coroutines, Futures, and Tasks is a
bit foreign to me, since in Kaa all coroutines return (a subclass
of) InProgress objects.<br>
<br>
The Tasks section in the PEP says that a bare coroutine (is this the
same as the previously defined "coroutine object"?) has much less
overhead than a Task but it's not clear to me why that would be, as
both would ultimately need to be managed by the scheduler, wouldn't
they?<br>
<br>
I could imagine that a coroutine object is implemented as a C object
for performance, and a Task is a Python class, and maybe that
explains the difference. But then why differentiate between Future
and Task (particularly because they have the same interface, so I
can't draw an analogy with jQuery's Deferreds and Promises, where
Promises are a restricted form of Deferreds for public consumption
to attach callbacks).<br>
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>
<ul>
<li>Cancellation only takes effect when a task is suspended.</li>
</ul>
</div>
</blockquote>
<br>
Yes, this is intuitive.<br>
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>
<ul>
<li>When you cancel a Task, the most deeply nested coroutine
(the one that caused it to be suspended) receives a special
exception (I propose to reuse
concurrent.futures.CancelledError from PEP 3148). If it
doesn't catch this it bubbles all the way to the Task, and
then out from there.</li>
</ul>
</div>
</blockquote>
<br>
So if the most deeply nested coroutine catches the CancelledError
and doesn't reraise, it can prevent its cancellation?<br>
<br>
I took a similar approach, except that coroutines can't block their
own cancellation, and whether or not the nested coroutines actually
get cancelled depends on whether something else was interested in
their result.<br>
<br>
Consider a coroutine chain where A yields B yields C yields D, and
we do B.abort():<br>
<ul>
<li>if only C was interested in D's result, then D will get an
InProgressAborted raised inside it (at whatever point it's
currently suspended). If something other than C was also
waiting on D, D will not be affected<br>
</li>
<li>similarly, if only B was interested in C's result, then C will
get an InProgressAborted raised inside it (at yield D).<br>
</li>
<li>B will get InProgressAborted raised inside it (at yield C)<br>
</li>
<li>for B, C and D, the coroutines will not be reentered and they
are not allowed to yield a value that suggests they expect
reentry. There's nothing a coroutine can do to prevent its own
demise.</li>
<li>A will get an InProgressAborted raised inside it (at yield B)</li>
<li>In all the above cases, the InProgressAborted instance has an
origin attribute that is B's InProgress object</li>
<li>Although B, C, and D are now aborted, A isn't aborted. It's
allowed to yield again.</li>
<li>with Kaa, coroutines are abortable by default (so they always
behave like Tasks). But in this example, B can prevent C from
being aborted by yielding C().noabort()</li>
</ul>
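To make the "raised inside it at the yield" mechanics concrete, here is a minimal sketch using plain Python generators, with no Kaa at all; the InProgressAborted class here is just an illustrative stand-in for Kaa's exception:<br>

```python
class InProgressAborted(Exception):
    """Stand-in for Kaa's InProgressAborted exception (illustrative only)."""

def download():
    try:
        yield "suspended"            # parked here, waiting on some async result
    except InProgressAborted:
        print("cleaning up")         # may clean up, but cannot veto the abort
        raise

gen = download()
next(gen)                            # run the coroutine up to its yield
try:
    gen.throw(InProgressAborted())   # abort: raised at the current yield point
except InProgressAborted:
    print("abort propagated out of the coroutine")
```

The coroutine sees the exception exactly at the point where it was suspended, can run cleanup code, but the abort still propagates outward.<br>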
<br>
There are quite a few scenarios to consider: A yields B and B is
cancelled or raises; A yields B and A is cancelled or raises; A
yields B, C yields B, and A is cancelled or raises; A yields B, C
yields B, and A or C is cancelled or raises; A yields par(B,C,D) and
B is cancelled or raises; etc, etc.<br>
<br>
In my experience, there's no one-size-fits-all behaviour, and the
best we can do is have sensible default behaviour with some API
(different functions, kwargs, etc.) to control the cancellation
propagation logic.<br>
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>
<ul>
<li>However when a coroutine in one Task uses yield-from to
wait for another Task, the latter does not automatically get
cancelled. So this is a difference between "yield from
foo()" and "yield from Task(foo())", which otherwise behave
pretty similarly. Of course the first Task could catch the
exception and cancel the second task -- that is its
responsibility though and not the default behavior.</li>
</ul>
</div>
</blockquote>
<br>
Ok, so nested bare coroutines will get cancelled implicitly, but
nested Tasks won't?<br>
<br>
I'm having a bit of difficulty with this one. You said that
coroutines can't be cancelled, but Tasks can be. But here, if they
are being yielded, the opposite behaviour applies: yielded
coroutines <i>are</i> cancelled if a Task is cancelled, but yielded
tasks <i>aren't</i>.<br>
<br>
Or have I misunderstood?<br>
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>
<ul>
<li>PEP 3156 has a par() helper which lets you block for
multiple tasks/coroutines in parallel. It takes arguments
which are either coroutines, Tasks, or other Futures; it
wraps the coroutines in Tasks to run them independently and
just waits for the other arguments. Proposal: when the Task
containing the par() call is cancelled, the par() call
intercepts the cancellation and by default cancels those
coroutines that were passed in "bare" but not the arguments
that were passed in as Tasks or Futures. Some keyword
argument to par() may be used to change this behavior to
"cancel none" or "cancel all" (exact API spec TBD).</li>
</ul>
</div>
</blockquote>
<br>
Here again, par() would cancel a bare coroutine but not Tasks. It's
consistent with your previous bullet but seems to contradict your
first bullet that you can't cancel a coroutine.<br>
<br>
I guess the distinction is you can't explicitly cancel a coroutine,
but coroutines can be implicitly cancelled?<br>
<br>
As I discussed previously, one of those tasks might be yielded by
some other active coroutine, and so cancelling it may not be the
right thing to do. Being able to control this behaviour is
important, whether that's a par() kwarg or a special method like
noabort() that constructs an unabortable Task instance.<br>
<br>
Kaa has similar constructs to allow yielding a collection of
InProgress objects (whatever they might represent: coroutines,
threaded functions, etc.). In particular, it allows you to yield
multiple tasks and resume when ALL of them complete (InProgressAll),
or when ANY of them complete (InProgressAny). For example:<br>
<br>
<pre> @kaa.coroutine()
 def is_any_host_up(*hosts):
     try:
         # ping() is a coroutine
         yield kaa.InProgressAny(ping(host) for host in hosts).timeout(5, abort=True)
     except kaa.TimeoutException:
         yield False
     else:
         yield True
</pre>
<br>
More details here:<br>
<br>
<a class="moz-txt-link-freetext" href="http://api.freevo.org/kaa-base/async/inprogress.html#inprogress-collections">http://api.freevo.org/kaa-base/async/inprogress.html#inprogress-collections</a><br>
<br>
From what I understand of the proposed par() it would require ALL
of the supplied futures to complete, but there are many use-cases
for the ANY variant as well.<br>
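For what it's worth, an ANY-style combinator falls out naturally from a callback-based future. This is a hedged sketch on top of a deliberately minimal Future class; the names (any_of, add_done_callback) are illustrative, not PEP 3156's actual API:<br>

```python
class Future:
    """Deliberately minimal callback-based future (illustrative only)."""
    def __init__(self):
        self._result = None
        self._done = False
        self._callbacks = []

    def add_done_callback(self, fn):
        if self._done:
            fn(self)
        else:
            self._callbacks.append(fn)

    def set_result(self, value):
        self._done = True
        self._result = value
        for fn in self._callbacks:
            fn(self)

    def result(self):
        return self._result

def any_of(*futures):
    """Future that completes when the FIRST input future completes."""
    outer = Future()
    def on_done(f):
        if not outer._done:   # first completion wins
            outer.set_result(f)
        # the remaining futures keep running; a kwarg could cancel them instead
    for f in futures:
        f.add_done_callback(on_done)
    return outer

a, b = Future(), Future()
first = any_of(a, b)
b.set_result("pong from host2")
print(first.result().result())   # prints: pong from host2
```

Whether the losing futures are then aborted is exactly the kind of policy that, as above, probably wants to be controllable by the caller.<br>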
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>Interesting. In Tulip v1 (the experimental version I wrote
before PEP 3156) the Task() constructor has an optional timeout
argument. It works by scheduling a callback at the given time in
the future, and the callback simply cancels the task (which is a
no-op if the task has already completed). It works okay, except
it generates tracebacks that are sometimes logged and sometimes
not properly caught -- though some of that may be my messy test
code. The exception raised by a timeout is the same
CancelledError, which is somewhat confusing. I wonder if
Task.cancel() shouldn't take an exception with which to cancel
the task. (TimeoutError in PEP 3148 has a different role,
it is when the timeout on a specific wait expires, so e.g.
fut.result(timeout=2) waits up to 2 seconds for fut to complete,
and if not, the call raises TimeoutError, but the code running
in the executor is unaffected.)</div>
</blockquote>
<br>
FWIW, the equivalent in Kaa which is InProgress.abort() does take an
optional exception, which must subclass InProgressAborted. If None,
a new InProgressAborted is created. InProgress.timeout(t) will
start a timer that invokes InProgress.abort(TimeoutException())
(TimeoutException subclasses InProgressAborted).<br>
<br>
It sounds like your proposed implementation works like:<br>
<br>
<pre> @tulip.coroutine()
 def foo():
     try:
         result = yield from Task(othercoroutine()).result(timeout=2)
     except TimeoutError:
         # ... othercoroutine() still lives on
         pass
</pre>
I think Kaa's syntax is cleaner but it seems functionally the same:<br>
<br>
<pre> @kaa.coroutine()
 def foo():
     try:
         result = yield othercoroutine().timeout(2)
     except kaa.TimeoutException:
         # ... othercoroutine() still lives on
         pass
</pre>
It's also possible to conveniently ensure that othercoroutine() is
aborted if the timeout elapses:<br>
<br>
<pre> try:
     result = yield othercoroutine().timeout(2, abort=True)
 except kaa.TimeoutException:
     # ... othercoroutine() is aborted
     pass
</pre>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">We've had long discussions about yield vs. yield-from.
The latter is way more efficient and that's enough for me to push
it through. When using yield, each yield causes you to bounce to
the scheduler, which has to do a lot of work to decide what to do
next, even if that is just resuming the suspended generator; and
the scheduler is responsible for keeping track of the stack of
generators. When using yield-from, calling another coroutine as a
subroutine is almost free and doesn't involve the scheduler at
all; thus it's much cheaper, and the scheduler can be simpler
(doesn't need to keep track of the stack). Also stack traces and
debugging are better.
</blockquote>
<br>
But this sounds like a consequence of a particular implementation,
isn't it?<br>
<br>
A @kaa.coroutine() decorated function is entered right away when
invoked, and the decorator logic does as much as it can until the
underlying generator yields an unfinished InProgress that it needs
to wait for (or kaa.NotFinished). Once it yields, <i>then</i> the
decorator sets up the necessary hooks with the scheduler / event
loop.<br>
<br>
This means you can nest a stack of coroutines without involving the
scheduler until something truly asynchronous needs to take place.<br>
<br>
Have I misunderstood?<br>
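As I understand the yield-from argument, the point is that delegation happens inside the generator machinery itself. A bare-bones illustration with plain generators (no scheduler, no framework):<br>

```python
# Sketch of why "yield from" makes nested coroutine calls cheap: the outer
# generator delegates directly to the inner one, so whatever trampoline or
# scheduler is driving the outer generator only ever sees the values that
# genuinely need to reach it.

def inner():
    # Suspends all the way out to the driver, not to outer().
    return (yield "io-request")

def outer():
    # Calling inner() as a subroutine: no scheduler hop for the call itself.
    result = yield from inner()
    return result

gen = outer()
print(next(gen))            # "io-request" surfaces straight from inner()
try:
    gen.send("io-result")   # resumes inner() directly at its yield
except StopIteration as stop:
    print(stop.value)       # "io-result" returned up through outer()
```

The send() resumes the innermost generator without the driver having to track the stack of generators; with plain yield, each level of nesting would need the driver's help to route values up and down.<br>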
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div text="#000000" bgcolor="#FFFFFF">
<ul>
<li>coroutines can have certain policies that control
invocation behaviour. The most obvious ones to describe
are POLICY_SYNCHRONIZED which ensures that multiple
invocations of the same coroutine are serialized, and
POLICY_SINGLETON which effectively ignores subsequent
invocations if it's already running</li>
<li>it is possible to have a special progress object passed
into the coroutine function so that the coroutine's
progress can be communicated to an outside observer</li>
</ul>
</div>
</blockquote>
<div><br>
</div>
<div>These seem pretty esoteric and can probably be implemented in
user code if needed.</div>
</blockquote>
<br>
I'm fine with that, provided the flexibility is there to allow for
it.<br>
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>As I said, I think wait_for_future() and run_in_executor() in
the PEP give you all you need. The @threaded decorator you
propose is just sugar; if a user wants to take an existing API
and convert it from a coroutine to threaded without requiring
changes to the caller, they can just introduce a helper that is
run in a thread with run_in_executor().</div>
</blockquote>
<br>
Also works for me. :)<br>
<br>
<br>
<blockquote
cite="mid:CAP7+vJLAPT46qzAa5g7iFVdkDvr-QxDUKC1Nnee2QuH3s-=8qA@mail.gmail.com"
type="cite">
<div>Thanks for your very useful contribution! Kaa looks like an
interesting system. Is it ported to Python 3 yet? Maybe you
could look into integrating with the PEP 3156 event loop and/or
scheduler.</div>
</blockquote>
<br>
Kaa does work with Python 3, yes, although it still lacks much-needed
unit tests, so I'm not completely confident it has the same
functional coverage under Python 3 as it does under Python 2.<br>
<br>
I'm definitely interested in having it conform to whatever shakes
out of PEP 3156, which is why I'm speaking up now. :)<br>
<br>
<br>
I've a couple other subjects I should bring up:<br>
<br>
Tasks/Futures as "signals": it's often necessary to be able to
resume a coroutine based on some condition other than e.g. any IO
tasks it's waiting on. For example, in one application, I have a
(POLICY_SINGLETON) coroutine that works off a download queue. If
there's nothing in the queue, it's suspended at a yield. It's the
coroutine equivalent of a dedicated thread. [1]<br>
<br>
It must be possible to "wake" the queue manager when I enqueue a job
for it. Kaa has this notion of "signals" which is similar to the
gtk+ style of signals in that you can attach callbacks to them and
emit them. Signals can be represented as InProgress objects, which
means they can be yielded from coroutines and used in
InProgressAny/All objects.<br>
<br>
So my download manager coroutine can yield an InProgressAny of all
the active download coroutines <i>and</i> the "new job enqueued"
signal, and execution will resume as long as any of those conditions
are met.<br>
<br>
Is there anything in your current proposal that would allow for this
use-case?<br>
<br>
[1]
<a class="moz-txt-link-freetext" href="https://github.com/jtackaberry/stagehand/blob/master/src/manager.py#L390">https://github.com/jtackaberry/stagehand/blob/master/src/manager.py#L390</a><br>
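In case the idea isn't clear, here is a rough sketch of "signals as futures": each wait() hands out a fresh future-like object that emit() completes, which is what lets a scheduler resume a coroutine parked on the signal. The names (Signal, Waiter) are illustrative, not Kaa's actual classes:<br>

```python
class Waiter:
    """Minimal future-like object a scheduler could park a coroutine on."""
    def __init__(self):
        self.done = False
        self.value = None
        self.callbacks = []

    def set_result(self, value):
        self.done = True
        self.value = value
        for cb in self.callbacks:
            cb(self)   # e.g. the scheduler resuming a suspended coroutine

class Signal:
    def __init__(self):
        self._waiters = []

    def wait(self):
        """Return a waiter that completes on the next emit()."""
        w = Waiter()
        self._waiters.append(w)
        return w

    def emit(self, value=None):
        waiters, self._waiters = self._waiters, []
        for w in waiters:
            w.set_result(value)

queue_has_jobs = Signal()
w = queue_has_jobs.wait()      # yieldable, combinable into an ANY/ALL group
queue_has_jobs.emit("job-1")   # wakes anything parked on the signal
print(w.done, w.value)
```

Since the waiter is just a future, it composes with ANY/ALL-style combinators exactly like an I/O task does.<br>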
<br>
<br>
Another pain point for me has been this notion of unhandled
asynchronous exceptions. Asynchronous tasks are represented as an
InProgress object, and if a task fails, accessing InProgress.result
will raise the exception, at which point it's considered handled.
This attribute access could happen at any time during the lifetime
of the InProgress object, outside the task's call stack.<br>
<br>
The desirable behaviour is that when the InProgress object is
destroyed, if there's an exception attached to it from a failed task
that hasn't been accessed, we should output the stack as an
unhandled exception. In Kaa, I do this with a weakref destroy
callback, but this isn't ideal because with GC, the InProgress might
not be destroyed until well after the exception is relevant.<br>
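The shape of that mechanism, sketched with __del__ instead of a weakref callback for brevity (MiniInProgress is an illustrative stand-in, not Kaa's implementation):<br>

```python
import sys

class MiniInProgress:
    """Illustrative future-like object; not Kaa's actual implementation."""
    def __init__(self):
        self._exception = None
        self._retrieved = False

    def set_exception(self, exc):
        self._exception = exc

    @property
    def result(self):
        if self._exception is not None:
            self._retrieved = True   # accessing .result marks it handled
            raise self._exception
        return None

    def __del__(self):
        # Dying with an exception nobody looked at: report it.
        if self._exception is not None and not self._retrieved:
            print("Unhandled async exception:", self._exception, file=sys.stderr)

ip = MiniInProgress()
ip.set_exception(ValueError("task failed"))
del ip   # under CPython refcounting this reports immediately;
         # if the object is caught in a cycle, GC may delay it arbitrarily
```

The delayed-collection caveat in the last comment is precisely the problem: the report can arrive long after the context that would make it useful.<br>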
<br>
I make every effort to remove reference cycles and generally get the
InProgress object destroyed as early as possible, but this changes
subtly between Python versions.<br>
<br>
How will unhandled asynchronous exceptions be handled with tulip?<br>
<br>
Thanks!<br>
Jason.<br>
</body>
</html>