Hi Benoit and folks:
>Message: 3
>Date: Tue, 23 Oct 2012 09:19:59 +0200
>From: Benoit Chesneau <benoitc(a)gunicorn.org>
>To: Guido van Rossum <guido(a)python.org>
>Cc: Python-Ideas <python-ideas(a)python.org>
>Subject: Re: [Python-ideas] yield from multiple iterables (was Re: The
> async API of the future: yield-from)
>Message-ID: <BE74DBDE-1965-47F0-99B9-27F0C7CD574C(a)gunicorn.org>
>Content-Type: text/plain; charset=windows-1252
(I learnt about this mailing list from Christian Tismer's post in the Stackless mailing list and I am catching up)
>I am myself toying with the idea of porting the Go concurrency model to Python [4] using greenlets and pyuv. Both the scheduler and the way IOs are handled:
>- In Go all coroutines are independent from each other and can only communicate via channels. This has the advantage of allowing them to run on different threads when one is blocking. In the normal case they mostly work like greenlets on a single thread and are simply scheduled in a round-robin way (mostly like in Stackless), with the difference that goroutines can be executed in parallel: when one is blocking, another thread will be created to handle the other goroutines in the runnable queue.
What aspect of the Go concurrency model? Maybe you already know this but Go and Stackless Python share a common ancestor: Limbo. More specifically the way channels work.
This may be tangential to the discussion but in the past, I have used the stackless.py module in conjunction with CPython and greenlets to rapidly prototype parts of Go's model that are not present in Stackless, i.e. the select (ALT) language feature.
Rob Pike and Russ Cox were really helpful in answering my questions. Newer stackless.py implementations use
continulets, so look for an older PyPy implementation.
I have also prototyped a subset of Polyphonic C# join patterns. After I got the prototype running, I had an interesting discussion with the authors of "Scalable Join Patterns."
For networking support, I run Twisted as a tasklet. There are a few tricks to make Stackless and Twisted co-operate.
Cheers,
Andrew
On Oct 14, 2012 8:42 AM, "Guido van Rossum" <guido(a)python.org> wrote:
> Sadly it looks that
>
> r = yield from (f1(), f2())
>
> ends up interpreting the tuple as the iterator, and you end up with
>
> r = (f1(), f2())
>
> (i.e., a tuple of generators) rather than the desired
>
> r = ((yield from f1()), (yield from f2()))
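A quick sketch (added for illustration, not part of the original exchange) confirming the behaviour Guido describes: with a tuple operand, `yield from` iterates the tuple itself and hands back the generator objects, whereas the expanded form delegates to each generator in turn.

```python
def f1():
    yield 1

def f2():
    yield 2

def pitfall():
    # the tuple is treated as the iterable, so this yields the
    # two generator objects themselves, not the values inside them
    r = yield from (f1(), f2())
    return r

def desired():
    # what one probably wanted: delegate to each generator in turn
    r = ((yield from f1()), (yield from f2()))
    return r
```

Exhausting `pitfall()` produces two generator objects, while `desired()` produces the values 1 and 2.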
Didn't want this tangent to get lost in the async discussion. Would it be
too late to make a change along these lines? Would it be enough of an
improvement to be warranted?
-eric
Hi Greg:
>Message: 2
>Date: Tue, 23 Oct 2012 12:48:39 +1300
>From: Greg Ewing <greg.ewing(a)canterbury.ac.nz>
>To: "python-ideas(a)python.org" <python-ideas(a)python.org>
>Subject: Re: [Python-ideas] yield from multiple iterables (was Re: The
> async API of the future: yield-from)
>Message-ID: <5085DB57.4010504(a)canterbury.ac.nz>
>Content-Type: text/plain; charset=UTF-8; format=flowed
>It does, in the sense that a continuation appears to the
>Scheme programmer as a callable object.
>The connection goes deeper as well. There's a style of
>programming called "continuation-passing style", in which
>nothing ever returns -- every function is passed another
>function to be called with its result. In a language such
>as Scheme that supports tail calls, you can use this style
>extensively without fear of overflowing the call stack.
>You're using this style whenever you chain callbacks
>together using Futures or Deferreds. The callbacks don't
>return values; instead, each callback arranges for another
>callback to be called, passing it the result.
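A tiny illustration (my own sketch, not from the thread) of the continuation-passing style Greg describes: no function returns its result; each one hands it to the next callback instead.

```python
def add_cps(a, b, k):
    # instead of returning a + b, hand it to the continuation k
    k(a + b)

def square_cps(x, k):
    k(x * x)

collected = []
# compute (2 + 3) ** 2 without any function ever returning a value
add_cps(2, 3, lambda s: square_cps(s, collected.append))
# collected == [25]
```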
There is a really nice Microsoft Research paper called "Cooperative Task Management without Manual Stack Management." [1]
In this paper, the authors introduce the term "stack ripping" to describe how asynchronous events with callbacks handle memory.
I think this is a nice way to describe the fundamental differences between continuations and Twisted callbacks/Deferreds.
Cheers,
Andrew
[1] http://research.microsoft.com/apps/pubs/default.aspx?id=74219
Greg Ewing <greg.ewing(a)canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>>
>> I would like people to be able to write fast
>> event handling programs on Windows too, ... But I don't know how
>> tenable that is given the dramatically different style used by IOCP
>> and the need to use native Windows API for all async I/O -- it sounds
>> like we could only do this if the library providing the I/O loop
>> implementation also wrapped all I/O operations, and that may be a bit
>> much.
>
> That's been bothering me, too. It seems like an interface accommodating the completion-based style will have to be *extremely* fat.
No, not really. Quite the opposite, in fact. The way to make the interface thin is to abstract out all the details related to the particulars of the multiplexing I/O underneath everything and the transport functions necessary to read data out of it.
The main interfaces you need are here:
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.…>
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.…>
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.…>
<http://twistedmatrix.com/documents/current/api/twisted.internet.interfaces.…>
which have maybe a dozen methods between them, and could be cleaned up for a standardized version.
The interface required for unifying over completion-based and socket-based is actually much thinner than the interface you get if you start exposing sockets all over the place.
But, focusing on I/O completion versus readiness-notification is, like the triggering modes discussion, missing the forest for the trees. IOCP's triggering modes are themselves an interesting example of a pattern, but, by themselves, a bit of a red herring. Another thing you want to abstract over is pipes versus sockets versus files versus UNIX sockets versus UNIX sockets with CMSG extensions versus TLS over TCP versus SCTP versus bluetooth. 99% of applications do not care: a stream of bytes is a stream of bytes, and you have to turn it into a stream of some other, higher-layer event protocol.
I would really, really encourage everyone interested in this area of design to go read all of twisted.internet.interfaces and familiarize yourselves with the contents there and make specific comments about those existing interfaces rather than some hypothetical ideal. Also, the Twisted chapter <http://www.aosabook.org/en/twisted.html> in "the architecture of open source applications" explains some of Twisted's architectural decisions. If you're going to re-invent the wheel, it behooves you to at least check whether the existing ones are round. I'm happy to answer questions about specifics of how things are implemented, whether the Twisted APIs have certain limitations, and filling in gaps in the documentation. There is certainly an embarrassing profusion of those, especially in these decade-old, core APIs that haven't changed since we started requiring docstrings; if you find any, please file bugs and I will try to do what I can to get them fixed. But I'd rather not have to keep re-describing the basics.
> That's not just a burden for anyone implementing the interface, it's a problem for any library wanting to *wrap* it as well.
I really have no idea what you mean by this. Writing and wrapping ITransport and IProtocol is pretty straightforward. With the enhanced interfaces I'm working on in <http://tm.tl/1956>, it's almost automatic.
<http://twistedmatrix.com/trac/browser/trunk/twisted/protocols/tls.py>, for example, is a complete bi-directional proxying of all interfaces related to transports (even TCP transport specific APIs, not just the core interfaces above), in addition to implementing all the glue necessary for TLS, with thorough docstrings and comments, all in just over 600 lines. This also easily deals with the fact that, for example, sometimes in order to issue a read-ready notification, TLS needs to write some bytes; and in order to issue a write-ready notification, TLS sometimes needs to read some bytes.
> For example, to maintain separation between the async layer and the generator layer, we will probably want to have an AsyncSocket object in the async layer, and a separate GeneratorSocket in the generator layer that wraps an AsyncSocket.
Yes, generator scheduling and async I/O are really different layers, as I explained in a previous email. This is a good thing as it provides a common basis for developing things in different styles as appropriate to different problem domains. If you smash them together you're just conflating responsibilities and requiring abstraction inversions, not making it easier to implement anything.
> If the AsyncSocket needs to provide methods for all the possible I/O operations that one might want to perform on a socket, then GeneratorSocket needs to provide its own versions of all those methods as well.
GeneratorSocket does not even need to exist in the first implementation of this kind of a spec, let alone provide all possible operations. Python managed to survive without "all the possible I/O operations that one might want to perform on a socket" for well over a decade; sendmsg and recvmsg didn't arrive until 3.3: <http://bugs.python.org/issue6560>.
Plus, GeneratorSocket isn't that hard to write. You just need a method for each I/O operation that returns a Future (by which I mean Deferred, of course :)) and then fires that Future from the relevant I/O operation's callback.
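A minimal sketch of the pattern glyph describes, with a toy `Future` standing in for a Deferred and an invented `FakeAsyncSocket` standing in for the async layer (all names here are hypothetical, not actual Twisted APIs): each I/O method returns a Future, and the async layer's callback fires it.

```python
class Future:
    """Toy Future (stand-in for a Deferred): a result plus done-callbacks."""
    def __init__(self):
        self._done = False
        self._result = None
        self._callbacks = []

    def set_result(self, value):
        self._done, self._result = True, value
        for cb in self._callbacks:
            cb(self)

    def result(self):
        return self._result

    def add_done_callback(self, cb):
        if self._done:
            cb(self)
        else:
            self._callbacks.append(cb)


class FakeAsyncSocket:
    """Hypothetical callback-based socket from the async layer."""
    def __init__(self):
        self._read_cb = None

    def recv_with_callback(self, nbytes, callback):
        self._read_cb = callback

    def deliver(self, data):
        # the event loop would call this when data actually arrives
        self._read_cb(data)


class GeneratorSocket:
    """One method per I/O operation; each returns a Future that is
    fired from the relevant low-level operation's callback."""
    def __init__(self, async_sock):
        self._sock = async_sock

    def recv(self, nbytes):
        f = Future()
        self._sock.recv_with_callback(nbytes, f.set_result)
        return f
```

A scheduler can then `yield` the Future returned by `recv()` and resume the coroutine when it fires.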
> Multiply that by the number of different kinds of I/O objects (files, sockets, message queues, etc. -- there seem to be quite a lot of them on Windows) and that's a *lot* of stuff to be wrapped.
The common operations here are by far the most important. But, yes, if you want to have support for all the wacky things that Windows provides, you have to write wrappers for all the wacky things you need to call.
>> Finally, there should also be some minimal interface so that multiple I/O loops can interact -- at least in the case where one I/O loop belongs to a GUI library.
>
> That's another thing that worries me. With a ready-based event loop, this is fairly straightforward. If you can get hold of the file descriptor or handle that the GUI is ultimately reading its input from, all you need to do is add it as an event source to your main loop, and when it's ready, tell the GUI event loop to run itself once.
No. That is how X windows and ncurses work, not how GUIs in general work.
On Windows, the GUI is a message pump on a thread (and possibly a collection thereof); there's no discrete event which represents it and no completion port or event object that gets its I/O, but at the low level, you're still expected to write your own loop and call something that blocks waiting for GUI input. (This actually causes some problems, see below.)
On Mac OS X, the GUI is an event loop of its own; you have to integrate with CFRunLoop via CFRunLoopRun (or something that eventually calls it, like NSApplicationMain), not write your own loop that calls a blocking function. You don't get to invent your own thing with kqueue or select() and then explicitly observe "the GUI" as some individual discrete event; there's nothing to read, the GUI just calls directly into your application. Underneath there's some mach messages and stuff, but I honestly couldn't tell you how that all works; it's not necessary to understand. (And in fact "the GUI" is not actually just the GUI, but a whole host of notifications from other processes, the display, the sound device, and so on, that you can register for. The documentation for NSNotificationCenter is illuminating.)
(I don't know anything about Android. Can anyone comment authoritatively about that?)
This really doesn't have anything to do with the readiness-based-ness of the API, but rather with the fact that there are more things in heaven and earth (and kernel interrupt handlers) than are dreamt of in your philosophy (and file descriptor dispatch functions).
Once again: the important thing is to separate out these fiddly low layers for each platform and get something that exposes the high layer that most python programmers care about - "incoming connection", "here are some bytes", "your connection was dropped" - in such a way that you can plug in an implementation that uses it to any one of these low-level things.
> But you can't do that with a completion-based main loop, because the actual reading of the input needs to be done in a different way, and that's usually buried somewhere deep in the GUI library where you can't easily change it.
Indeed not, but this phrasing makes it sound like "completion-based" main loops are some weird obscure thing. This is not an edge-case problem you can sweep under the rug with the assumption that somebody will be able to wrestle a file descriptor out of the GUI somehow or emulate it eventually. The GUI platforms that basically everyone in the world uses don't observe file descriptors for their input events.
>> It seems this is a solved problem (as well solved as you can hope for) to Twisted, so we should just adopt their
>> approach.
>
> Do they actually do it for an IOCP-based main loop on Windows?
No, but it's hypothetically possible.
For GUIs, we have win32eventreactor, which can't support as many sockets, but runs the message pump, which causes the GUI to run (for most GUI toolkits). Several low-level Windows applications have used this to good effect. (Although I don't know of any that are open source, unfortunately.)
There's also the fact that most people writing Python GUIs want to use a cross-platform library, so most of the demand for GUI sockets on Windows has been for integrating with Wx, Qt, or GTK, and we have support for all of those separately from the IOCP stuff. It's usually possible to call the wrapped socket functions in those libraries, but more difficult to reach below the GUI library and dispatch to it one Windows message-pump message at a time.
> If so, I'd be interested to know how.
It's definitely possible to get a GUI to cooperate nicely with IOCP, but it's a bit challenging to figure out how. I had a very long, unpleasant conversation with the IOCP reactor's maintainer while we refreshed our memories about the frankly sadistic IOCP API and pooled our various experiences working with it, until we remembered enough about the way IOCP actually works to be able to explain it, so I hope you enjoy this :-).
Right now Twisted's IOCP reactor uses the mode of IOCP where it passes NULL to both the lpCompletionRoutine and lpOverlapped->hEvent member of everything (i.e. WSARecv, WSASend, WSAAccept, etc). Later, the reactor thread blocks on GetQueuedCompletionStatus, which only blocks on the associated completion port's scheduled I/O, which precludes noticing when messages arrive from the message pump.
As I mentioned above, the message pump is a discrete source of events and can't be waited upon as a C runtime "file descriptor", WSA socket, IOCP completion or thread event. Also, you can't translate it into one of those sources, because the message pump is associated with a particular thread; you can't call a function in a different thread to call PostQueuedCompletionStatus.
There are two ways to fix this; there already is a lengthy and confusing digression in comments in the implementation explaining parts of this.
The first, and probably easiest option, is simply to create an event with CreateEvent(bManualReset=False) and fill out the hEvent structure of all queued Event objects with that same event, pass that event handle to MsgWaitForMultipleObjectsEx. Then, if the message queue wakes up the thread, you dispatch messages the standard way (doing what win32eventreactor already does: see win32gui.PumpWaitingMessages). If instead, the event signals, you call GetQueuedCompletionStatus as IOCP already does, and it will always return immediately.
The second (and probably higher performance) option is to fill out the lpCompletionRoutine parameter to all I/O functions, and effectively have the reactor's "loop" integrated into the implicit asynchronous procedure dispatch of any alertable function. This would have to be MsgWaitForMultipleObjectsEx in order to wait on events added with addEvent(), in the reactor's core. The reactor's core itself could actually just call WaitForSingleObjectEx() and it would be roughly the same except for those external events, as long as the thread is put into an alertable state. This option is likely higher performance because it removes all the function call and iteration overhead because you effectively go straight from the kernel to the I/O handling function. In addition to being slightly trickier though, there's also the fact that someone else might put the thread into an alertable state and the I/O completion might be done with a surprising stack frame.
If you want to integrate this with a modern .NET application (i.e. windows platform-specific stuff), I think this is the relevant document: <http://msdn.microsoft.com/en-us/library/aa348549.aspx>; I am not sure how you'd integrate it with Wx/Tk/Qt/GTK+.
-glyph
Clarification:
I have a tendency to mention constructs
from other threads in a discussion.
This might suggest that I'm proposing
to use this not-yet-included or even
accepted feature as a solution. For instance,
Guido's reaction to my last message
might be an indicator of such a misinterpretation,
although I'm not sure if I was
primarily addressed at all (despite what the to/cc
suggested).
Anyway, I just want to make sure:
If I'm mentioning stackless or codef
or greenlet, this does not imply that I
propose to code the solution to async
by implementing such a thing, first.
The opposite is true.
I mean such mentioning more like
a macro-like feature:
I'm implementing structures using the existing
things, but adhering to a coding style
that stays compatible with one of the mentioned
principles.
This is like a macro feature of my brain
- I talk about codef, but code it using
yield-from.
So please don't take me wrong that I
want to push for features to be
included. This is only virtual. I use yield
constructs, but obey the codef protocol,
for instance.
And as an addition: when I'm talking
of generators implemented by yield from,
then this is just a generator that can
yield from any of its sub-functions.
I am not talking about tasks or schedulers.
These constructs do not belong there.
I'm strongly against using "yield from"
for this.
It is a building block for generators
and coroutines, and there it stops!
Higher-level stuff should by no means
use those primitives at all.
Sent from my Ei4Steve
On Fri, Oct 19, 2012 at 10:44 PM, Greg Ewing
<greg.ewing(a)canterbury.ac.nz> wrote:
> If I wrote a library intended for serious use, the end user
> probably wouldn't write either of those. Instead he would
> write something like
>
> yield from block(self.queue)
>
> and it would be an implementation detail of the library
> where abouts the 'yield' happened and whether it needed
> to send a value or not.
What's the benefit of having both "yield" and "yield from" as opposed
to just "yield"? It seems like an attractive nuisance if "yield" works
but doesn't let the function have implementation details and wait for
more than one thing or somesuch.
With the existing generator-coroutine decorators (monocle,
inlineCallbacks), there is no such trap. "yield foo()" will work no
matter how many things foo() will wait for.
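A minimal synchronous sketch (my own; the names are hypothetical, and real frameworks like monocle or inlineCallbacks resume the generator from an event loop rather than immediately) of the decorator-based style Devin describes, where `yield foo()` works regardless of what `foo()` waits on:

```python
def coroutine(gen_func):
    """Drive a generator to completion, sending each yielded value
    (which a real framework would treat as a Future/Deferred to
    wait on) straight back in as the result."""
    def run(*args, **kwargs):
        gen = gen_func(*args, **kwargs)
        value = None
        while True:
            try:
                yielded = gen.send(value)
            except StopIteration as stop:
                return stop.value
            # a real scheduler would suspend here until 'yielded' is ready
            value = yielded
    return run


@coroutine
def fetch():
    # 'yield foo()' works no matter how many things foo() waits for;
    # the trampoline, not the caller, decides what yielded values mean
    x = yield 21
    return x * 2
```

Here `fetch()` runs to completion and returns 42; in an async framework the same generator body would suspend at the `yield` instead.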
My understanding is that the only benefit we get here is nicer
tracebacks. I hope there's more.
-- Devin
Greg Ewing wrote:
> Mark Shannon wrote:
>
>> Why not have proper co-routines, instead of hacked-up generators?
>
> What do you mean by a "proper coroutine"?
>
A parallel, non-concurrent, thread of execution.
It should be able to transfer control from arbitrary places in
execution, not just from within generators.
Stackless provides coroutines. Greenlets are also coroutines (I think).
Lua has them, and is implemented in ANSI C, so it can be done portably.
See: http://www.jucs.org/jucs_10_7/coroutines_in_lua/de_moura_a_l.pdf
(One of the examples in the paper uses coroutines to implement
generators, which is obviously not required in Python :) )
Cheers,
Mark.
So Guido said it's better to discuss things here.
Mostly reiterating what I said in the G+ thread. I'm by no means a
greybeard in library/language design, and have no successful async
project behind me, so please take what I'm saying with a grain of
salt. I just want to highlight a point I feel is very important.
There should be standard library, but no standard framework. Please.
You see, there are a number of event-driven frameworks, and they all
suck. Sorry for being blunt, but each one of them is more or less
voluntarily described as being almost the ultimate silver bullet, or
even a silver grenade, the One Framework to rule them all. The truth
is that every framework that prospers today was started as a scratch
to a specific itch, and it might be a perfect scratch for that class
of itches. I know of no application framework designed as being the
ultimate scratch for every itch that is not dead and forgotten, or
described on a resource other than thedailywtf.
There is a reason for this state of things, mainly that the real world
is a rather complex pile of crap, and there is no nice uniform model
into which you can fit all of that crap and expect the model still to
be good for any practical use. Thus in the world of software, which is
a notoriously complex subset of the crap the real world is, we are going
to live with dozens of event models, asynchronous I/O models, gobs of
event loops. Every one of them (even WaitForMultipleObjects() kind of
loop) is a priceless tool for a specific class of problem it's
designed to solve, so it won't go away, ever.
The standard library, on the other hand, IS the ultimate tool. It is
the way things should work. The people look at it as the reference,
the code written the way it should be, encompassing the best of the
best practices out there. Everyone keeps saying: just look at how
this thing is implemented. Look, it's in the stdlib, don't reinvent
the wheel. It illustrates the Right Way to use the language and the
runtime, the ultimate argument to end doubts.
In my opinion, the reason a standard library can be regarded this high
is exactly because it provides high-quality examples (or at least it
should do that), materials, bits and tools, but does not limit you in
the way those tools can be used, and does not impose its rules on you
if you want to actually roll something of your own. No framework in
the world should have this power, as it would defeat the very reason
frameworks do exist.
And that's why I think, while asyncore and other expired batteries
need to be cleaned up and upgraded (so they are of any use), I expect
that no existing frameworks would enter the stdlib as de jure
standard. I would expect instead that there would be useful primitives
each of these frameworks implements anyway, and make the standard
networking modules aware of those.
But please, no bringing $MYFAVORITEFRAMEWORK into stdlib. You will
either end up with something horrendous to support every existing
mainloop implementation out there, unwieldy and buggy, or you will make
a clean, elegant framework that won't solve anyone's problem, will be
incompatible with the rest of the world and fall into disuse. Or you
can bring some gevent, or Tornado, you name it, into stdlib, and make
the users of the remaining dozens of frameworks feel like damned
outcasts.
I feel the same about web things. Picking the tools to parse HTTP
requests and forming the responses is okay, as HTTP is not a simple
thing; bringing into the standard library the templating engine,
routing engine, or, God forbid, an ORM, would be totally insane.
[This is the second spin-off thread from "asyncore: included batteries
don't fit"]
On Thu, Oct 11, 2012 at 6:32 PM, Greg Ewing <greg.ewing(a)canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>> It does bother me somehow that you're not using .send() and yield
>> arguments at all. I notice that you have a lot of three-line code
>> blocks like this:
>>
>> block_for_reading(sock)
>> yield
>> data = sock.recv(1024)
> I wouldn't say I have a "lot". In the spamserver, there are really
> only three -- one for accepting a connection, one for reading from
> a socket, and one for writing to a socket. These are primitive
> operations that would be provided by an async socket library.
Hm. In such a small sample program, three near-identical blocks is a lot!
> Generally, all the yields would be hidden inside primitives like
> this. Normally, user code would never need to use 'yield', only
> 'yield from'.
>
> This probably didn't come through as clearly as it might have in my
> tutorial. Part of the reason is that at the time I wrote it, I was
> having to manually expand yield-froms into for-loops, so I was
> reluctant to use any more of them than I needed to. Also, yield-from
> was a new and unfamiliar concept, and I didn't want to scare people
> by overusing it. These considerations led me to push some of the
> yields slightly further up the layer stack than they could be.
But the fact remains that you can't completely hide these yields --
the best you can do is replace them with a single yield-from.
>> The general form seems to be:
>>
>> arrange for a callback when some operation can be done without blocking
>> yield
>> do the operation
>>
>> This seems to be begging to be collapsed into a single line, e.g.
>>
>> data = yield sock.recv_async(1024)
> I'm not sure how you're imagining that would work, but whatever
> it is, it's wrong -- that just doesn't make sense.
That's a strong statement! It makes a lot of sense in a world using
Futures and a Future-aware trampoline/scheduler, instead of yield-from
and bare generators. I can see however that you don't like it in the
yield-from world you're envisioning, and how it would be confusing
there. I'll get back to this in a bit.
> What *would* make sense is
>
> data = yield from sock.recv_async(1024)
>
> with sock.recv_async() being a primitive that encapsulates the
> block/yield/process triplet.
Right, that's how you would spell it.
>> (I would also prefer to see the socket wrapped in an object that makes
>> it hard to accidentally block.)
> It would be straightforward to make the primitives be methods of a
> socket wrapper object. I only used functions in the tutorial in the
> interests of keeping the amount of machinery to a bare minimum.
Understood.
>> But surely there's still a place for send() and other PEP 342 features?
> In the wider world of generator usage, yes. If you have a
> generator that it makes sense to send() things into, for
> example, and you want to factor part of it out into another
> function, the fact that yield-from passes through sent values
> is useful.
But the only use for send() on a generator is when using it as a
coroutine for a concurrent tasks system -- send() really makes no
sense for generators used as iterators. And you're claiming, it seems,
that you prefer yield-from for concurrent tasks.
> But we're talking about a very specialised use of generators
> here, and so far I haven't thought of a use for sent or yielded
> values in this context that can't be done in a more straightforward
> way by other means.
>
> Keep in mind that a value yielded by a generator being used as
> part of a coroutine is *not* seen by code calling it with
> yield-from. Rather, it comes out in the inner loop of the
> scheduler, from the next() call being used to resume the
> coroutine. Likewise, any send() call would have to be made
> by the scheduler, not the yield-from caller.
I'm very much aware of that. There is a *huge* difference between
yield-from and yield.
However, now that I've implemented a substantial library (NDB, which
has thousands of users in the App Engine world, if not hundreds of
thousands), I feel that "value = yield <something that returns a
Future>" is quite a good paradigm, and the only part of PEP 380 I'm
really looking forward to embracing (once App Engine supports Python
3.3) is the option to return a value from a generator -- which my
users currently have to spell as "raise ndb.Return(<value>)".
> So, the send/yield channel is exclusively for communication
> with the *scheduler* and nothing else. Under the old way of
> doing generator-based coroutines, this channel was used to
> simulate a call stack by yielding 'call' and 'return'
> instructions that the scheduler interpreted. But all that
> is now taken care of by the yield-from mechanism, and there
> is nothing left for the send/yield channel to do.
I understand that's the state of the world that you're looking forward
to. However I'm slightly worried that in practice there are some
issues to be resolved. One is what to do with operations directly
implemented in C. It would be horrible to require C to create a fake
generator. It would be mildly nasty to have to wrap these all in
Python code just so you can use them with yield-from. Fortunately an
iterator whose final __next__() raises StopIteration(<value>) works in
the latest Python 3.3 (it didn't work in some of the betas IIRC).
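A small sketch (added for illustration) of the workaround Guido mentions: a plain, non-generator iterator whose `__next__` raises `StopIteration` carrying a value is a valid `yield from` target, which is what a C-implemented operation could provide without faking a generator.

```python
class CResult:
    """Non-generator iterator (standing in for one implemented in C)
    whose __next__ immediately raises StopIteration with a value,
    making it usable as a 'yield from' target in Python 3.3."""
    def __init__(self, value):
        self._value = value

    def __iter__(self):
        return self

    def __next__(self):
        raise StopIteration(self._value)


def wait_for_result():
    result = yield from CResult(42)   # result becomes 42
    return result


def run(gen):
    """Drive a generator and return the value carried by StopIteration."""
    try:
        gen.send(None)
    except StopIteration as stop:
        return stop.value
```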
>> my users sometimes want to
>> treat something as a coroutine but they don't have any yields in it
>>
>> def caller():
>> data = yield from reader()
>>
>> def reader():
>> return 'dummy'
>> yield
>>
>> works, but if you drop the yield it doesn't work. With a decorator I
>> know how to make it work either way.
> If you're talking about a decorator that turns a function
> into a generator, I can't see anything particularly headachish
> about that. If you mean something else, you'll have to elaborate.
Well, I'm talking about a decorator that you *always* apply, and which
does nothing (or very little) when wrapping a generator, but adds
generator behavior when wrapping a non-generator function.
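A sketch of such an always-applied decorator (the name `task` and the driver are my inventions): it is a no-op for generator functions, but wraps a plain function so that calling it still yields a generator, matching Guido's `caller`/`reader` example whether or not `reader` contains a yield.

```python
import inspect


def task(fn):
    """No-op for generator functions; wraps a plain function so that
    calling it returns a generator that immediately produces fn's
    result via StopIteration."""
    if inspect.isgeneratorfunction(fn):
        return fn

    def wrapper(*args, **kwargs):
        result = fn(*args, **kwargs)

        def gen():
            return result
            yield  # never reached; only makes this a generator function

        return gen()

    return wrapper


@task
def reader():          # no yield anywhere...
    return 'dummy'


@task
def caller():          # ...yet it still works as a 'yield from' target
    data = yield from reader()
    return data


def drive(gen):
    """Run a generator and return the value carried by StopIteration."""
    try:
        gen.send(None)
    except StopIteration as stop:
        return stop.value
```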
Anyway, I am trying to come up with a table comparing Futures and your
yield-from-using generators. I'm basing this on a subset of the PEP
3148 API, and I'm not presuming threads -- I'm just looking at the
functionality around getting and setting callbacks, results, and
exceptions. My reference is actually based on NDB, but the API there
differs from PEP 3148 in uninteresting ways, so I'll use the PEP 3148
method names.
(1) Calling an async operation and waiting for its result, using yield
Futures:
result = yield some_async_op(args)
Yield-from:
result = yield from some_async_op(args)
(2) Setting the result of an async operation
Futures:
f.set_result(value) # From any callback
Yield-from:
return value # From the outermost generator
(3) Handling an exception
Futures:
try:
result = yield some_async_op(args)
except MyException:
<handle exception>
Yield-from:
try:
result = yield from some_async_op(args)
except MyException:
<handle exception>
Note: with yield-from, the tracebacks for unhandled exceptions are
possibly prettier.
(4) Raising an exception as the outcome of an async operation
Futures:
f.set_exception(<Exception instance>)
Yield-from:
raise <Exception instance or class> # From any of the generators
Note: with Futures, the traceback also needs to be stored; in Python 3
it is stored on the Exception instance's __traceback__ attribute. But
when letting exceptions bubble through multiple levels of nested
calls, you must do something special to ensure the traceback looks
right to the end user.
(5) Having one async operation invoke another async operation
Futures:
@task
def outer(args):
res = yield inner(args)
return res
Yield-from:
def outer(args):
res = yield from inner(args)
return res
Note: I'm including this because in the Futures case, each level of
yield requires the creation of a separate Future. In practice this
requires decorating all async functions. And also as a lead-in to the
next item.
(6) Spawning off multiple async subtasks
Futures:
f1 = subtask1(args1) # Note: no yield!!!
f2 = subtask2(args2)
res1, res2 = yield f1, f2
Yield-from:
??????????
*** Greg, can you come up with a good idiom to spell concurrency at
this level? Your example only has concurrency in the philosophers
example, but it appears to interact directly with the scheduler, and
the philosophers don't return values. ***
(7) Checking whether an operation is already complete
Futures:
if f.done(): ...
Yield-from:
?????????????
(8) Getting the result of an operation multiple times
Futures:
f = async_op(args)
# squirrel away a reference to f somewhere else
r = yield f
# ... later, elsewhere
r = f.result()
Yield-from:
???????????????
(9) Canceling an operation
Futures:
f.cancel()
Yield-from:
???????????????
Note: I haven't needed canceling yet, and I believe Devin said that
Twisted just got rid of it. However some of the JS Deferred
implementations seem to support it.
(10) Registering additional callbacks
Futures:
f.add_done_callback(callback)
Yield-from:
???????
Note: this is used in NDB to trigger "hooks" that should run e.g. when
a database write completes. The user's code just writes yield
ent.put_async(); the trigger is automatically called by the Future's
machinery. This also uses (8).
--
--Guido van Rossum (python.org/~guido)