[This is a lengthy mail; I apologize in advance!]

Hi,

I've been following this discussion with great interest, and would
like to put forward a suggestion that might simplify some of the
questions that are up in the air.

There are several key points being considered: what exactly
constitutes a "coroutine" or "tasklet", what the precise semantics of
"yield" and "yield from" should be, how the stdlib can support
different event loops and reactors, and how exactly Futures,
Deferreds, and other APIs fit into the whole picture.

This mail is mostly about the first point: I think everyone agrees
roughly what a coroutine-style generator is, but there's enough
variation in how they are used, both historically and presently, that
the concept isn't as precise as it should be. This makes them hard to
think and reason about (failing the "BDFL gets headaches" test), and
makes it harder to define the behavior of all the parts that they
interact with, too.

This is a sketch of an attempt to define what constitutes a
generator-based task or coroutine more rigorously: I think that the
essential behavior can be captured in a small protocol, building on
the generator and iterator protocols. If anyone else thinks this is a
good idea, maybe something like this could work its way into a PEP?

(For the sake of this mail, I will use the term "generator task" or
"task" as a straw man term, but feel free to substitute "coroutine",
or whatever the preferred name ends up being.)


Definition
==========

Very informally: A "generator task" is what you get if you take a
normal Python function and replace its blocking calls with
"yield from" calls to equivalent subtasks.

More formally, a "generator task" is a generator that implements an
incremental, multi-step computation, and is intended to be externally
driven to completion by a runner, or "scheduler", until it delivers a
final result.

This driving process happens as follows (a sketch of a minimal
scheduler follows below):

1. A generator task is iterated by its scheduler to yield a series of
   intermediate "step" values.

2. Each value yielded as a "step" represents a scheduling instruction,
   or primitive, to be interpreted by the task's scheduler.

   This scheduling instruction can be None ("just resume this task
   later"), or a variety of other primitives, such as Futures ("resume
   this task with the result of this Future"); see below for more.

3. The scheduler is responsible for interpreting each "step"
   instruction as appropriate, and sending the instruction's result,
   if any, back to the task using send() or throw().

   A scheduler may run a single task to completion, or may multiplex
   execution between many tasks: generator tasks should assume that
   other tasks may have executed while the task was yielding.

4. The generator task completes by successfully returning (raising
   StopIteration), or by raising an exception. The task's caller
   receives this result.

(For the sake of discussion, I use "the scheduler" to refer to whoever
calls the generator task's next/send/throw methods, and "the task's
caller" to refer to whoever receives the task's final result, but this
is not important to the protocol: a task should not care who drives it
or consumes its result, just like an iterator should not.)
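To make the driving process concrete, here is a minimal sketch of a
scheduler that runs a single task to completion, and understands only
the None instruction. (run_task() and spam() are made-up names for
illustration only; this is a toy, not a proposed implementation.)

    def run_task(task):
        # Drive a single generator task to completion, and return its
        # final result.  Only the None instruction is recognized:
        # anything else is rejected by throwing TypeError back into
        # the task (which it may catch; see below).
        value, exc = None, None
        while True:
            try:
                if exc is not None:
                    step = task.throw(exc)   # deliver an error into the task
                else:
                    step = task.send(value)  # resume with the last result
            except StopIteration as stop:
                return stop.value            # the task's result (PEP 380)
            value, exc = None, None
            if step is not None:
                exc = TypeError('unrecognized instruction: %r' % (step,))

Driving a trivial task would then look like:

    def spam():
        yield        # one "do nothing" step
        return 42    # the task's final result (Python 3.3+)

    print(run_task(spam()))  # 42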
Scheduling instructions / primitives
====================================

(This could probably use a better name.)

The protocol is intentionally agnostic about the implementation of
schedulers, event loops, or reactors: as long as they implement the
same set of scheduling primitives, code should work across them.

There are multiple ways to accomplish this, but one possibility is to
have a set of common, generic instructions in a standard library
module such as "tasklib" (which could also contain things like default
scheduler implementations, helper functions, and so on).

A partial list of possible primitives (the names are all made up, not
serious suggestions):

1. None: The most basic "do nothing" instruction. This just instructs
   the scheduler to resume the yielding task later.

2. Futures: Instruct the scheduler to resume with the future's result.

   Similar types in third-party libraries, such as Deferreds, could
   potentially be implemented natively by a scheduler that supports
   them, using a wait_for_deferred(d) helper task, or using the idea
   of an "adapter" scheduler (see below).

3. Control primitives: spawn, sleep, etc.

   - Spawn a new (independent) task: yield tasklib.spawn(task())
   - Wait for multiple tasks: (x, y) = yield tasklib.par(foo(), bar())
   - Delay execution: yield tasklib.sleep(seconds)
   - etc.

   These could be simple marker objects, leaving it up to the
   underlying scheduler to actually recognize and implement them; some
   could also be implemented in terms of simpler operations (e.g.
   sleep(), in terms of lower-level suspend and resume operations).

4. I/O operations

   This could be anything from low-level "yield fd_readable(sock)"
   style requests to any of the higher-level APIs being discussed
   elsewhere.

   Whatever the exact API ends up being, the scheduler should
   implement these primitives by waiting for the I/O (or condition),
   and resuming the task with the result, if any.

5. Cooperative concurrency primitives, for working with locks,
   condition variables, and so on. (If useful?)

6. Custom, scheduler-specific instructions: Since a generator task can
   potentially yield anything as a scheduler instruction, it's not
   inconceivable for specialized schedulers to support specialized
   instructions. (Code that relies on such special instructions won't
   work on other schedulers, but that would be the point.)

A question open to debate is what a scheduler should do when faced
with an unrecognized scheduling instruction. Raising TypeError or
NotImplementedError back into the task is probably a reasonable
action, and would allow code like:

    def task():
        try:
            yield fancy_magic_instruction()
        except NotImplementedError:
            yield from boring_fallback()
        ...


Generator tasks as schedulers, and vice versa
=============================================

Note that there is a symmetry to the protocol when a generator task
calls another using "yield from":

    def task():
        spam = yield from subtask()

Here, task() is both a generator task, and the effective scheduler for
subtask(): it "implements" subtask()'s scheduling instructions by
delegating them to its own scheduler.

This is a plain observation on its own, but it raises one or two
possibilities for more interesting schedulers implemented as generator
tasks themselves, including:

- Specialized sub-schedulers that run as a normal task within their
  parent scheduler, but implement, for example, weighted or priority
  queuing of their subtasks, or similar features.

- "Adapter" schedulers that intercept special scheduler instructions
  (say, Deferreds or other library-specific objects), and implement
  them using more generic instructions to the underlying scheduler.
  (A rough sketch of this idea follows below.)
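To illustrate the second possibility, here is a rough sketch of how
such an adapter could look, assuming the convention of translating
Deferreds into Futures. (defer_adapter() and deferred_to_future() are
made-up names; the details would depend on the actual scheduler and
libraries involved.)

    from concurrent.futures import Future
    from twisted.internet.defer import Deferred

    def deferred_to_future(d):
        # Wrap a Twisted Deferred as a Future that fires with the
        # Deferred's eventual result or failure.
        f = Future()
        d.addCallbacks(f.set_result,
                       lambda failure: f.set_exception(failure.value))
        return f

    def defer_adapter(task):
        # An "adapter" scheduler, itself written as a generator task:
        # it drives a subtask, passing ordinary instructions through
        # to its own scheduler unchanged, but intercepting Deferred
        # instructions and translating them into Future instructions.
        value, exc = None, None
        while True:
            try:
                if exc is not None:
                    step = task.throw(exc)   # propagate errors inward
                else:
                    step = task.send(value)  # resume the subtask
            except StopIteration as stop:
                return stop.value            # subtask's final result
            value, exc = None, None
            try:
                if isinstance(step, Deferred):
                    step = deferred_to_future(step)  # translate
                value = yield step  # delegate to our own scheduler
            except Exception as e:
                exc = e             # deliver the failure to the subtask

A subtask that yields Deferreds could then run under a scheduler that
only understands Futures, with:

    result = yield from defer_adapter(subtask())

--
Piet Delport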