[Python-ideas] The async API of the future: yield-from

Greg Ewing greg.ewing at canterbury.ac.nz
Tue Oct 16 02:13:58 CEST 2012


Nick Coghlan wrote:
> To me, "yield from" is just a tool that brings generators back to
> parity with functions when it comes to breaking up a larger algorithm
> into smaller pieces. Where you would break a function out into
> subfunctions and call them normally, with a generator you can break
> out subgenerators and invoke them with yield from.

That's exactly correct. It's the way I intended "yield from"
to be thought of right from the beginning.

What I'm arguing is that the *public api* for any suspendable
operation should be in the form of something called using
yield-from, because you never know when the implementation
might want to break it down into sub-operations and use
yield-from to call *them*.
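
To make that concrete, here's a minimal self-contained sketch
(all the names are invented for illustration, not a proposed
API). The public entry point is a generator that you call with
yield-from, so its implementation is free to delegate to
subgenerators later without any caller having to change:

    def read_line(chunks):
        # Lowest-level suspendable operation: it yields whenever
        # it would block (faked here by pulling from an iterator).
        line = b''
        while not line.endswith(b'\n'):
            yield                   # "suspend until data is ready"
            line += next(chunks)
        return line

    def read_request(chunks):
        # Public API: callers write "req = yield from read_request(c)".
        # Internally it can be broken into sub-operations freely.
        first = yield from read_line(chunks)
        second = yield from read_line(chunks)
        return first, second

    # Driving it by hand, standing in for a scheduler:
    data = iter([b'GET /', b' HTTP/1.0\n', b'Host: x\n'])
    task = read_request(data)
    try:
        while True:
            next(task)              # each next() resumes the whole stack
    except StopIteration as e:
        print(e.value)              # (b'GET / HTTP/1.0\n', b'Host: x\n')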

> Any meaningful use of "yield from" in the coroutine context *has* to
> ultimately devolve to an operation that:
> 1. Asks the scheduler to schedule another operation
> 2. Waits for that operation to complete

I don't think I would put it quite that way. In my view
of things, at least, the scheduler doesn't schedule "operations"
(in the sense of "read some bytes from this socket", etc.).
Rather, it schedules the running of tasks.

So the breakdown is really:

1. Start an operation (this doesn't involve the scheduler)
2. Ask the scheduler to suspend this task until the
    operation is finished

Also, this breakdown is only necessary at the very lowest
level, where you want to do something that isn't provided
in the form of a generator.
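
Here's a sketch of what that lowest level might look like,
with a toy select-based scheduler (again, every name here is
made up for illustration; a real framework would differ in
the details):

    import select

    class Scheduler:
        def __init__(self):
            self.ready = []         # runnable tasks (generators)
            self.reading = {}       # socket -> task blocked on it
            self.current = None

        def add_task(self, task):
            self.ready.append(task)

        def block_for_reading(self, sock):
            # Take the current task out of circulation; it goes
            # back on the ready list when sock becomes readable.
            self.reading[sock] = self.current
            self.current = None

        def run(self):
            while self.ready or self.reading:
                if not self.ready:
                    rlist, _, _ = select.select(list(self.reading), [], [])
                    for sock in rlist:
                        self.ready.append(self.reading.pop(sock))
                self.current = self.ready.pop(0)
                try:
                    next(self.current)
                except StopIteration:
                    continue
                if self.current is not None:
                    self.ready.append(self.current)  # bare yield: reschedule

    scheduler = Scheduler()

    def sock_recv(sock, nbytes):
        # The two-step breakdown, and the only place it's needed:
        scheduler.block_for_reading(sock)   # 1. start the operation
        yield                               # 2. suspend until it's done
        return sock.recv(nbytes)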

Obviously it's *possible* to treat each level of the call
chain as its own subtask: you spawn it independently and
then wait for it to finish. That's what people have done
in the past with their trampoline schedulers that interpret
yielded "call" and "return" instructions.

But one of the purposes of yield-from is to relieve the
scheduler of the need to handle things at that level of
granularity. It can treat a generator together with all
the subgenerators it might call as a *single* task, the
same way that a greenlet is thought of as a single task,
however many levels of function calls it might make.
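
A tiny example of what this buys you: however deeply the
subgenerators nest, the scheduler holds exactly one generator
object and just calls next() on it:

    def sub_op():
        yield                    # suspension propagates up automatically
        return 42

    def middle():
        result = yield from sub_op()
        return result + 1

    def task():
        value = yield from middle()
        print('got', value)      # got 43

    t = task()                   # the *single* object a scheduler holds
    next(t)                      # runs down into sub_op's yield
    try:
        next(t)                  # resumes sub_op; returns ripple back up
    except StopIteration:
        pass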

> I *thought* Greg's way combined step 1 and step 2 into a single
> operation: the objects you yield *not only* say what you want to wait
> for, but also what you want to do.

I don't actually yield objects at all, but...

> However, his example par()
> implementation killed that idea, since it turned out to need to
> schedule tasks explicitly rather than there being an "execute this in
> parallel" option.

I don't see how that's a problem. Seems to me it's just as
easy for the user to call a par() function as it is to yield
a tuple of tasks. And providing this functionality using a
function means that different versions or options can be
made available for variations such as different ways of
handling exceptions. Using yield, you need to pick one of
the variations and bless it as being the one that you
invoke using special syntax.

If you're complaining that the implementation of par()
seems too complex, well, that complexity has to occur
*somewhere* -- if it's not in the par() function, then
it will turn up inside whatever part of the scheduler
handles the case that it's given a tuple of tasks.
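
For the record, that complexity is roughly of this shape. This
is a simplified stand-alone variant, *not* the par() from
earlier in the thread: it round-robins its children itself
instead of handing them to the scheduler, and it just lets
exceptions propagate, which is exactly the sort of policy a
different variant could choose differently:

    def par(*children):
        # Run child generators in lockstep; collect return values.
        results = [None] * len(children)
        pending = list(enumerate(children))
        while pending:
            still_running = []
            for i, child in pending:
                try:
                    next(child)
                except StopIteration as e:
                    results[i] = e.value     # child finished
                else:
                    still_running.append((i, child))
                    yield                    # par() itself suspends
            pending = still_running
        return results

    # Usage: first, second = yield from par(task_a(), task_b())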

> So now I'm back to think that Greg and Guido are talking about
> different levels. *Any* scheduling option will be able to be collapsed
> into an async task invoked by "yield from" by writing:
> 
>     def simple_async_task():
>         return (yield start_task())

Yes... or another implementation that works some way
other than yielding instructions to the scheduler.

> I haven't seen anything to suggest that
> "yield from"'s role should change from what it is in 3.3: a way to
> factor out generators into multiple pieces without breaking send()
> and throw().

I don't think anyone is suggesting that. I'm certainly not.

-- 
Greg


