[Python-ideas] The async API of the future: yield-from

Guido van Rossum guido at python.org
Mon Oct 15 17:53:49 CEST 2012


On Sun, Oct 14, 2012 at 10:58 PM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
>> Why wouldn't all generators that aren't blocked for I/O just run until
>> their next yield, in a round-robin fashion? That's fair enough for me.
>>
>> But as I said, my intuition for how things work in Greg's world is not
>> very good.
>
>
> That's exactly how my scheduler behaves.
>
>
>> OTOH I am okay with only getting one of the exceptions. But I think
>> all of the remaining tasks should still be run to completion -- maybe
>> the caller just cared about their side effects. Or maybe this should
>> be an option to par().
>
>
> This is hard to answer without considering real use cases,
> but my feeling is that if I care enough about the results of
> the subtasks to wait until they've all completed before continuing,
> then if anything goes wrong in any of them, I might as well abandon
> the whole computation.
>
> If that's not the case, I'd be happy to wrap each one in a
> try-except that doesn't propagate the exception to the main
> task, but just records the information that the subtask
> failed somewhere, for the main task to check afterwards.
>
> Another direction to approach this is to consider that par()
> ought to be just an optimisation -- the result should be the same
> as if you'd written sequential code to perform the subtasks
> one after another. And in that case, an exception in one would
> prevent any of the following ones from executing, so it's fine
> if par() behaves like that, too.
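
For concreteness, the wrapping Greg describes might look like this --
a minimal sketch assuming Python 3.3's yield-from (PEP 380) semantics;
capture() and the toy subtasks are made up for illustration, not part
of any proposed API:

def capture(name, subtask, results, failures):
    # Run the subtask and record its outcome for the main task
    # to check afterwards, instead of propagating the exception.
    try:
        results[name] = yield from subtask
    except Exception as exc:
        failures[name] = exc

def ok():
    yield  # stand-in for real I/O
    return 42

def boom():
    yield
    raise ValueError('subtask failed')

def main():
    results, failures = {}, {}
    # Sequential here for brevity; under par() each capture()
    # wrapper would run as a concurrent subtask.
    yield from capture('a', ok(), results, failures)
    yield from capture('b', boom(), results, failures)
    return results, failures

# Drive to completion; the return value arrives via StopIteration.
m = main()
try:
    while True:
        next(m)
except StopIteration as stop:
    print(stop.value)  # -> ({'a': 42}, {'b': ValueError(...)})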

I'd think of such a par() more as something that saves me typing than
as an optimization. Anyway, the key functionality I cannot live
without here is to start multiple tasks concurrently. It seems that
without par() or some other scheduling primitive, you cannot do that:
if I write

a = foo_task()  # Search google
b = bar_task()  # Search bing
ra = yield from a
rb = yield from b
# now compare search results

the tasks run sequentially. A good par() should run them concurrently.
But there needs to be another way to get a task running immediately
and concurrently; I believe that would be

a = spawn(foo_task())

right? One could then at any later point use

ra = yield from a

One could also combine these and do e.g.

a = spawn(foo_task())
b = spawn(bar_task())
<do more work locally>
ra, rb = yield from par(a, b)
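
Concretely, here is one way spawn() and par() might compose, sketched
on a toy round-robin trampoline. Task, wait() and run() are invented
for illustration, not a proposed API, and this again assumes Python
3.3's yield-from semantics:

class Task:
    # Handle returned by spawn(); waiting on it is distinct from
    # iterating the wrapped generator directly.
    def __init__(self, gen):
        self.gen = gen
        self.result = None
        self.done = False
    def __iter__(self):
        # Lets the caller write 'ra = yield from a', as above.
        return wait(self)

READY = []  # runnable tasks, serviced round-robin

def spawn(gen):
    # Schedule an already-called generator; return a handle to wait on.
    task = Task(gen)
    READY.append(task)
    return task

def wait(task):
    # Suspend (by yielding) until the task finishes, then return its result.
    while not task.done:
        yield
    return task.result

def par(*tasks):
    # The tasks are already running concurrently; this just collects
    # their results in order.
    results = []
    for t in tasks:
        results.append((yield from t))
    return results

def run(main):
    # Round-robin: each pass advances one generator to its next yield;
    # StopIteration carries a generator's return value (PEP 380).
    spawn(main)
    while READY:
        task = READY.pop(0)
        try:
            next(task.gen)
        except StopIteration as stop:
            task.result = stop.value
            task.done = True
        else:
            READY.append(task)

With toy tasks standing in for the searches:

def foo_task():
    yield  # pretend to wait for google
    return 'google results'

def bar_task():
    yield  # pretend to wait for bing
    return 'bing results'

def main():
    a = spawn(foo_task())
    b = spawn(bar_task())
    # <do more work locally>
    ra, rb = yield from par(a, b)
    print(ra, rb)

run(main())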

Have I got the spelling for spawn() right? In many other systems (e.g.
threads, greenlets) this kind of operation takes a callable, not the
result of calling a function (albeit a generator). If it takes a
generator, would it return the same generator or a different one to
wait for?
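
For what it's worth, a callable-taking spelling could be layered on
the generator-taking one; spawn_callable() below is hypothetical and
builds on the Task sketch above:

def spawn_callable(func, *args):
    # Thread/greenlet-style: pass the generator function itself,
    # e.g. a = spawn_callable(foo_task), and let the scheduler call it.
    return spawn(func(*args))

Note that in that sketch spawn() returns a distinct handle rather than
the generator itself, so the scheduler and a later 'yield from a' never
end up iterating the same generator.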

-- 
--Guido van Rossum (python.org/~guido)


