[Python-ideas] The async API of the future: yield-from

Tue Oct 16 22:18:02 CEST 2012

On Tue, Oct 16, 2012 at 12:20 AM, Greg Ewing
<greg.ewing at canterbury.ac.nz> wrote:
> Guido van Rossum wrote:
>
>> But there needs to be another way to get a task running immediately
>> and concurrently; I believe that would be
>>
>> a = spawn(foo_task())
>>
>> right? One could then at any later point use
>>
>> ra = yield from a
>
>
> Hmmm. I suppose it *could* be made to work that way, but I'm
> not sure it's a good idea, because it blurs the distinction
> between invoking a subtask synchronously and waiting for the
> result of a previously spawned independent task.

Are you sure you really want to distinguish between those though? In
NDB they are intentionally the same -- invoking some API whose name
ends in _async() starts an async subtask and returns a Future; you
wait for the subtask by yielding the Future.

Starting multiple tasks is just a matter of calling several _async()
APIs; then you can wait for any or all of them using yield [future1,
future2, ...] *or* by yielding the futures one at a time. This gives
users a gentle introduction to concurrency: first they use the
synchronous APIs; then they learn to use yield foo_async(); then they
learn they can write:

f = foo_async()
<other work>
r = yield f

and finally they learn about spawning multiple tasks:

f1 = foo_async()
f2 = bar_async()
rfoo, rbar = yield f1, f2

> Recently I've been thinking about an implementation where
> it would look like this. First you do
>
>    t = spawn(foo_task())
>
> but what you get back is *not* a generator; rather it's
> a Task object which wraps a generator and provides various
> operations. One of them would be
>
>    r = yield from t.wait()
>
> which waits for the task to complete and then returns its
> value (or if it raised an exception, propagates the exception).
>
> Other operations that a Task object might support include
>
>    t.unblock()        # wake up a blocked task
>    t.cancel()         # unschedule and clean up the task
>    t.throw(exception) # raise an exception in the task
>
> (I haven't included t.block(), because I think that should
> be a stand-alone function that operates on the current task.
> Telling some other task to block feels like a dodgy thing
> to do.)

Right. I'm looking forward to a larger example.
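
In the meantime, a rough sketch of what such a Task object could look
like, under stated assumptions: a toy round-robin Scheduler (a
hypothetical name), tasks that give up control with a bare yield, and
only wait() and unblock() implemented. cancel() and throw() are left
out, and none of this is Greg's actual code.

```python
from collections import deque


class Scheduler:
    """Toy round-robin scheduler (hypothetical, for illustration only)."""

    def __init__(self):
        self.ready = deque()
        self.current = None

    def spawn(self, gen):
        task = Task(gen, self)
        self.ready.append(task)
        return task

    def run(self):
        while self.ready:
            self.current = self.ready.popleft()
            self.current.step()
        self.current = None


class Task:
    """Wraps a generator; this is what spawn() returns, not the generator."""

    def __init__(self, gen, sched):
        self.gen = gen
        self.sched = sched
        self.done = False
        self.blocked = False
        self.result = None
        self.exception = None
        self.waiters = []

    def step(self):
        try:
            next(self.gen)
        except StopIteration as stop:
            self._finish(result=stop.value)
        except Exception as exc:
            self._finish(exception=exc)
        else:
            if not self.blocked:
                self.sched.ready.append(self)  # plain yield: reschedule

    def _finish(self, result=None, exception=None):
        self.done = True
        self.result = result
        self.exception = exception
        for waiter in self.waiters:
            waiter.unblock()
        self.waiters = []

    def unblock(self):
        """Wake up a blocked task."""
        if self.blocked:
            self.blocked = False
            self.sched.ready.append(self)

    def wait(self):
        """Use as `r = yield from t.wait()` from inside another task."""
        if not self.done:
            me = self.sched.current
            self.waiters.append(me)
            me.blocked = True  # blocks the *current* task, not this one
            yield
        if self.exception is not None:
            raise self.exception
        return self.result


def foo_task():
    yield
    return "foo"


def bar_task():
    yield
    yield
    return "bar"


def failing_task():
    yield
    raise ValueError("boom")


def main(sched, out):
    a = sched.spawn(foo_task())
    b = sched.spawn(bar_task())
    ra = yield from a.wait()
    rb = yield from b.wait()
    out.append((ra, rb))
    c = sched.spawn(failing_task())
    try:
        yield from c.wait()
    except ValueError as exc:  # the subtask's exception propagates here
        out.append(str(exc))


sched = Scheduler()
out = []
sched.spawn(main(sched, out))
sched.run()
```

Waiting on a and then b in sequence, as in main(), bails out at the
first exception, which is the behaviour discussed for the no-par() case.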

>> One could also combine these and do e.g.
>>
>> a = spawn(foo_task())
>> b = spawn(bar_task())
>> <do more work locally>
>> ra, rb = yield from par(a, b)
>
>
> If you're happy to bail out at the first exception, you
> wouldn't strictly need a par() function for this, you could
> just do
>
>
>    a = spawn(foo_task())
>    b = spawn(bar_task())
>    ra = yield from a.wait()
>    rb = yield from b.wait()
>
>
>> Have I got the spelling for spawn() right? In many other systems (e.g.
>> threads, greenlets) this kind of operation takes a callable, not the
>> result of calling a function (albeit a generator).
>
>
> That's a result of the fact that a generator doesn't start
> running as soon as you call it. If you don't like that, the
> spawn() operation could be defined to take an uncalled generator
> and make the call for you. But I think it's useful to make the
> call yourself, because it gives you an opportunity to pass
> parameters to the task.

Agreed, actually. I was just checking.

>> If it takes a
>> generator, would it return the same generator or a different one to
>> wait for?
>
>
> In your version above where you wait for the task simply
> by calling it with yield-from, spawn() would have to return a
> generator (or something with the same interface). But it
> couldn't be the same generator -- it would have to be a wrapper
> that takes care of blocking until the subtask is finished.

That's fine with me (though Glyph would worry about creating too many objects).

-- 
--Guido van Rossum (python.org/~guido)