On Tue, Oct 16, 2012 at 12:20 AM, Greg Ewing firstname.lastname@example.org wrote:
Guido van Rossum wrote:
But there needs to be another way to get a task running immediately and concurrently; I believe that would be
a = spawn(foo_task())
right? One could then at any later point use
ra = yield from a
Hmmm. I suppose it *could* be made to work that way, but I'm not sure it's a good idea, because it blurs the distinction between invoking a subtask synchronously and waiting for the result of a previously spawned independent task.
Are you sure you really want to distinguish between those though? In NDB they are intentionally the same -- invoking some API whose name ends in _async() starts an async subtask and returns a Future; you wait for the subtask by yielding the Future.
Starting multiple tasks is just a matter of calling several _async() APIs; then you can wait for any or all of them using yield [future1, future2, ...] *or* by yielding the futures one at a time. This gives users a gentle introduction to concurrency: first they use the synchronous APIs; then they learn to use yield foo_async(); then they learn they can write:
f = foo_async()
<other work>
r = yield f
and finally they learn about spawning multiple tasks:
f1 = foo_async()
f2 = bar_async()
rfoo, rbar = yield f1, f2
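To make the yield-a-Future style concrete, here is a minimal sketch of a driver that accepts either a single future or several futures from a generator. It is not NDB's implementation: it assumes Python 3.3+ (so a generator can return a value), uses concurrent.futures as a stand-in for NDB futures, and simply blocks on each future instead of running an event loop. The names run(), one_at_a_time() and several(), and the bodies of foo_async() and bar_async(), are made up for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=4)   # stand-in for a real async backend

def foo_async():
    # Hypothetical API: starts work right away and returns a Future.
    return _pool.submit(lambda: "foo")

def bar_async():
    return _pool.submit(lambda: "bar")

def run(gen):
    """Drive a generator that yields one Future or a sequence of Futures."""
    value = error = None
    while True:
        try:
            if error is not None:
                yielded = gen.throw(error)   # surface a failed future at the yield
            else:
                yielded = gen.send(value)    # resume with the previous result
        except StopIteration as stop:
            return stop.value
        value = error = None
        try:
            if isinstance(yielded, Future):
                value = yielded.result()
            else:
                value = [f.result() for f in yielded]
        except Exception as exc:
            error = exc

def one_at_a_time():
    f = foo_async()
    # <other work> could go here while foo runs
    r = yield f
    return r

def several():
    f1 = foo_async()
    f2 = bar_async()
    rfoo, rbar = yield f1, f2    # yields the tuple (f1, f2)
    return rfoo, rbar
```

Calling run(one_at_a_time()) or run(several()) drives each generator to completion. Both tasks in several() are already running by the time the generator yields, which is the property the _async() convention relies on.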
Recently I've been thinking about an implementation where it would look like this. First you do
t = spawn(foo_task())
but what you get back is *not* a generator; rather it's a Task object which wraps a generator and provides various operations. One of them would be
r = yield from t.wait()
which waits for the task to complete and then returns its value (or if it raised an exception, propagates the exception).
Other operations that a Task object might support include
t.unblock()          # wake up a blocked task
t.cancel()           # unschedule and clean up the task
t.throw(exception)   # raise an exception in the task
(I haven't included t.block(), because I think that should be a stand-alone function that operates on the current task. Telling some other task to block feels like a dodgy thing to do.)
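As a rough illustration of the Task idea described above, the following sketch wraps a generator in a Task object and drives it with a toy round-robin scheduler. Only spawn(), wait(), unblock(), cancel(), throw() and the stand-alone block() come from the discussion. Everything else (the BLOCK sentinel, the Cancelled exception, run(), the _ready queue, the private attributes) is an assumption made so the example runs, and it needs Python 3.3+ for yield from.

```python
from collections import deque

BLOCK = object()    # sentinel a task yields to stop being scheduled
_ready = deque()    # tasks that are runnable
_current = None     # task being stepped right now

class Cancelled(Exception):
    """Hypothetical marker stored on a cancelled task."""

class Task:
    def __init__(self, gen):
        self.gen, self.done = gen, False
        self.result = self.error = self._pending = None
        self.waiters = []

    def wait(self):
        # Generator: block until this task finishes, then return or re-raise.
        while not self.done:
            self.waiters.append(_current)
            yield from block()
        if self.error is not None:
            raise self.error
        return self.result

    def unblock(self):
        if not self.done and self not in _ready:
            _ready.append(self)

    def throw(self, exc):
        self._pending = exc     # delivered the next time the task is stepped
        self.unblock()

    def cancel(self):
        if not self.done:
            if self in _ready:
                _ready.remove(self)
            self.gen.close()
            self._finish(error=Cancelled())

    def _finish(self, result=None, error=None):
        self.done, self.result, self.error = True, result, error
        for waiter in self.waiters:
            waiter.unblock()
        self.waiters = []

def block():
    # Stand-alone: always blocks the *current* task, never some other one.
    yield BLOCK

def spawn(gen):
    task = Task(gen)
    _ready.append(task)
    return task

def run():
    global _current
    while _ready:
        task = _current = _ready.popleft()
        try:
            if task._pending is not None:
                exc, task._pending = task._pending, None
                signal = task.gen.throw(exc)
            else:
                signal = next(task.gen)
        except StopIteration as stop:
            task._finish(result=stop.value)
        except Exception as exc:
            task._finish(error=exc)
        else:
            if signal is not BLOCK:
                _ready.append(task)
        _current = None

def foo_task():
    yield               # give up a turn but stay runnable
    return 42

def sleeper():
    yield from block()  # sleeps until unblock() or throw()

def main(log):
    t = spawn(foo_task())
    log.append((yield from t.wait()))

log = []
spawn(main(log))
run()
```

In this sketch, waiting on a previously spawned task is spelled differently (yield from t.wait()) from invoking a subtask synchronously (yield from foo_task()), which keeps the distinction discussed earlier in the thread visible in the code.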
Right. I'm looking forward to a larger example.
One could also combine these and do e.g.
a = spawn(foo_task())
b = spawn(bar_task())
<do more work locally>
ra, rb = yield from par(a, b)
If you're happy to bail out at the first exception, you wouldn't strictly need a par() function for this, you could just do
a = spawn(foo_task())
b = spawn(bar_task())
ra = yield from a.wait()
rb = yield from b.wait()
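The thread does not pin down what par() does with exceptions, so the sketch below picks one possible reading purely for illustration: par() waits for every task and only then re-raises the first error it saw, whereas the two sequential wait() calls stop at the first failure in wait order. The scheduler is a deliberately crude polling loop (wait() just keeps yielding until the task is done), and run(), _ready and the example task bodies are assumptions. It needs Python 3.3+.

```python
from collections import deque

_ready = deque()   # runnable tasks for the toy scheduler

class Task:
    def __init__(self, gen):
        self.gen, self.done, self.result, self.error = gen, False, None, None

    def wait(self):
        while not self.done:
            yield                # polling, for brevity: let other tasks run
        if self.error is not None:
            raise self.error
        return self.result

def spawn(gen):
    task = Task(gen)
    _ready.append(task)
    return task

def run():
    while _ready:
        task = _ready.popleft()
        try:
            next(task.gen)
        except StopIteration as stop:
            task.done, task.result = True, stop.value
        except Exception as exc:
            task.done, task.error = True, exc
        else:
            _ready.append(task)

def par(*tasks):
    """Assumed semantics: wait for all tasks, then raise the first error."""
    results, first_error = [], None
    for t in tasks:
        try:
            results.append((yield from t.wait()))
        except Exception as exc:
            results.append(None)
            if first_error is None:
                first_error = exc
    if first_error is not None:
        raise first_error
    return results

def foo_task():
    yield
    return "foo"

def bar_task():
    yield
    yield
    return "bar"

def main(out):
    a = spawn(foo_task())
    b = spawn(bar_task())
    # <do more work locally>
    ra, rb = yield from par(a, b)
    out.append((ra, rb))

out = []
spawn(main(out))
run()
```

With this reading, the sequential spelling and par() give the same results when nothing fails. They differ only in whether the remaining tasks are still waited for after one of them raises.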
Have I got the spelling for spawn() right? In many other systems (e.g. threads, greenlets) this kind of operation takes a callable, not the result of calling a function (albeit a generator).
That's a result of the fact that a generator doesn't start running as soon as you call it. If you don't like that, the spawn() operation could be defined to take an uncalled generator and make the call for you. But I think it's useful to make the call yourself, because it gives you an opportunity to pass parameters to the task.
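The point that a generator does not start running when it is called can be checked directly with inspect.getgeneratorstate() from the standard library. In the sketch below, spawn() is only a stub that records the generator (it is not a scheduler), and spawn_call() is a hypothetical name for the alternative spelling that takes the callable and makes the call itself.

```python
import inspect

def foo_task(name, log):
    log.append("started " + name)   # runs only once the generator is stepped
    yield
    return name

_spawned = []   # stand-in for a scheduler's run queue

def spawn(gen):
    """Spelling from the thread: takes an already-called generator."""
    _spawned.append(gen)
    return gen

def spawn_call(func, *args, **kwargs):
    """Alternative spelling: takes the callable and makes the call itself."""
    return spawn(func(*args, **kwargs))

log = []
a = spawn(foo_task("a", log))       # parameters passed at the call site
b = spawn_call(foo_task, "b", log)  # parameters forwarded by spawn_call

states_before = [inspect.getgeneratorstate(g) for g in (a, b)]
log_before = list(log)              # still empty: no body has run yet

next(a)                             # stepping is what starts the body
state_after = inspect.getgeneratorstate(a)
```

Both spellings end up handing the scheduler the same kind of object. The first keeps argument passing in ordinary call syntax, while the second has to forward *args and **kwargs.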
Agreed, actually. I was just checking.
If it takes a generator, would it return the same generator or a different one to wait for?
In your version above where you wait for the task simply by calling it with yield-from, spawn() would have to return a generator (or something with the same interface). But it couldn't be the same generator -- it would have to be a wrapper that takes care of blocking until the subtask is finished.
That's fine with me (though Glyph would worry about creating too many objects).
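A sketch of the wrapper variant discussed above, under the assumption of a toy polling scheduler: spawn() schedules the original generator and hands back a different generator, so that a plain yield from on the return value waits for the subtask. The names run(), runner(), waiter() and the state dict are invented for the example, and it needs Python 3.3+.

```python
from collections import deque

_ready = deque()   # generators the toy scheduler still has to step

def spawn(gen):
    """Schedule gen now; return a *different* generator to wait on."""
    state = {"done": False, "result": None, "error": None}

    def runner():
        # Runs the subtask to completion and records how it ended.
        try:
            state["result"] = yield from gen
        except Exception as exc:
            state["error"] = exc
        state["done"] = True

    def waiter():
        # Keeps the waiting task suspended until the subtask has finished.
        while not state["done"]:
            yield
        if state["error"] is not None:
            raise state["error"]
        return state["result"]

    _ready.append(runner())
    return waiter()

def run(main):
    _ready.append(main)
    while _ready:
        gen = _ready.popleft()
        try:
            next(gen)
        except StopIteration:
            continue
        _ready.append(gen)

def foo_task():
    yield
    return "foo"

results = []

def main():
    inner = foo_task()
    a = spawn(inner)
    results.append(a is inner)   # False: the wrapper is a separate object
    ra = yield from a
    results.append(ra)

run(main())
```

Each spawn() here allocates a state dict plus two generators on top of the original one, which is the kind of per-task object overhead the last remark alludes to.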