[Python-ideas] PEP 3156 feedback: wait_one vs par vs concurrent.futures.wait

Guido van Rossum guido at python.org
Sat Dec 22 06:17:07 CET 2012


On Fri, Dec 21, 2012 at 8:46 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> I figure python-ideas is still the best place for PEP 3156 feedback -
> I think it's being revised too heavily for in-depth discussion on
> python-dev to be a good idea, and I think spinning out a separate list
> would lose too many people that are
> interested-but-not-enough-to-subscribe-to-yet-another-mailing-list
> (including me).
>
> The current draft of the PEP suggests the use of par() for the barrier
> operation (waiting for all futures and coroutines in a collection to
> be ready), while tentatively suggesting wait_one() as the API for
> waiting for the first completed operation in a collection. That
> inconsistency is questionable all by itself, but there's a greater
> stdlib-level inconsistency that I find more concerning.
>
> The corresponding blocking API in concurrent.futures is the module
> level "wait" function, which accepts a "return_when" parameter, with
> the permitted values FIRST_COMPLETED, FIRST_EXCEPTION and
> ALL_COMPLETED (the default). In the case where everything succeeds,
> FIRST_EXCEPTION is the same as ALL_COMPLETED. This function also
> accepts a timeout, which allows the call to return early if the
> operations take too long.
>
> This flexibility also leads to a difference in the structure of the
> return type: concurrent.futures.wait always returns a pair of sets,
> with the first set containing the futures that completed and the
> second containing those that remained incomplete at the time the call
> returned.
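>
> For reference, the synchronous usage looks roughly like this (the
> executor setup and the list of plain callables in "tasks" are just
> placeholders for this sketch):
>
>     import concurrent.futures as cf
>
>     with cf.ThreadPoolExecutor(max_workers=4) as executor:
>         fs = [executor.submit(task) for task in tasks]
>         # Block until at least one future finishes (or fails)
>         done, not_done = cf.wait(fs, return_when=cf.FIRST_COMPLETED)
>         # Wait at most 10 more seconds for the rest
>         done, not_done = cf.wait(not_done, timeout=10)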
>
> It seems to me that this "wait" API can be applied directly to the
> equivalent problems in the async space, and, accordingly, *should* be
> applied so that the synchronous and asynchronous APIs remain as
> consistent as possible.

You've convinced me. I've never used the wait() and as_completed()
APIs in c.f., but you're right that, with the exception of requiring
'yield from', they can be carried over exactly, and given that we're
doing the same thing with Future, this is eminently reasonable.
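
(For comparison, my reading of the c.f. docs is that the synchronous
as_completed() loop looks roughly like this, with fs a list of futures
and handle() a placeholder for whatever processes each result:

    import concurrent.futures as cf

    for f in cf.as_completed(fs):
        handle(f.result())  # .result() re-raises if the call raised

Nick's async as_completed() sketch below keeps that shape, replacing
the blocking f.result() with 'yield from f'.)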

I may not get to implementing these for two weeks (I'll be traveling
without a computer) but they will not be forgotten.

--Guido

> The low-level equivalent to par() would be:
>
>     incomplete = <tasks, futures or coroutines>
>     complete, incomplete = yield from tulip.wait(incomplete)
>     assert not incomplete # Without a timeout, everything should complete
>     for f in complete:
>         # Handle the completed operations
>
> Limiting the maximum execution time of any task to 10 seconds is
> straightforward:
>
>     incomplete = <tasks, futures or coroutines>
>     complete, incomplete = yield from tulip.wait(incomplete, timeout=10)
>     for f in incomplete:
>         f.cancel() # Took too long, kill it
>     for f in complete:
>         # Handle the completed operations
>
> The low-level equivalent to the wait_one() example would become:
>
>     incomplete = <tasks, futures or coroutines>
>     while incomplete:
>         complete, incomplete = yield from tulip.wait(
>             incomplete, return_when=FIRST_COMPLETED)
>         for f in complete:
>             # Handle the completed operations
>
> par() becomes easy to define as a coroutine:
>
>     @coroutine
>     def par(fs):
>         complete, incomplete = yield from tulip.wait(
>             fs, return_when=FIRST_EXCEPTION)
>         for f in incomplete:
>             f.cancel() # Something must have failed, so cancel the rest
>         # If something failed, calling f.result() will raise that exception
>         return [f.result() for f in complete]
>
> Defining wait_one() is also straightforward (although it isn't
> clearly superior to just using the underlying API directly):
>
>     @coroutine
>     def wait_one(fs):
>         complete, incomplete = yield from tulip.wait(
>             fs, return_when=FIRST_COMPLETED)
>         return complete.pop()
>
> The async equivalent to "as_completed" under this scheme is far more
> interesting, as it would be an iterator that produces coroutines:
>
>     def as_completed(fs):
>         complete = set()  # bound here so the nonlocal below is valid
>         incomplete = fs
>         while incomplete:
>             # Phase 1 of the loop, we yield a coroutine that actually
>             # starts operations running
>             @coroutine
>             def _wait_for_some():
>                 nonlocal complete, incomplete
>                 complete, incomplete = yield from tulip.wait(
>                     incomplete, return_when=FIRST_COMPLETED)
>                 return complete.pop().result()
>             yield _wait_for_some()
>             # Phase 2 of the loop, we pass back the already complete
>             # operations
>             while complete:
>                 # Note this use case for @coroutine *forcing* objects
>                 # to behave like a generator, as well as exploiting
>                 # the ability to avoid trips around the event loop
>                 @coroutine
>                 def _next_result():
>                     return complete.pop().result()
>                 yield _next_result()
>
>     # This is almost as easy to use as the synchronous equivalent, the
>     # only difference is the use of "yield from f" instead of the
>     # synchronous "f.result()"
>     for f in as_completed(fs):
>         result = yield from f
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia



-- 
--Guido van Rossum (python.org/~guido)


