[Python-ideas] The async API of the future: yield-from

Nick Coghlan ncoghlan at gmail.com
Tue Oct 16 09:48:24 CEST 2012


On Tue, Oct 16, 2012 at 3:39 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> In any case, even if we decide to provide a scheduler
> instruction to enable using for-loops on suspendable
> iterators somehow, it doesn't follow that we should use
> scheduler instructions for anything *else*.

The only additional operation needed is an async equivalent to the
concurrent.futures.wait() API, which would allow you to provide a set
of Futures and say "let me know when one of these operations is done"
(http://docs.python.org/py3k/library/concurrent.futures#concurrent.futures.wait)

As it turns out, this shouldn't *need* a new scheduler primitive in
Guido's design, since it can be handled by hooking up an appropriate
callback to the supplied future objects. The following code isn't
tested, but given my understanding of how Guido wants things to work,
it should do what I want:

    def _wait_first(futures):
        # futures must be a set; items are removed as they complete
        signal = Future()
        def chain_result(completed):
            futures.remove(completed)
            if signal.done():
                return  # an earlier completion already chained a result
            if completed.cancelled():
                signal.cancel()
                signal.set_running_or_notify_cancel()
            elif completed.exception() is not None:
                signal.set_exception(completed.exception())
            else:
                signal.set_result(completed.result())
        # Iterate over a copy: if a future is already done, its callback
        # fires immediately and mutates the set
        for f in list(futures):
            f.add_done_callback(chain_result)
        return signal

    def wait_first(futures):
        return _wait_first(set(futures))

    def as_completed(futures):
        remaining = set(futures)
        while remaining:
            yield _wait_first(remaining)
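Since these helpers rely only on the standard concurrent.futures.Future
interface, their signalling behaviour can be exercised without any event
loop at all. Here's a self-contained sketch (repeating the helper, with
purely illustrative values) showing that the signal future resolves with
whichever input future completes first:

```python
# Demonstration of the _wait_first idea using plain
# concurrent.futures.Future objects, no event loop required.
from concurrent.futures import Future

def _wait_first(futures):
    # futures must be a set; items are removed as they complete
    signal = Future()
    def chain_result(completed):
        futures.remove(completed)
        if signal.done():
            return  # an earlier completion already chained a result
        if completed.cancelled():
            signal.cancel()
            signal.set_running_or_notify_cancel()
        elif completed.exception() is not None:
            signal.set_exception(completed.exception())
        else:
            signal.set_result(completed.result())
    # Iterate over a copy: a done future fires its callback immediately
    for f in list(futures):
        f.add_done_callback(chain_result)
    return signal

a, b = Future(), Future()
pending = {a, b}
signal = _wait_first(pending)
b.set_result(42)          # b finishes first
print(signal.result())    # -> 42
print(a in pending)       # -> True (a is still outstanding)
```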

    @task
    def load_url_async(url):
        return url, (yield urllib.urlopen_async(url)).read()

    @task
    def example(urls):
        for get_next_page in as_completed(load_url_async(url) for url in urls):
            try:
                url, data = yield get_next_page
            except Exception as exc:
                print("Something broke: {}".format(exc))
            else:
                print("Loaded {} bytes from {!r}".format(len(data), url))

There's no scheduler instruction, there's just Guido's core API
concept: the only thing a tasklet is allowed to yield is a Future
object, and the step of registering tasks to be run is *always* done
via an explicit call to the event loop rather than via the "yield"
channel. The yield channel is only used to say "wait for this
operation now".

What this approach means is that, to get sensible iteration, all you
need is an ordinary iterator that produces future objects instead of
reporting the results directly. You can then either drive that iterator
with "yield from", in which case the individual results will be
ignored and the first failure will abort the iteration, *or* you can
consume it with an explicit for loop, which is enough to give you
control over how exceptions are handled by means of an ordinary
try/except block rather than a complicated exception chain.
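To make that contrast concrete, here is an illustrative, self-contained
sketch of both consumption styles, using pre-resolved
concurrent.futures.Future objects and a trivial blocking driver in
place of a real event loop (every name here is invented for the
example):

```python
# Two ways to consume an ordinary iterator that produces futures.
from concurrent.futures import Future

def make_future(value):
    f = Future()
    f.set_result(value)
    return f

def future_iter(values):
    # An ordinary iterator producing future objects, not results.
    for v in values:
        yield make_future(v)

def run(gen):
    # Minimal driver: resume the generator with each future's result.
    result = None
    try:
        while True:
            result = gen.send(result).result()
    except StopIteration:
        pass

# Style 1: "yield from" -- individual results are discarded, and the
# first failure would abort the whole iteration.
def drain(values):
    yield from future_iter(values)

# Style 2: explicit for loop -- each wait sits inside its own
# try/except, so one failure does not end the iteration.
def collect(values, out):
    for fut in future_iter(values):
        try:
            out.append((yield fut))
        except Exception as exc:
            out.append(exc)

results = []
run(collect(["a", "b"], results))
print(results)  # -> ['a', 'b']
```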

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia


