On Tue, Oct 16, 2012 at 3:39 PM, Greg Ewing greg.ewing@canterbury.ac.nz wrote:
In any case, even if we decide to provide a scheduler instruction to enable using for-loops on suspendable iterators somehow, it doesn't follow that we should use scheduler instructions for anything *else*.
The only additional operation needed is an async equivalent to the concurrent.futures.wait() API, which would allow you to provide a set of Futures and say "let me know when one of these operations is done" (http://docs.python.org/py3k/library/concurrent.futures#concurrent.futures.wa...)
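For reference, the existing blocking API already supports the "tell me when one of these is done" request through its return_when argument. A minimal sketch follows; slow_square and first_finished are hypothetical names invented for the illustration, and the delays are arbitrary:

```python
import time
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def slow_square(n, delay):
    # Hypothetical stand-in for a blocking operation
    time.sleep(delay)
    return n * n

def first_finished():
    with ThreadPoolExecutor(max_workers=2) as pool:
        fast = pool.submit(slow_square, 3, 0.01)
        slow = pool.submit(slow_square, 4, 0.5)
        # Block until at least one of the futures has completed
        done, not_done = wait({fast, slow}, return_when=FIRST_COMPLETED)
        return fast in done, fast.result()

fast_was_done, fast_value = first_finished()
```

The async equivalent discussed here would hand back a Future to yield instead of blocking the calling thread.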
As it turns out, this shouldn't *need* a new scheduler primitive in Guido's design, since it can be handled by hooking up an appropriate callback to the supplied future objects. The following code isn't tested, but given my understanding of how Guido wants things to work, it should do what I want:
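    # Assumed here; Guido's design may supply its own Future class
    from concurrent.futures import Future

    def _wait_first(futures):
        # futures must be a set, items will be removed as they complete
        signal = Future()
        def chain_result(completed):
            if signal.done():
                # Another future already resolved this signal; leave
                # 'completed' in the set for a later _wait_first() call
                return
            futures.remove(completed)
            if completed.cancelled():
                signal.cancel()
                signal.set_running_or_notify_cancel()
            elif completed.exception() is not None:
                signal.set_exception(completed.exception())
            else:
                signal.set_result(completed.result())
        # Iterate over a copy: a future that is already done runs its
        # callback immediately, which removes it from the set
        for f in list(futures):
            f.add_done_callback(chain_result)
        return signal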
    def wait_first(futures):
        return _wait_first(set(futures))

    def as_completed(futures):
        remaining = set(futures)
        while remaining:
            yield _wait_first(remaining)
    @task
    def load_url_async(url):
        return url, (yield urllib.urlopen_async(url)).read()

    @task
    def example(urls):
        for get_next_page in as_completed(load_url_async(url) for url in urls):
            try:
                url, data = yield get_next_page
            except Exception as exc:
                print("Something broke: {}".format(exc))
            else:
                print("Loaded {} bytes from {!r}".format(len(data), url))
There's no scheduler instruction, there's just Guido's core API concept: the only thing a tasklet is allowed to yield is a Future object, and the step of registering tasks to be run is *always* done via an explicit call to the event loop rather than via the "yield" channel. The yield channel is only used to say "wait for this operation now".
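To make the "only yield Futures" rule concrete, here is a rough sketch of a driver for such a tasklet. It is not Guido's event loop: run_task is a hypothetical blocking stand-in, add_later is an invented helper, and a ThreadPoolExecutor plays the part of the event loop that work is explicitly registered with.

```python
from concurrent.futures import ThreadPoolExecutor

def run_task(gen):
    # Hypothetical blocking driver: steps a generator that yields Futures
    to_send, to_throw = None, None
    while True:
        try:
            if to_throw is not None:
                future = gen.throw(to_throw)
            else:
                future = gen.send(to_send)
        except StopIteration as stop:
            return stop.value
        to_send, to_throw = None, None
        try:
            # The yield channel only ever means "wait for this operation now"
            to_send = future.result()
        except Exception as exc:
            to_throw = exc

def add_later(pool, a, b):
    # Registering work is an explicit call, never a yielded instruction
    return pool.submit(lambda: a + b)

def example(pool):
    first = yield add_later(pool, 1, 2)
    second = yield add_later(pool, first, 10)
    return second

with ThreadPoolExecutor(max_workers=1) as pool:
    total = run_task(example(pool))
```

A real event loop would resume the tasklet from a done callback rather than blocking in result(), but the contract seen by the tasklet is the same.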
What this approach means is that, to get sensible iteration, all you need is an ordinary iterator that produces future objects instead of reporting the results directly. You can then either invoke this iterator with "yield from", in which case the individual results will be ignored and the first failure will abort the iteration, *or* you can invoke it with an explicit for loop, which will be enough to give you control over how exceptions are handled by means of an ordinary try/except block rather than a complicated exception chain.
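The difference between the two styles can be sketched as follows. Everything here is illustrative: resolved, run_task and make_futures are hypothetical helpers, the futures are completed up front so no threads are needed, and the iterator is written as a generator because "yield from" forwards sent values to its target.

```python
from concurrent.futures import Future

def resolved(value=None, error=None):
    # Hypothetical helper: build an already-completed Future
    f = Future()
    if error is not None:
        f.set_exception(error)
    else:
        f.set_result(value)
    return f

def run_task(gen):
    # Hypothetical blocking driver standing in for the event loop
    send, throw = None, None
    while True:
        try:
            future = gen.throw(throw) if throw is not None else gen.send(send)
        except StopIteration as stop:
            return stop.value
        send, throw = None, None
        try:
            send = future.result()
        except Exception as exc:
            throw = exc

def make_futures():
    # An ordinary iterator that produces futures rather than results
    yield resolved(1)
    yield resolved(error=ValueError("boom"))
    yield resolved(3)

def with_yield_from():
    # Results are discarded; the first failure propagates out of the task
    yield from make_futures()

def with_for_loop():
    results, errors = [], []
    for future in make_futures():
        try:
            results.append((yield future))
        except ValueError as exc:
            errors.append(str(exc))
    return results, errors

loop_outcome = run_task(with_for_loop())
```

With the explicit loop, the failure in the middle is handled locally and iteration continues; driving with_yield_from() through the same run_task raises the ValueError at the second item instead.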
Cheers, Nick.