[stdlib-sig] futures - a new package for asynchronous execution
Brian Quinlan
brian at sweetapp.com
Fri Nov 13 03:38:33 CET 2009
Hey all,
I've compiled a summary of people's feedback (about technical issues - I
agree that the docs could be better, but agreeing on the API seems like
the first step) and have some API change proposals.
Here is a summary of the feedback:
- Use Twisted Deferreds rather than Futures
- The API is too complex
- Make Future a callable and drop the .result()/.exception() methods
- Remove .wait() from Executor
- Make it easy to process results in the order of completion rather
than in the order that the futures were generated
- Executor context managers should wait until their workers complete
before exiting
- Extract Executor.map, etc. into separate functions/modules
- FutureList has too many methods or is not necessary
- Executor should have an easy way to produce a single future
- Should be able to wait on an arbitrary list of futures
- Should have a way of avoiding deadlock (will follow up on this
separately)
Here is what I suggest as far as API changes go (the docs suck; I'll
polish them when we reach consensus):
FutureList is eliminated completely.
Future remains unchanged - I disagree that Deferreds would be better,
that .exception() is not useful, and that .result() should be
renamed .get() or .__call__(). But I am easily persuadable :-)
The Executor ABC is simplified to contain only a single method:

def Executor.submit(self, fn, *args, **kwargs):

    Submits a call for execution and returns a Future representing the
    pending result of fn(*args, **kwargs).
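To make that concrete, here's a minimal sketch of submit() together with
the unchanged Future methods. The worker function and pool size are made
up for illustration, and I'm assuming that result() and exception() wait
for the call to finish rather than raising if it hasn't:

import futures

def add(x, y):
    return x + y

with futures.ThreadPoolExecutor(5) as executor:
    # submit() returns a Future representing the pending add(1, 2) call.
    future = executor.submit(add, 1, 2)
    # exception() reports a raised exception (or None); result() returns
    # the call's return value.
    if future.exception() is not None:
        print('add() raised: %s' % future.exception())
    else:
        print('add() returned: %d' % future.result())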
map becomes a utility function:

def map(executor, fn, *iterables, timeout=None):

    Equivalent to map(fn, *iterables) but executed asynchronously and
    possibly out-of-order. The returned iterator raises a TimeoutError
    if __next__() is called and the result isn't available after
    timeout seconds from the original call to map(). If timeout is not
    specified or None then there is no limit to the wait time. If a
    call raises an exception then that exception will be raised when
    its value is retrieved from the iterator.
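In use it would look something like this (the worker function is made
up, and the callable goes right after the executor, matching the
signature above):

import futures

def square(x):
    return x * x

with futures.ThreadPoolExecutor(4) as executor:
    # Results may be yielded out-of-order; a TimeoutError is raised if a
    # result isn't available within 60 seconds of the map() call.
    for result in futures.map(executor, square, range(10), timeout=60):
        print(result)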
wait becomes a utility function that can wait on any iterable of
Futures:
def wait(futures, timeout=None, return_when=ALL_COMPLETED):

    Wait until the given condition is met for the given futures. This
    function should always be called using keyword arguments, which
    are:

    timeout can be used to control the maximum number of seconds to
    wait before returning. If timeout is not specified or None then
    there is no limit to the wait time.

    return_when indicates when the function should return. It must be
    one of the following constants:

        NEXT_COMPLETED
        NEXT_EXCEPTION
        ALL_COMPLETED
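For example (assuming the constants are exposed at module level as
futures.ALL_COMPLETED etc.; the sketch deliberately doesn't use a return
value from wait(), since none is specified above, and the worker
function is made up):

import futures

def simulate(n):
    return n * n

with futures.ThreadPoolExecutor(10) as executor:
    fs = [executor.submit(simulate, i) for i in range(10)]
    # Block until every submitted call has either finished or raised.
    futures.wait(fs, return_when=futures.ALL_COMPLETED)
    for f in fs:
        if f.exception() is None:
            print(f.result())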
A new utility function is added that iterates over the given Futures
and returns them as they are completed:

def itercompleted(futures, timeout=None):

    Returns an iterator that returns a completed Future from the given
    list when __next__() is called. If no Futures are completed when
    __next__() is called then __next__() waits until one does complete.
    Raises a TimeoutError if __next__() is called and no completed
    future is available after timeout seconds from the original call to
    itercompleted().
The URL loading example becomes:
import urllib.request

import futures

URLS = ['http://www.foxnews.com/',
        'http://www.cnn.com/',
        'http://europe.wsj.com/',
        'http://www.bbc.co.uk/',
        'http://some-made-up-domain.com/']

def load_url(url, timeout):
    return urllib.request.urlopen(url, timeout=timeout).read()

with futures.ThreadPoolExecutor(50) as executor:
    # Map each future back to the URL it is loading so the result can be
    # reported as soon as the future completes.
    fs = dict((executor.submit(load_url, url, timeout=30), url)
              for url in URLS)
    for future in futures.itercompleted(fs):
        url = fs[future]
        if future.exception() is not None:
            print('%r generated an exception: %s' % (url,
                                                     future.exception()))
        else:
            print('%r page is %d bytes' % (url, len(future.result())))
What do you think? Are we moving in the right direction?
Cheers,
Brian