[stdlib-sig] futures - a new package for asynchronous execution

Sat Nov 7 02:12:00 CET 2009

[I am going to be lazy and mass reply to people with a top-post; you
can burn an effigy of me later]

In response to Guido, yes, I sent Brian here with my PEP editor hat
on. Unfortunately the hat was on rather firmly and I totally forgot to
check to see how old the code is. Yet another reason I need to get the
Hg conversion done so I can start writing an "Adding to the Stdlib"
PEP.

To Antoine's Twisted comment, I don't see a direct comparison. From my
understanding, Twisted's Deferred objects are a way to have callbacks
executed once an async event occurs, not to help execute code
concurrently. So I don't see enough similarity to discount the idea
Brian is pushing forward.

As for Benjamin's one year reminder, since I don't see this happening
in time for Python 3.2a1 it isn't a major worry right now. That means
this won't land until Python 3.3, which gives everyone until that
alpha, which would probably be June 2011. So that would give Brian (and
Jesse, if he gets involved like it sounds he will) time to work out the
API, get public feedback, and get the code checked in. The only way I
would feel comfortable letting this in past 3.2a1 would be if it
landed before 3.2a4 and Jesse personally shuttled it through, started
work on it now, and REALLY pushed it. But as he knows from personal
experience, rushing a module into the stdlib can bite you in the ass.
=)

But I really do like the idea. java.util.concurrent and Grand Central
Dispatch being out there shows, I think, some demand for a way to
easily abstract out concurrency management and leave it up to a
library.

-Brett

On Fri, Nov 6, 2009 at 14:35, Brian Quinlan <brian at sweetapp.com> wrote:
> Hey all,
>
> I'd like to propose adding a module/package to Python that makes it easy to
> parallelize arbitrary function calls.
> I recently wrote a solution for the use case of parallelizing network copies
> and RPC using threads without forcing the user to explicitly create thread
> pools, work queues, etc.
> I have a concrete implementation that I'll describe below but I'd be happy
> to hear about other strategies!
> The basic idea is to implement an asynchronous execution method patterned
> heavily on java.util.concurrent (but less lame because Python has functions
> as first-class objects).  Here is a fairly advanced example:
>
> import futures
> import functools
> import urllib.request
>
> URLS = [
>    'http://www.foxnews.com/',
>    'http://www.cnn.com/',
>    'http://europe.wsj.com/',
>    'http://www.bbc.co.uk/',
>    'http://some-made-up-domain.com/']
>
> def load_url(url, timeout):
>    return urllib.request.urlopen(url, timeout=timeout).read()
>
> # Use a thread pool with 5 threads to download the URLs. Using a pool
> # of processes would involve changing the initialization to:
> #   with futures.ProcessPoolExecutor(max_processes=5) as executor
> with futures.ThreadPoolExecutor(max_threads=5) as executor:
>    future_list = executor.run_to_futures(
>        [functools.partial(load_url, url, 30) for url in URLS])
>
> # Check the results of each future.
> for url, future in zip(URLS, future_list):
>    if future.exception() is not None:
>        print('%r generated an exception: %s' % (url, future.exception()))
>    else:
>        print('%r page is %d bytes' % (url, len(future.result())))
>
> In this example, executor.run_to_futures() returns only when every url has
> been retrieved but it is possible to return immediately, on the first
> completion or on the first failure depending on the desired work pattern.
>
> The complete docs are here:
> http://sweetapp.com/futures/
>
> A draft PEP is here:
> http://code.google.com/p/pythonfutures/source/browse/trunk/PEP.txt
>
> And the code is here:
> http://pypi.python.org/pypi/futures3/
>
> All feedback appreciated!
>
> Cheers,
> Brian
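
The work patterns Brian mentions (returning immediately, on the first
completion, or on the first failure) can be sketched with the
concurrent.futures module that later shipped in the standard library.
This is only an illustration under stated assumptions: it does not use
the futures3 API from the message above (run_to_futures, max_threads),
and work() and run() are hypothetical helpers, with work() standing in
for load_url by sleeping instead of touching the network.

```python
import concurrent.futures
import time


def work(delay, fail=False):
    """Hypothetical stand-in for load_url: sleep, then return or raise."""
    time.sleep(delay)
    if fail:
        raise ValueError('task with delay %r failed' % delay)
    return delay


def run(return_when):
    """Submit three tasks, wait per the given policy, count done/pending."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # submit() returns a future immediately, without blocking.
        futures = [
            executor.submit(work, 0.05),            # finishes first
            executor.submit(work, 0.3, fail=True),  # raises second
            executor.submit(work, 0.6),             # finishes last
        ]
        # wait() blocks until the chosen condition is met.
        done, not_done = concurrent.futures.wait(
            futures, return_when=return_when)
        return len(done), len(not_done)


if __name__ == '__main__':
    for policy in (concurrent.futures.FIRST_COMPLETED,
                   concurrent.futures.FIRST_EXCEPTION,
                   concurrent.futures.ALL_COMPLETED):
        print(policy, run(policy))
```

Note that leaving the with block still waits for the remaining tasks to
finish; the counts only reflect the state at the moment wait() returned.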