[Python-Dev] [PEP 3148] futures - execute computations asynchronously
Brian Quinlan
brian at sweetapp.com
Sat Mar 6 11:32:50 CET 2010
On 6 Mar 2010, at 17:50, Phillip J. Eby wrote:
> At 01:19 AM 3/6/2010, Jeffrey Yasskin wrote:
>> On Fri, Mar 5, 2010 at 10:11 PM, Phillip J. Eby <pje at telecommunity.com> wrote:
>> > I'm somewhat concerned that, as described, the proposed API ...
>> > [creates] yet another alternative (and
>> > mutually incompatible) event loop system in the stdlib ...
>>
>> Futures are a blocking construct; they don't involve an event loop.
>
> And where they block is in a loop, waiting for events (completed
> promises) coming back from other threads or processes.
>
> The Motivation section of the PEP also stresses avoiding reinvention
> of such loops, and points to the complication of using more than one
> at a time as a justification for the mechanism. It seems relevant
> to at least address why wrapping multiprocessing and multithreading
> is appropriate, but *not* dealing with any other form of sync/async
> boundary, *or* composition of futures.
>
> On which subject, I might add, the PEP is silent on whether
> executors are reentrant to the called code. That is, can I call a
> piece of code that uses futures, using the futures API? How will
> the called code know what executor to use? Must I pass it one
> explicitly? Will that work across threads and processes, without
> explicit support from the API?
Executors are reentrant but deadlock is possible. There are two
deadlock examples in the PEP.
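To make the reentrancy hazard concrete, here is a minimal sketch of the kind of deadlock the PEP warns about, written against the module as it later shipped in Python 3.2's concurrent.futures (where the parameter is max_workers rather than the draft's max_threads). With a single worker, the outer task would occupy the only thread while waiting on the inner future, which could then never run; with two workers it completes:

```python
from concurrent.futures import ThreadPoolExecutor

# Two workers, so the nested submission can be scheduled. With
# max_workers=1 the outer task would hold the only worker while
# blocked in f.result(), and pow(5, 2) would never start: deadlock.
executor = ThreadPoolExecutor(max_workers=2)

def nested():
    # A task that submits more work to the same executor and waits.
    f = executor.submit(pow, 5, 2)
    return f.result()

result = executor.submit(nested).result()
executor.shutdown()
print(result)  # 25
```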
>
> IOW, as far as I can tell from the PEP, it doesn't look like you can
> compose futures without *global* knowledge of the application...
> and in and of itself, this seems to negate the PEP's own motivation
> to prevent duplication of parallel execution handling!
>
> That is, if I use code from module A and module B that both want to
> invoke tasks asynchronously, and I want to invoke A and B
> asynchronously, what happens? Based on the design of the API, it
> appears there is nothing you can do except refactor A and B to take
> an executor in a parameter, instead of creating their own.
A and B could both use their own executor instances. You would need to
refactor A and B if you wanted to manage thread and process counts
globally.
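A hypothetical sketch of that arrangement (module and class names are invented for illustration): each module owns a private executor, so they compose without any shared global state; the cost is that the total thread count is the sum of the two pools rather than an application-wide cap.

```python
from concurrent.futures import ThreadPoolExecutor

class A:
    """Stands in for module A: owns its own thread pool."""
    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=2)
    def work(self, n):
        return self._executor.submit(pow, n, 2).result()

class B:
    """Stands in for module B: also owns its own thread pool."""
    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=2)
    def work(self, n):
        # B invokes A asynchronously without knowing about A's
        # executor; the two pools never interact, so no deadlock.
        a = A()
        return self._executor.submit(a.work, n).result()

print(B().work(3))  # 9
```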
> It seems therefore to me that either the proposal does not define
> its scope/motivation very well, or it is not well-equipped to
> address the problem it's setting out to solve. If it's meant to be
> something less ambitious -- more like a recipe or example -- it
> should properly motivate that scope. If it's intended to be a
> robust tool for composing different pieces of code, OTOH, it should
> absolutely address the issue of writing composable code... since,
> that seems to be what it says the purpose of the API is. (I.e.,
> composing code to use a common waiting loop.)
My original motivation when designing this module was having to deal
with a lot of code that looks like this:
def get_some_user_info(user):
    x = make_ldap_call1(user)
    y = make_ldap_call2(user)
    z = [make_db_call(user, i) for i in something]
    # Do some processing with x, y, z and return a result
Doing these operations serially is too slow. So how do I parallelize
them? Using the threading module is the obvious choice, but having to
create my own work/result queue every time I encounter this pattern is
annoying. The futures module lets you write this as:
def get_some_user_info(user):
    with ThreadPoolExecutor(max_threads=10) as executor:
        x_future = executor.submit(make_ldap_call1, user)
        y_future = executor.submit(make_ldap_call2, user)
        z_futures = [executor.submit(make_db_call, user, i)
                     for i in something]
        finished, _ = wait([x_future, y_future] + z_futures,
                           return_when=FIRST_EXCEPTION)
        for f in finished:
            if f.exception():
                raise f.exception()
        x = x_future.result()
        y = y_future.result()
        z = [f.result() for f in z_futures]
        # Do some processing with x, y, z and return a result
> And, existing Python async APIs (such as Twisted's Deferreds)
> actually *address* this issue of composition; the PEP does not.
> Hence my comments about not looking at existing implementations for
> API and implementation guidance. (With respect to what the API
> needs, and how it needs to do it, not necessarily directly copying
> actual APIs or implementations. Certainly some of the Deferred API
> naming has a rather, um, "twisted" vocabulary.)
Using Twisted (or any other asynchronous I/O framework) forces you to
rewrite your I/O code. Futures do not.
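That is the key contrast: existing blocking code runs unchanged inside an executor, with no callback-style restructuring. A minimal sketch (blocking_fetch is a made-up stand-in for any existing blocking LDAP/DB/HTTP call):

```python
from concurrent.futures import ThreadPoolExecutor

def blocking_fetch(name):
    # Stands in for an existing blocking I/O call; it needs no
    # changes to be driven by an executor.
    return "data for %s" % name

with ThreadPoolExecutor(max_workers=4) as executor:
    future = executor.submit(blocking_fetch, "alice")
    result = future.result()

print(result)  # data for alice
```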
Cheers,
Brian