[Python-Dev] [PEP 3148] futures - execute computations asynchronously
Brian Quinlan
brian at sweetapp.com
Sat Mar 6 11:32:50 CET 2010
On 6 Mar 2010, at 17:50, Phillip J. Eby wrote:
> At 01:19 AM 3/6/2010, Jeffrey Yasskin wrote:
>> On Fri, Mar 5, 2010 at 10:11 PM, Phillip J. Eby <pje at telecommunity.com> wrote:
>> > I'm somewhat concerned that, as described, the proposed API ...
>> > [creates] yet another alternative (and
>> > mutually incompatible) event loop system in the stdlib ...
>>
>> Futures are a blocking construct; they don't involve an event loop.
>
> And where they block is in a loop, waiting for events (completed
> promises) coming back from other threads or processes.
>
> The Motivation section of the PEP also stresses avoiding reinvention
> of such loops, and points to the complication of using more than one
> at a time as a justification for the mechanism. It seems relevant
> to at least address why wrapping multiprocessing and multithreading
> is appropriate, but *not* dealing with any other form of sync/async
> boundary, *or* composition of futures.
>
> On which subject, I might add, the PEP is silent on whether
> executors are reentrant to the called code. That is, can I call a
> piece of code that uses futures, using the futures API? How will
> the called code know what executor to use? Must I pass it one
> explicitly? Will that work across threads and processes, without
> explicit support from the API?
Executors are reentrant but deadlock is possible. There are two
deadlock examples in the PEP.
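To make the reentrancy hazard concrete, here is a minimal sketch of the kind of deadlock the PEP warns about, written against the module as it later shipped in Python 3.2's concurrent.futures (where the parameter is max_workers rather than the draft's max_threads). With a single worker, the outer task would occupy the only thread while waiting on the inner future, which could then never run; with two workers it completes:

```python
from concurrent.futures import ThreadPoolExecutor

# Two workers, so the nested submission can be scheduled. With
# max_workers=1 the outer task would hold the only worker while
# blocked in f.result(), and pow(5, 2) would never start: deadlock.
executor = ThreadPoolExecutor(max_workers=2)

def nested():
    # A task that submits more work to the same executor and waits.
    f = executor.submit(pow, 5, 2)
    return f.result()

result = executor.submit(nested).result()
executor.shutdown()
print(result)  # 25
```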
>
> IOW, as far as I can tell from the PEP, it doesn't look like you can
> compose futures without *global* knowledge of the application...
> and in and of itself, this seems to negate the PEP's own motivation
> to prevent duplication of parallel execution handling!
>
> That is, if I use code from module A and module B that both want to
> invoke tasks asynchronously, and I want to invoke A and B
> asynchronously, what happens? Based on the design of the API, it
> appears there is nothing you can do except refactor A and B to take
> an executor in a parameter, instead of creating their own.
A and B could both use their own executor instances. You would need to
refactor A and B if you wanted to manage thread and process counts
globally.
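A hypothetical sketch of that arrangement (module and class names are invented for illustration): each module owns a private executor, so they compose without any shared global state; the cost is that the total thread count is the sum of the two pools rather than an application-wide cap.

```python
from concurrent.futures import ThreadPoolExecutor

class A:
    """Stands in for module A: owns its own thread pool."""
    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=2)
    def work(self, n):
        return self._executor.submit(pow, n, 2).result()

class B:
    """Stands in for module B: also owns its own thread pool."""
    def __init__(self):
        self._executor = ThreadPoolExecutor(max_workers=2)
    def work(self, n):
        # B invokes A asynchronously without knowing about A's
        # executor; the two pools never interact, so no deadlock.
        a = A()
        return self._executor.submit(a.work, n).result()

print(B().work(3))  # 9
```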
> It seems therefore to me that either the proposal does not define
> its scope/motivation very well, or it is not well-equipped to
> address the problem it's setting out to solve. If it's meant to be
> something less ambitious -- more like a recipe or example -- it
> should properly motivate that scope. If it's intended to be a
> robust tool for composing different pieces of code, OTOH, it should
> absolutely address the issue of writing composable code... since,
> that seems to be what it says the purpose of the API is. (I.e.,
> composing code to use a common waiting loop.)
My original motivation when designing this module was having to deal
with a lot of code that looks like this:
def get_some_user_info(user):
    x = make_ldap_call1(user)
    y = make_ldap_call2(user)
    z = [make_db_call(user, i) for i in something]
    # Do some processing with x, y, z and return a result
Doing these operations serially is too slow. So how do I parallelize
them? Using the threading module is the obvious choice, but having to
create my own work/result queue every time I encounter this pattern is
annoying. The futures module lets you write this as:
def get_some_user_info(user):
    with ThreadPoolExecutor(max_threads=10) as executor:
        x_future = executor.submit(make_ldap_call1, user)
        y_future = executor.submit(make_ldap_call2, user)
        z_futures = [executor.submit(make_db_call, user, i)
                     for i in something]
        finished, _ = wait([x_future, y_future] + z_futures,
                           return_when=FIRST_EXCEPTION)
        for f in finished:
            if f.exception():
                raise f.exception()
        x = x_future.result()
        y = y_future.result()
        z = [f.result() for f in z_futures]
        # Do some processing with x, y, z and return a result
> And, existing Python async APIs (such as Twisted's Deferreds)
> actually *address* this issue of composition; the PEP does not.
> Hence my comments about not looking at existing implementations for
> API and implementation guidance. (With respect to what the API
> needs, and how it needs to do it, not necessarily directly copying
> actual APIs or implementations. Certainly some of the Deferred API
> naming has a rather, um, "twisted" vocabulary.)
Using Twisted (or any other asynchronous I/O framework) forces you to
rewrite your I/O code. Futures do not.
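That is the key contrast: existing blocking code runs unchanged inside an executor, with no callback-style restructuring. A minimal sketch (blocking_fetch is a made-up stand-in for any existing blocking LDAP/DB/HTTP call):

```python
from concurrent.futures import ThreadPoolExecutor

def blocking_fetch(name):
    # Stands in for an existing blocking I/O call; it needs no
    # changes to be driven by an executor.
    return "data for %s" % name

with ThreadPoolExecutor(max_workers=4) as executor:
    future = executor.submit(blocking_fetch, "alice")
    result = future.result()

print(result)  # data for alice
```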
Cheers,
Brian