[stdlib-sig] futures - a new package for asynchronous execution

Mon Feb 22 04:37:23 CET 2010

Where's the current version of the PEP?

On Sun, Feb 21, 2010 at 1:47 AM, Brian Quinlan <brian at sweetapp.com> wrote:
>
> On 21 Feb 2010, at 14:41, Jeffrey Yasskin wrote:
>
>> Several comments:
>>
>> * I see you using the Executors as context managers, but no mention in
>> the specification about what that does.
>
> I can't see such documentation for built-in Python objects. To be
> symmetrical with the built-in file object, i've documented the context
> manager behavior as part of the Executor.shutdown method.

For locks, it has its own section:
http://docs.python.org/library/threading.html#using-locks-conditions-and-semaphores-in-the-with-statement
But I don't care too much about the formatting as long as the PEP
specifies it clearly.

>> You need to specify it. (Your
>> current implementation doesn't wait in __exit__, which I think is the
>> opposite of what you agreed with Antoine, but you can fix that after
>> we get general agreement on the interface.)
>
> Fixed.
>
>> * I'd like users to be able to write Executors besides the simple
>> ThreadPoolExecutor and ProcessPoolExecutor you already have. To enable
>> that, could you document what the subclassing interface for Executor
>> looks like? that is, what code do user-written Executors need to
>> include?
>
> I can do that.
>
>> I don't think it should include direct access to
>> future._state like ThreadPoolExecutor uses, if at all possible.
>
> Would it be reasonable to make Future an ABC, make a _Future that subclasses
> it for internal usage and let other Executor subclasses define their own
> Futures.

What interface are you proposing for the Future ABC? It'll need to
support wait() and as_completed() from non-library Futures. I wouldn't
mind making the type just a duck-type (it probably wouldn't even need
an ABC), although I'd like to give people trying to implement their
own Executors as much help as possible. I'd assumed that giving Future
some public hooks would be easier than fixing the wait() interface,
but I could be wrong.

>> * Could you specify in what circumstances a pure computational
>> Future-based program may deadlock? (Ideally, that would be "never".)
>> Your current implementation includes two such deadlocks, for which
>> I've attached a test.
>
>> * Do you want to make calling Executor.shutdown(wait=True) from within
>> the same Executor 1) detect the problem and raise an exception, 2)
>> deadlock, 3) unspecified behavior, or 4) wait for all other threads
>> and then let the current one continue?
>
> What about a note saying that using any futures functions or methods from
> inside a scheduled call is likely to lead to deadlock unless care is taken?

Jesse pointed out that one of the first things people try to do when
using concurrency libraries is to try to use them inside themselves.
I've also tried to use a futures library that forbade nested use
('cause I wrote it), and it was a real pain.

It should be easy enough to detect that the caller of
Executor.shutdown is one of the Executor's threads or processes, but I
wouldn't mind making the obviously incorrect "wait for my own
completion" deadlock or throw an exception, and it would make sense to
give Executor implementors their choice of which to do.

>> * This is a nit, but I think that the parameter names for
>> ThreadPoolExecutor and ProcessPoolExecutor should be the same so
>> people can parametrize their code on those constructors. Right now
>> they're "max_threads" and "max_processes", respectively. I might
>> suggest "max_workers".
>
> I'm not sure that I like that. In general consolidating the constructors for
> executors is not going to be possible.

In general, yes, but in this case they're the same, and we should try
to avoid gratuitous differences.

>> * You should document the exception that happens when you try to pass
>> a ProcessPoolExecutor as an argument to a task executing inside
>> another ProcessPoolExecutor, or make it not throw an exception and
>> document that.
>
> The ProcessPoolExecutor limitations are the same as the multiprocessing
> limitations. I can provide a note about that and a link to that module's
> documentation.

And multiprocessing doesn't document that its Pool requires
picklability and isn't picklable itself. Saying that the
ProcessPoolExecutor is equivalent to a multiprocessing.Pool should be
enough for your PEP.

>> * If it's intentional, you should probably document that if one
>> element of a map() times out, there's no way to come back and wait
>> longer to retrieve it or later elements.
>
> That's not obvious?

Maybe.

>> * You still mention run_to_futures, run_to_results, and FutureList,
>> even though they're no longer proposed.
>
> Done.
>
>>
>> * wait() should probably return a named_tuple or an object so we don't
>> have people writing the unreadable "wait(fs)[0]".
>
> Done.
>
>>
>> * Instead of "call finishes" in the description of the return_when
>> parameter, you might describe the behavior in terms of futures
>> becoming done since that's the accessor function you're using.
>
> Done.
>
>
>> * Is RETURN_IMMEDIATELY just a way to categorize futures into done and
>> not? Is that useful over [f for f in fs if f.done()]?
>
> That was an artifact of the previous implementation; removed.
>
>> * After shutdown, is RuntimeError the right exception, or should there
>> be a more specific exception?
>
> RunTimeError is what is raised in similar situations by threading e.g. when
> starting an already started thread.

Ok, works for me.

On Sun, Feb 21, 2010 at 5:49 AM, Brian Quinlan <brian at sweetapp.com> wrote:
> A few extra points.
>
> On 21 Feb 2010, at 14:41, Jeffrey Yasskin wrote:
>>
>> * I'd like users to be able to write Executors besides the simple
>> ThreadPoolExecutor and ProcessPoolExecutor you already have. To enable
>> that, could you document what the subclassing interface for Executor
>> looks like? that is, what code do user-written Executors need to
>> include? I don't think it should include direct access to
>> future._state like ThreadPoolExecutor uses, if at all possible.
>
> One of the difficulties here is:
> 1. i don't want to commit to the internal implementation of Futures

Yep, that's why to avoid requiring them to have direct access to the
internal variables.

> 2. it might be hard to make it clear which methods are public to users and
> which methods are public to executor implementors

One way to do it would be to create another type for implementors and
pass it to the Future constructor.

>> * Could you specify in what circumstances a pure computational
>> Future-based program may deadlock? (Ideally, that would be "never".)
>> Your current implementation includes two such deadlocks, for which
>> I've attached a test.
>
> Thanks for the tests but I wasn't planning on changing this behavior. I
> don't really like the idea of using the calling thread to perform the wait
> because:
> 1. not all executors will be able to implement that behavior

Why not? Thread pools can implement it, and process pools make it
impossible to create cycles, so they also can't deadlock.

> 2. it can only be made to work if no wait time is specified

With a wait time, you have to avoid stealing work, but it's also
guaranteed not to deadlock, so it's fine.