Re: [Python-Dev] [PEP 3148] futures - execute computations asynchronously

At 01:19 AM 3/6/2010, Jeffrey Yasskin wrote:
On Fri, Mar 5, 2010 at 10:11 PM, Phillip J. Eby <pje@telecommunity.com> wrote:
I'm somewhat concerned that, as described, the proposed API ... [creates] yet another alternative (and mutually incompatible) event loop system in the stdlib ...
Futures are a blocking construct; they don't involve an event loop.
And where they block is in a loop, waiting for events (completed promises) coming back from other threads or processes.

The Motivation section of the PEP also stresses avoiding reinvention of such loops, and points to the complication of using more than one at a time as a justification for the mechanism. It seems relevant to at least address why wrapping multiprocessing and multithreading is appropriate, but *not* dealing with any other form of sync/async boundary, *or* composition of futures.

On which subject, I might add, the PEP is silent on whether executors are reentrant to the called code. That is, can I call a piece of code that uses futures, using the futures API? How will the called code know what executor to use? Must I pass it one explicitly? Will that work across threads and processes, without explicit support from the API?

IOW, as far as I can tell from the PEP, it doesn't look like you can compose futures without *global* knowledge of the application... and in and of itself, this seems to negate the PEP's own motivation to prevent duplication of parallel execution handling!

That is, if I use code from module A and module B that both want to invoke tasks asynchronously, and I want to invoke A and B asynchronously, what happens? Based on the design of the API, it appears there is nothing you can do except refactor A and B to take an executor in a parameter, instead of creating their own.

It seems therefore to me that either the proposal does not define its scope/motivation very well, or it is not well-equipped to address the problem it's setting out to solve. If it's meant to be something less ambitious -- more like a recipe or example -- it should properly motivate that scope. If it's intended to be a robust tool for composing different pieces of code, OTOH, it should absolutely address the issue of writing composable code... since, that seems to be what it says the purpose of the API is. (I.e., composing code to use a common waiting loop.)

And, existing Python async APIs (such as Twisted's Deferreds) actually *address* this issue of composition; the PEP does not. Hence my comments about not looking at existing implementations for API and implementation guidance. (With respect to what the API needs, and how it needs to do it, not necessarily directly copying actual APIs or implementations. Certainly some of the Deferred API naming has a rather, um, "twisted" vocabulary.)

On 6 Mar 2010, at 17:50, Phillip J. Eby wrote:
At 01:19 AM 3/6/2010, Jeffrey Yasskin wrote:
On Fri, Mar 5, 2010 at 10:11 PM, Phillip J. Eby <pje@telecommunity.com> wrote:
I'm somewhat concerned that, as described, the proposed API ... [creates] yet another alternative (and mutually incompatible) event loop system in the stdlib ...
Futures are a blocking construct; they don't involve an event loop.
And where they block is in a loop, waiting for events (completed promises) coming back from other threads or processes.
The Motivation section of the PEP also stresses avoiding reinvention of such loops, and points to the complication of using more than one at a time as a justification for the mechanism. It seems relevant to at least address why wrapping multiprocessing and multithreading is appropriate, but *not* dealing with any other form of sync/async boundary, *or* composition of futures.
On which subject, I might add, the PEP is silent on whether executors are reentrant to the called code. That is, can I call a piece of code that uses futures, using the futures API? How will the called code know what executor to use? Must I pass it one explicitly? Will that work across threads and processes, without explicit support from the API?
Executors are reentrant but deadlock is possible. There are two deadlock examples in the PEP.
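For illustration, a small sketch of the kind of reentrant-use deadlock being referred to, written with the eventual concurrent.futures names (the PEP's own two examples differ in detail):

    from concurrent.futures import ThreadPoolExecutor

    executor = ThreadPoolExecutor(max_workers=1)

    def outer():
        # The only worker thread is busy running outer(), so this nested task
        # can never start and result() blocks forever.
        inner = executor.submit(pow, 2, 10)
        return inner.result()

    f = executor.submit(outer)
    # Asking for f.result() (or simply letting the interpreter exit and join
    # its worker threads) now deadlocks.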
IOW, as far as I can tell from the PEP, it doesn't look like you can compose futures without *global* knowledge of the application... and in and of itself, this seems to negate the PEP's own motivation to prevent duplication of parallel execution handling!
That is, if I use code from module A and module B that both want to invoke tasks asynchronously, and I want to invoke A and B asynchronously, what happens? Based on the design of the API, it appears there is nothing you can do except refactor A and B to take an executor in a parameter, instead of creating their own.
A and B could both use their own executor instances. You would need to refactor A and B if you wanted to manage thread and process counts globally.
It seems therefore to me that either the proposal does not define its scope/motivation very well, or it is not well-equipped to address the problem it's setting out to solve. If it's meant to be something less ambitious -- more like a recipe or example -- it should properly motivate that scope. If it's intended to be a robust tool for composing different pieces of code, OTOH, it should absolutely address the issue of writing composable code... since, that seems to be what it says the purpose of the API is. (I.e., composing code to use a common waiting loop.)
My original motivation when designing this module was having to deal with a lot of code that looks like this:

    def get_some_user_info(user):
        x = make_ldap_call1(user)
        y = make_ldap_call2(user)
        z = [make_db_call(user, i) for i in something]
        # Do some processing with x, y, z and return a result

Doing these operations serially is too slow. So how do I parallelize them? Using the threading module is the obvious choice but having to create my own work/result queue every time I encounter this pattern is annoying. The futures module lets you write this as:

    def get_some_user_info(user):
        with ThreadPoolExecutor(max_threads=10) as executor:
            x_future = executor.submit(make_ldap_call1, user)
            y_future = executor.submit(make_ldap_call2, user)
            z_futures = [executor.submit(make_db_call, user, i)
                         for i in something]
            finished, _ = wait([x_future, y_future] + z_futures,
                               return_when=FIRST_EXCEPTION)
            for f in finished:
                if f.exception():
                    raise f.exception()
            x = x_future.result()
            y = y_future.result()
            z = [f.result() for f in z_futures]
            # Do some processing with x, y, z and return a result
And, existing Python async APIs (such as Twisted's Deferreds) actually *address* this issue of composition; the PEP does not. Hence my comments about not looking at existing implementations for API and implementation guidance. (With respect to what the API needs, and how it needs to do it, not necessarily directly copying actual APIs or implementations. Certainly some of the Deferred API naming has a rather, um, "twisted" vocabulary.)
Using twisted (or any other asynchronous I/O framework) forces you to rewrite your I/O code. Futures do not. Cheers, Brian

Brian Quinlan wrote:
IOW, as far as I can tell from the PEP, it doesn't look like you can compose futures without *global* knowledge of the application... and in and of itself, this seems to negate the PEP's own motivation to prevent duplication of parallel execution handling!
That is, if I use code from module A and module B that both want to invoke tasks asynchronously, and I want to invoke A and B asynchronously, what happens? Based on the design of the API, it appears there is nothing you can do except refactor A and B to take an executor in a parameter, instead of creating their own.
A and B could both use their own executor instances. You would need to refactor A and B if you wanted to manage thread and process counts globally.
You may want to consider providing global thread and process executors in the futures module itself. Code which just wants to say "do this in the background" without having to manage the lifecycle of its own executor instance is then free to do so. I've had a lot of experience with a framework that provides this and it is *very* convenient (it's also a good way to avoid deadlocks due to synchronous notification APIs).

On PJE's broader point: async event loops (with non-blocking I/O and messages passed back to the event loop to indicate completion of operations) and relying on threads and processes to farm out tasks (which you will later block on in order to retrieve the results) are completely different programming models. This PEP doesn't change that - it just makes certain aspects of the latter approach easier to handle.

Trying to design an API that can cope with either model strikes me as a fool's errand. They differ at such a fundamental level that I don't see how a hybrid API could be particularly optimal for either approach.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
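For illustration, a rough sketch of the kind of module-level convenience being suggested (hypothetical helper names; nothing like this is specified in the PEP):

    from concurrent.futures import ThreadPoolExecutor

    _background_executor = None  # hypothetical shared, lazily created pool

    def run_in_background(fn, *args, **kwargs):
        # Submit fn to a module-level thread pool and return its future, so
        # callers never manage an executor's lifecycle themselves.
        global _background_executor
        if _background_executor is None:
            _background_executor = ThreadPoolExecutor(max_workers=10)
        return _background_executor.submit(fn, *args, **kwargs)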

On Sat, Mar 6, 2010 at 6:43 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
You may want to consider providing global thread and process executors in the futures module itself. Code which just wants to say "do this in the background" without having to manage the lifecycle of its own executor instance is then free to do so.
+1 -- Daniel Stutzbach, Ph.D. President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>

I have been playing with the feedback branch of this package for py3 and there seems to be a rather serious bug in the Process version. Using the code @ http://dpaste.com/hold/168795/

When I was running in debug mode I found that as soon as

    p = multiprocessing.Process(
            target=_process_worker,
            args=(self._call_queue, self._result_queue, self._shutdown_process_event))

was called (yes, even before p.start() was called) the processes just started launching all by themselves.

I am also wondering why you are launching the process directly instead of using a Pool; since you are always limiting the number of processes, wouldn't it be better to launch the worker processes up front and then just add work items to the queue?

I am also getting an odd error on exit:

    Error in atexit._run_exitfuncs:
    TypeError: print_exception(): Exception expected for value, str found

On Mar 6, 2010, at 4:20 PM, Dj Gilcrease <digitalxero@gmail.com> wrote:
I have been playing with the feedback branch of this package for py3 and there seems to be a rather serious bug in the Process version. Using the code @ http://dpaste.com/hold/168795/
When I was running in debug mode I found that as soon as
    p = multiprocessing.Process(
            target=_process_worker,
            args=(self._call_queue, self._result_queue, self._shutdown_process_event))
was called (yes even before p.start() was called) the processes just started launching all by themselves.
Did you run the provided example code on windows by chance? If so, look at the multiprocessing docs, there are restrictions on windows (see the __main__ note) - not following the guidelines can result in lots of processes spawning.

On Sat, Mar 6, 2010 at 2:58 PM, Jesse Noller <jnoller@gmail.com> wrote:
Did you run the provided example code on windows by chance? If so, look at the multiprocessing docs, there are restrictions on windows (see the __main__ note) - not following the guidelines can result in lots of processes spawning.
Yes, on win7. I might recommend then that the examples in the PEP be restructured to work correctly on windows by including the if __name__ == '__main__': guard.
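For illustration, a minimal sketch of the restructuring being suggested, using the eventual concurrent.futures names (the worker function here is made up, not taken from the PEP's examples):

    from concurrent.futures import ProcessPoolExecutor

    def square(n):  # made-up worker function
        return n * n

    def main():
        with ProcessPoolExecutor(max_workers=2) as executor:
            print(list(executor.map(square, range(10))))

    # On Windows, child processes re-import this module, so executor creation
    # must be guarded or processes will keep spawning.
    if __name__ == '__main__':
        main()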

On 7 Mar 2010, at 03:04, Phillip J. Eby wrote:
At 05:32 AM 3/6/2010, Brian Quinlan wrote:
Using twisted (or any other asynchronous I/O framework) forces you to rewrite your I/O code. Futures do not.
Twisted's "Deferred" API has nothing to do with I/O.
I see, you just mean the API and not the underlying model.

We discussed the Deferred API on the stdlib-sig and I don't think that anyone expressed a preference for it over the one described in the PEP.

Do you have any concrete criticism?

Cheers, Brian

On 02:10 am, brian@sweetapp.com wrote:
On 7 Mar 2010, at 03:04, Phillip J. Eby wrote:
At 05:32 AM 3/6/2010, Brian Quinlan wrote:
Using twisted (or any other asynchronous I/O framework) forces you to rewrite your I/O code. Futures do not.
Twisted's "Deferred" API has nothing to do with I/O.
I see, you just mean the API and not the underlying model.
We discussed the Deferred API on the stdlib-sig and I don't think that anyone expressed a preference for it over the one described in the PEP.
Do you have any concrete criticism?
From reading some of the stdlib-sig archives, it sounds like there is general agreement that Deferreds and Futures can be used to complement each other, and that getting code that is primarily Deferred-based to integrate with Future-based code or vice versa should eventually be possible.

Do I have the right sense of people's feelings?

And relatedly, once Futures are accepted and implemented, are people going to use them as an argument to exclude Deferreds from the stdlib (or be swayed by other people making such arguments)? Hopefully not, given what I read on stdlib-sig, but it doesn't hurt to check...

Jean-Paul

On Sat, Mar 6, 2010 at 9:34 PM, <exarkun@twistedmatrix.com> wrote:
On 02:10 am, brian@sweetapp.com wrote:
On 7 Mar 2010, at 03:04, Phillip J. Eby wrote:
At 05:32 AM 3/6/2010, Brian Quinlan wrote:
Using twisted (or any other asynchronous I/O framework) forces you to rewrite your I/O code. Futures do not.
Twisted's "Deferred" API has nothing to do with I/O.
I see, you just mean the API and not the underlying model.
We discussed the Deferred API on the stdlib-sig and I don't think that anyone expressed a preference for it over the one described in the PEP.
Do you have any concrete criticism?
From reading some of the stdlib-sig archives, it sounds like there is general agreement that Deferreds and Futures can be used to complement each other, and that getting code that is primarily Deferred-based to integrate with Future-based code or vice versa should eventually be possible.
Do I have the right sense of people's feelings?
And relatedly, once Futures are accepted and implemented, are people going to use them as an argument to exclude Deferreds from the stdlib (or be swayed by other people making such arguments)? Hopefully not, given what I read on stdlib-sig, but it doesn't hurt to check...
Jean-Paul
Generally speaking, I don't see futures as an exclusion to Deferreds, or other asynchronous doodads. I just see them as a useful construct on top of threads and processes primarily. So in my mind, no.

jesse

After playing with the API for a while & running into many issues with the examples & tests crashing windows I decided to modify the API a little and fix up the examples so they dont crash windows based computers.

http://code.google.com/p/pythonfutures/issues/detail?id=1

API Change that changes the current Executor to ExecutorBase and adds a new Executor class that is used like

    futures.Executor()  # creates an executor that uses threading and a max_workers = to the number of cpus
    futures.Executor(use='process')  # Creates an executor that uses multiprocessing and a max_workers = to the number of cpus
    futures.Executor(max_workers=5)  # threading again, just specifying the number of workers
    futures.Executor(use='process', max_workers=5)  # back to multiprocessing, but with the max_workers specified
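For illustration, a rough sketch of how such a single front-end might dispatch to the two pool types (hypothetical; the actual patch on the issue above may differ):

    import multiprocessing
    from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor

    def Executor(use='thread', max_workers=None):
        # Hypothetical front-end: choose the pool implementation by keyword,
        # defaulting the worker count to the number of CPUs.
        if max_workers is None:
            max_workers = multiprocessing.cpu_count()
        if use == 'process':
            return ProcessPoolExecutor(max_workers=max_workers)
        return ThreadPoolExecutor(max_workers=max_workers)

    # executor = Executor(use='process', max_workers=5)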

On Sat, Mar 6, 2010 at 10:09 PM, Dj Gilcrease <digitalxero@gmail.com> wrote:
After playing with the API for a while & running into many issues with the examples & tests crashing windows I decided to modify the API a little and fix up the examples so they dont crash windows based computers.
http://code.google.com/p/pythonfutures/issues/detail?id=1
API Change that changes the current Executor to ExecutorBase and adds a new Executor class that is used like
futures.Executor() # creates an executor that uses threading and a max_workers = to the number of cpus
futures.Executor(use='process') # Creates an executor that uses multiprocessing and a max_workers = to the number of cpus
futures.Executor(max_workers=5) # threading again, just specifying the number of workers
futures.Executor(use='process', max_workers=5) # back to multiprocessing, but with the max_workers specified
Making the tests and examples happy on windows is fine; but some explanation is needed for the API changes.

On Sun, Mar 7, 2010 at 6:50 AM, Jesse Noller <jnoller@gmail.com> wrote:
Making the tests and examples happy on windows is fine; but some explanation is needed for the API changes.
My primary motivation behind the API change is so there is just a single public Executor class that you tell which system to use, instead of two separate classes. The use case I was thinking about is when a user is unsure which system (threads or processes) they want to use, so they just build the system with the defaults (which is threads); it is then a little easier to switch to processes later, since instead of having to change imports and all instances of the class you just change the use keyword to switch between systems.

Dj Gilcrease wrote:
On Sun, Mar 7, 2010 at 6:50 AM, Jesse Noller <jnoller@gmail.com> wrote:
Making the tests and examples happy on windows is fine; but some explanation is needed for the API changes.
My primary motivation behind the API change is so there is just a single public Executor class that you tell which system to use, instead of two separate classes. The use case I was thinking about is when a user is unsure which system (threads or processes) they want to use, so they just build the system with the defaults (which is threads); it is then a little easier to switch to processes later, since instead of having to change imports and all instances of the class you just change the use keyword to switch between systems.
Wouldn't a factory function serve that purpose just as well? Or even just "from concurrent.futures import ProcessPoolExecutor as TaskExecutor".

That last form has the virtue that you can retrieve your executor from anywhere rather than being limited to the two provided by the concurrent.futures model.

I think the string based approach actually unduly constrains the API despite superficially appearing to make it more flexible.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

On Mon, Mar 8, 2010 at 4:25 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Wouldn't a factory function serve that purpose just as well? Or even just "from concurrent.futures import ProcessPoolExecutor as TaskExecutor".
That last form has the virtue that you can retrieve your executor from anywhere rather than being limited to the two provided by the concurrent.futures model.
I think the string based approach actually unduly constrains the API despite superficially appearing to make it more flexible.
mm, you are correct. I went with the string approach because I was experimenting with 3 additional executor types and wanted to be able to switch between or intermix them without having to change imports, and didn't feel like writing a register class with a factory method.

A style I have used in my own code in the past is a Singleton class with register and create methods, where register takes a name (string) and the class, and the create method takes the name and *args, **kwargs and acts as a factory. Would this style be better, or would it be better to just leave it with the two executor classes? I tend to dislike multiple classes for what is essentially a Strategy of a concept, and factories are something I tend to forget about until well after my initial idea has formed into a proof of concept.

On Mon, Mar 8, 2010 at 12:04 PM, Dj Gilcrease <digitalxero@gmail.com> wrote:
A style I have used in my own code in the past is a Singleton class with register and create methods, where the register takes a name(string) and the class and the create method takes the name and *args, **kwargs and acts as a factory.
So I decided to play with this design a little and since I made it a singleton I decided to place all the thread/process tracking and exit handle code in it instead of having the odd semi-global scoped _shutdown, _thread_references, _remove_dead_thread_references and _python_exit objects floating around in each executor file, seems to work well. The API would be

    from concurrent.futures import executors

    executor = executors.create(NAME, *args, **kwargs)  # NAME is 'process' or 'thread' by default

To create your own executor you create your executor class and add the following at the end:

    from concurrent.futures import executors, ExecutorBase

    class MyExecutor(ExecutorBase):
        ...

    executors.register(NAME, MyExecutor)

It checks to see if your executor is a subclass of ExecutorBase, but only issues a UserWarning if it is not, since you should know what methods are required to be an executor-like object. So if you are not subclassing ExecutorBase you should suppress the UserWarning before you register your class and un-suppress it after.

Some helper methods/properties on the executors singleton:

add_joinable_ref - This replaces the _thread_references.add system of tracking threads, and it allows for adding processes as well, hence the joinable_ref name. It does check to make sure the ref being passed has a join method, then creates a weakref to it and adds it to a set. For every thread or process your executor creates you should call this so it will be tracked properly.

cleanup_joinable_refs - This replaces the _remove_dead_thread_references that had to be written for each executor individually. This should be called periodically; currently it is only called when you create a new executor since it is a blocking method (it uses a thread lock to make sure the set of references does not change while it is discarding old ones).

shutdown - is a read-only property and replaces the _shutdown global var that had to be created for each executor individually; it is set in the executors destructor.

__del__ - replaces the _python_exit method that had to be written for each executor individually.

If this API change isn't accepted it's no big deal, since I am only changing it due to personal pet peeves, and the only real issue I had with the package was scoping, which has already been addressed by adding it to a concurrent package.
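For illustration, a condensed sketch of the register/create idea described above (hypothetical names; the actual patch on the issue tracker may differ):

    class ExecutorRegistry:
        """Minimal register/create registry for executor classes (sketch)."""
        def __init__(self):
            self._classes = {}

        def register(self, name, cls):
            self._classes[name] = cls

        def create(self, name, *args, **kwargs):
            return self._classes[name](*args, **kwargs)

    executors = ExecutorRegistry()

    # Following the message above, usage would look like:
    #     executors.register('thread', ThreadPoolExecutor)
    #     executors.register('process', ProcessPoolExecutor)
    #     executor = executors.create('thread', max_workers=5)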

On 08:56 pm, digitalxero@gmail.com wrote:
On Mon, Mar 8, 2010 at 12:04 PM, Dj Gilcrease <digitalxero@gmail.com> wrote:
A style I have used in my own code in the past is a Singleton class with register and create methods, where the register takes a name(string) and the class and the create method takes the name and *args, **kwargs and acts as a factory.
So I decided to play with this design a little and since I made it a singleton I decided to place all the thread/process tracking and exit handle code in it instead of having the odd semi-global scoped _shutdown, _thread_references, _remove_dead_thread_references and _python_exit objects floating around in each executor file, seems to work well. The API would be
from concurrent.futures import executors
executor = executors.create(NAME, *args, **kwargs) # NAME is 'process' or 'thread' by default
To create your own executor you create your executor class and add the following at the end
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state. Jean-Paul

On Mon, 08 Mar 2010 21:11:45 -0000, exarkun@twistedmatrix.com wrote:
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state.
+1.

On Mon, Mar 8, 2010 at 2:11 PM, <exarkun@twistedmatrix.com> wrote:
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state.
Ok the new patch is submitted @ http://code.google.com/p/pythonfutures/issues/detail?id=1

*note: there are 2 tests that fail and 1 test that deadlocks on windows even without this patch; the deadlock test I am skipping in the patch, and the two that fail do so for a reason that does not make sense to me.

On 10 Mar 2010, at 08:32, Dj Gilcrease wrote:
On Mon, Mar 8, 2010 at 2:11 PM, <exarkun@twistedmatrix.com> wrote:
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state.
Ok the new patch is submitted @ http://code.google.com/p/pythonfutures/issues/detail?id=1
Cool, thanks.
*note: there are 2 tests that fail and 1 test that deadlocks on windows even without this patch; the deadlock test I am skipping in the patch, and the two that fail do so for a reason that does not make sense to me.
I'll investigate but I don't have convenient access to a windows machine. Cheers, Brian

On 9 Mar 2010, at 08:11, exarkun@twistedmatrix.com wrote:
On 08:56 pm, digitalxero@gmail.com wrote:
On Mon, Mar 8, 2010 at 12:04 PM, Dj Gilcrease <digitalxero@gmail.com> wrote:
A style I have used in my own code in the past is a Singleton class with register and create methods, where the register takes a name(string) and the class and the create method takes the name and *args, **kwargs and acts as a factory.
So I decided to play with this design a little and since I made it a singleton I decided to place all the thread/process tracking and exit handle code in it instead of having the odd semi-global scoped _shutdown, _thread_references, _remove_dead_thread_references and _python_exit objects floating around in each executor file, seems to work well. The API would be
from concurrent.futures import executors
executor = executors.create(NAME, *args, **kwargs) # NAME is 'process' or 'thread' by default
To create your own executor you create your executor class and add the following at the end
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state.
I'm not sure what you mean, could you clarify? Cheers, Brian

Brian Quinlan wrote:
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state.
I'm not sure what you mean, could you clarify?
Assuming your question refers to the second sentence, Jean-Paul is referring to a trick of the CPython interpreter when it terminates. To maximise the chances of objects being deleted properly rather than just dumped from memory when the process exits, module dictionaries are filled with None values before the interpreter shuts down.

This can cause weirdness (usually intermittent name errors during shutdown) when __del__ methods directly or indirectly reference module globals.

One of the easiest ways to avoid that is to put the state on a singleton object, then give the affected classes a reference to that object.

Cheers, Nick.

P.S. This problem is actually the reason we don't have a context manager for temporary directories yet. Something that should have been simple became a twisty journey down the rabbit hole: http://bugs.python.org/issue5178

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------
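For illustration, a tiny sketch of the workaround described above: keep the needed state on an object that instances reference directly, so __del__ never has to look up module globals during interpreter shutdown (hypothetical names):

    class _SharedState:
        """Singleton holding the state needed during interpreter shutdown."""
        def __init__(self):
            self.shutting_down = False

    _shared = _SharedState()

    class Worker:
        def __init__(self):
            # Keep a direct reference so __del__ never consults module
            # globals, which may already have been set to None at shutdown.
            self._shared = _shared

        def __del__(self):
            if not self._shared.shutting_down:
                pass  # normal cleanup would go here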

On 10 Mar 2010, at 23:32, Nick Coghlan wrote:
Brian Quinlan wrote:
Getting rid of the process-global state like this simplifies testing (both testing of the executors themselves and of application code which uses them). It also eliminates the unpleasant interpreter shutdown/module globals interactions that have plagued a number of stdlib systems that keep global state.
I'm not sure what you mean, could you clarify?
Assuming your question refers to the second sentence, Jean-Paul is referring to a trick of the CPython interpreter when it terminates. To maximise the chances of objects being deleted properly rather than just dumped from memory when the process exits, module dictionaries are filled with None values before the interpreter shuts down.
This can cause weirdness (usually intermittent name errors during shutdown) when __del__ methods directly or indirectly reference module globals.
Ah. I'm familiar with this problem. My approach was to install an exit handler that ensures that all pending futures are complete and all threads and processes exit before allowing the interpreter to exit. Cheers, Brian
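For illustration, a minimal sketch of that kind of exit handler (hypothetical; the reference implementation tracks weak references to its worker threads and does more bookkeeping):

    import atexit

    _worker_threads = set()  # illustrative stand-in for the module's thread tracking

    def _python_exit():
        # Join outstanding worker threads so pending work finishes before the
        # interpreter starts tearing down module globals.
        for t in list(_worker_threads):
            t.join()

    atexit.register(_python_exit)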
One of the easiest ways to avoid that is to put the state on a singleton object, then give the affected classes a reference to that object.
Cheers, Nick.
P.S. This problem is actually the reason we don't have a context manager for temporary directories yet. Something that should have been simple became a twisty journey down the rabbit hole: http://bugs.python.org/issue5178
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

Dj Gilcrease wrote:
executor = executors.create(NAME, *args, **kwargs) # NAME is 'process' or 'thread' by default
    from concurrent.futures import executors, ExecutorBase

    class MyExecutor(ExecutorBase):
        ...

    executors.register(NAME, MyExecutor)
I don't understand the reason for using a registration system rather than just importing names from a module. You mentioned wanting to globally change the executor class being used by a program without having to make changes throughout. Registering a different class under the same name would be one way to do that, but you could achieve the same thing just by assigning to a name in a module. In other words, instead of inventing your own mechanism for managing a namespace, just use a module as your namespace. -- Greg
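For illustration, a tiny sketch of the "use a module as your namespace" suggestion (the module name here is made up):

    # taskpools.py -- a plain module acting as the shared namespace
    from concurrent.futures import ThreadPoolExecutor

    default_executor_class = ThreadPoolExecutor

    # Client code imports the name rather than a specific class:
    #     import taskpools
    #     executor = taskpools.default_executor_class(max_workers=4)
    #
    # A program that wants processes everywhere rebinds the name once at startup:
    #     import taskpools
    #     from concurrent.futures import ProcessPoolExecutor
    #     taskpools.default_executor_class = ProcessPoolExecutor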

At 01:10 PM 3/7/2010 +1100, Brian Quinlan wrote:
On 7 Mar 2010, at 03:04, Phillip J. Eby wrote:
At 05:32 AM 3/6/2010, Brian Quinlan wrote:
Using twisted (or any other asynchronous I/O framework) forces you to rewrite your I/O code. Futures do not.
Twisted's "Deferred" API has nothing to do with I/O.
I see, you just mean the API and not the underlying model.
We discussed the Deferred API on the stdlib-sig and I don't think that anyone expressed a preference for it over the one described in the PEP.
Do you have any concrete criticism?
Of the PEP, yes, absolutely, and I've already stated much of it. My quibbles are with the PEP *itself*, not so much the API or implementation. I think that said API and implementation is fine, but FAR too narrowly scoped to claim to be "futures" or "execute computations asynchronously", as the PEP calls it. It's really just a nice task queuing system.

Now, if the PEP were *scoped* as such, i.e., "hey, let's just have a nice multithread/multiprocess task queuing implementation in the stdlib", I would be +1. It's a handy utility to have.

But I think that the scope given by the PEP appears overly ambitious compared to what is actually being delivered; this seems less of a "futures API" and more like a couple of utility functions for waiting on threads and processes. To rise to the level of an API, it seems to me that it would need to address interop with coroutines and async frameworks, where the idea of "futures" seems much more relevant than simple synchronous-but-parallel scripts. (It should also have better tools for working with futures asynchronously, because, hey, it says right there in the title, "execute computations asynchronously".)

Anyway, I'd like to see the answers to (at *least*) the following issues fleshed out in the PEP, if you want it to really be a "futures API", vs. "nice task queue in the stdlib":

* Compare/contrast alternatives now available
* Address the issue of competing event loops and sharing/passing executors among code
* Either offer a way for executed code to re-enter its own executor (e.g. via an optional parameter), or explain why this was considered and rejected
* Address interoperability with coroutines and async frameworks, or clearly explain why such is out of scope

(Personally, I think it would be better to just drop the ambitious title and scope, and go for the "nice task queue" scope. I imagine, too, that in that case Jean-Paul wouldn't need to worry about it being raised as a future objection to Deferreds or some such getting into the stdlib.)

P.J. Eby wrote:
(Personally, I think it would be better to just drop the ambitious title and scope, and go for the "nice task queue" scope. I imagine, too, that in that case Jean-Paul wouldn't need to worry about it being raised as a future objection to Deferreds or some such getting into the stdlib.)
This may be a terminology thing - to me futures *are* just a nice way to handle farming tasks out to worker threads or processes. You seem to see them as something more comprehensive than that.

I agree the PEP should just target what the current implementation provides and put whatever scope limitations are needed in the preamble text to make that clear.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia ---------------------------------------------------------------

At 02:49 PM 3/7/2010 +1000, Nick Coghlan wrote:
P.J. Eby wrote:
(Personally, I think it would be better to just drop the ambitious title and scope, and go for the "nice task queue" scope. I imagine, too, that in that case Jean-Paul wouldn't need to worry about it being raised as a future objection to Deferreds or some such getting into the stdlib.)
This may be a terminology thing - to me futures *are* just a nice way to handle farming tasks out to worker threads or processes. You seem to see them as something more comprehensive than that.
Actual futures are, yes. Specifically, futures are a mechanism for asynchronous computation, whereas the PEP seems to be all about synchronously managing parallel tasks. That's a huge difference. Technically, the things in the PEP (and by extension, Java's futures) match the letter of the definition of a future, but not (IMO) the spirit. There's no clean way to compose them, and at base they're more about parallelism than asynchrony.
I agree the PEP should just target what the current implementation provides and put whatever scope limitations are needed in the preamble text to make that clear.
Yep. I'm just saying "parallel task queueing" is a much better description of what the implementation is/does, and would suggest renaming Future -> Task and Executor -> WorkerPool or some such. These names would be *much* clearer to people who've never heard of futures, as well as more appropriate to the actual scope of what this does.

On Sun, Mar 7, 2010 at 7:48 AM, P.J. Eby <pje@telecommunity.com> wrote:
At 02:49 PM 3/7/2010 +1000, Nick Coghlan wrote:
P.J. Eby wrote:
(Personally, I think it would be better to just drop the ambitious title and scope, and go for the "nice task queue" scope. I imagine, too, that in that case Jean-Paul wouldn't need to worry about it being raised as a future objection to Deferreds or some such getting into the stdlib.)
This may be a terminology thing - to me futures *are* just a nice way to handle farming tasks out to worker threads or processes. You seem to see them as something more comprehensive than that.
Actual futures are, yes. Specifically, futures are a mechanism for asynchronous computation, whereas the PEP seems to be all about synchronously managing parallel tasks. That's a huge difference.
Technically, the things in the PEP (and by extension, Java's futures) match the letter of the definition of a future, but not (IMO) the spirit. There's no clean way to compose them, and at base they're more about parallelism than asynchrony.
Do you have an example of a language or library that uses the term "future" to refer to what you're talking about? I'm curious to see what it looks like.

On Sun, 07 Mar 2010 10:48:09 -0500, "P.J. Eby" <pje@telecommunity.com> wrote:
At 02:49 PM 3/7/2010 +1000, Nick Coghlan wrote:
I agree the PEP should just target what the current implementation provides and put whatever scope limitations are needed in the preamble text to make that clear.
Yep. I'm just saying "parallel task queueing" is a much better description of what the implementation is/does, and would suggest renaming Future -> Task and Executor -> WorkerPool or some such. These names would be *much* clearer to people who've never heard of futures, as well as more appropriate to the actual scope of what this does.
For what it's worth: I don't have any particular knowledge in this area. I did loosely follow the stdlib-sig discussion. I wasn't really sure exactly what the module was about or what a 'future' was, or why I would want to use one. I did get that it was about parallel execution of tasks, but it seemed like there had to be more to it than that.

Hearing it called a 'worker pool' makes a lightbulb go off and I can now understand why this would be a useful facility to have in the standard library.

-- R. David Murray www.bitdance.com
participants (12)
- Antoine Pitrou
- Brian Quinlan
- Daniel Stutzbach
- Dj Gilcrease
- exarkun@twistedmatrix.com
- Greg Ewing
- Jeffrey Yasskin
- Jesse Noller
- Nick Coghlan
- P.J. Eby
- Phillip J. Eby
- R. David Murray