Processing module inclusion into the stdlib proposal

I have started work on a PEP proposing the inclusion of the PyProcessing module (http://pypi.python.org/pypi/processing/ and http://developer.berlios.de/projects/pyprocessing) in the stdlib in an upcoming release.
The pyprocessing module "mostly mimics" the threading module API to provide a "drop-in" process-based approach to concurrency, allowing Python applications to utilize multiple cores. For example:
    from threading import Thread

    class threads_object(Thread):
        def run(self):
            function_to_run()
becomes:
    from processing import Process

    class process_object(Process):
        def run(self):
            function_to_run()
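Either version is then driven the same way; as a minimal sketch (reusing the hypothetical function_to_run from above, and noting that process-based code needs the __main__ guard to work on Windows):

    if __name__ == '__main__':
        obj = process_object()
        obj.start()    # spawns a new process rather than a new thread
        obj.join()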
Currently, the module runs on Unix/Linux/OS X and Windows. It supports the following features:
* Objects can be transferred between processes using pipes or multi-producer/multi-consumer queues.
* Objects can be shared between processes using a server process or (for simple data) shared memory.
* Equivalents of all the synchronization primitives in ``threading`` are available.
* A ``Pool`` class makes it easy to submit tasks to a pool of worker processes (see the sketch after this list).
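To illustrate the queue and pool features, here is a minimal sketch; it assumes the documented ``Process``/``Queue``/``Pool`` names, and the worker and square functions are hypothetical stand-ins:

    from processing import Process, Queue, Pool

    def worker(q):
        # runs in a child process; sends a result back through the queue
        q.put('hello from a child process')

    def square(x):
        return x * x

    if __name__ == '__main__':
        # transfer an object between processes using a queue
        q = Queue()
        p = Process(target=worker, args=(q,))
        p.start()
        print q.get()
        p.join()

        # submit tasks to a pool of four worker processes
        pool = Pool(processes=4)
        print pool.map(square, range(10))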
In addition to the local, "just like threading" concurrency model, the processing module also allows users to share data and processes across a cluster of machines via its server Manager and Proxy objects (data is transferred via pickles, and secured data transfer is supported). I believe that prior to, or during, inclusion additional tests will be desired to enhance coverage, but the tests already provided exercise and showcase the module well.
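As a rough sketch of the shared-data feature, assuming the ``Manager()`` factory described in the module's documentation (the record function is a hypothetical example):

    from processing import Process, Manager

    def record(d, key, value):
        d[key] = value    # mutates the dict held by the server process

    if __name__ == '__main__':
        manager = Manager()       # starts a server process to hold shared objects
        shared = manager.dict()   # a proxy to a dict living in that server
        p = Process(target=record, args=(shared, 'answer', 42))
        p.start()
        p.join()
        print shared['answer']    # prints 42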
I have spoken with the author, Richard Oudkerk, about his willingness to maintain the processing module according to normal stdlib requirements, and he is more than willing to do so.
I believe inclusion in the standard library will be very beneficial both for people building larger-scale applications and for those looking for a way to side-step the current threading/GIL implementation. The module easily allows users to exploit their N-core machines in a fashion they are already familiar with.
I would suggest that the module be placed at the top level next to the threading module; however, there is also the thought that both this module and the threading module should be moved into a concurrent.* namespace (i.e. concurrent.threading, concurrent.processing) to allow for additional library inclusions at a later date.
IMHO, this is simply a "first step" in the evolution of the Python stdlib to support these sorts of things, but I also believe it is an excellent first step.
Please feel free to ask questions, raise comments, etc. The more feedback, the better the PEP will be!
-jesse

On Tuesday 18 March 2008 at 11:44 -0400, Jesse Noller wrote:
> I have started work on a PEP proposing the inclusion of the PyProcessing module (http://pypi.python.org/pypi/processing/ and http://developer.berlios.de/projects/pyprocessing) in the stdlib in an upcoming release.
I've never used the processing module, but I think adding a high-level, cross-platform, process-based concurrency mechanism to the stdlib is a great idea. The API also seems very nice to me.
The only small remark I can make just from reading the docs is that "Manager" is a very poor class name; couldn't it be something more descriptive? :-)
Regards,
Antoine.