[stdlib-sig] Processing module inclusion into the stdlib proposal
Jesse Noller
jnoller at gmail.com
Tue Mar 18 16:44:10 CET 2008
I have started work on a PEP proposing the inclusion of the PyProcessing
module (http://pypi.python.org/pypi/processing/ and
http://developer.berlios.de/projects/pyprocessing) in the stdlib in an
upcoming release.
The pyprocessing module "mostly mimics" the threading module API to
provide a "drop-in" process-based approach to concurrency, allowing
Python applications to utilize multiple cores. For example:
from threading import Thread

class threads_object(Thread):
    def run(self):
        function_to_run()

becomes:

from processing import Process

class process_object(Process):
    def run(self):
        function_to_run()
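To make the "drop-in" claim concrete, here is a small, self-contained sketch of the same pattern using only the stdlib threading module (the function body and worker count are illustrative, not from the post); per the above, installing the processing package and swapping Thread for Process would give the process-based version:

```python
from threading import Thread

results = []

def function_to_run():
    # Illustrative workload; appending under the GIL is safe here.
    results.append(sum(range(10)))

class threads_object(Thread):
    def run(self):
        function_to_run()

# Start four workers and wait for them to finish -- the same
# start()/join() lifecycle the processing module's Process mirrors.
workers = [threads_object() for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sum(results))  # 4 workers x 45 each -> 180
```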
Currently, the module runs on Unix/Linux/OSX and Windows. It supports
the following features:
* Objects can be transferred between processes using pipes or
multi-producer/multi-consumer queues.
* Objects can be shared between processes using a server process or
(for simple data) shared memory.
* Equivalents of all the synchronization primitives in ``threading``
are available.
* A ``Pool`` class makes it easy to submit tasks to a pool of worker
processes.
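As a sketch of the multi-producer/multi-consumer queue pattern the bullets describe, here is the stdlib threading/Queue equivalent (worker function, sentinel shutdown, and item counts are my own illustration); the processing module exposes the same Queue interface for transferring objects between processes:

```python
from threading import Thread
from queue import Queue  # the "Queue" module in Python 2

def worker(tasks, done):
    # Pull work items until a None sentinel arrives.
    while True:
        item = tasks.get()
        if item is None:
            break
        done.put(item * item)

tasks, done = Queue(), Queue()
threads = [Thread(target=worker, args=(tasks, done)) for _ in range(2)]
for t in threads:
    t.start()

for n in range(5):
    tasks.put(n)
for _ in threads:
    tasks.put(None)  # one shutdown sentinel per worker
for t in threads:
    t.join()

results = sorted(done.get() for _ in range(5))
print(results)  # squares of 0..4 -> [0, 1, 4, 9, 16]
```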
In addition to the local, "just like threading" concurrency model, the
processing module also lets users share data and distribute work across
a cluster of machines via its server Manager and Proxy objects (data is
transferred via pickles, and secured data transfer is supported).
I believe that prior to (or during) inclusion, additional tests will be
wanted to improve coverage, but the tests already provided exercise and
showcase the module well.
I have spoken to the author, Richard Oudkerk, about his willingness to
maintain the processing module according to normal stdlib requirements,
and he is more than willing to do so.
I believe inclusion in the standard library will be very beneficial both
for people looking to build larger-scale applications and for those
seeking a way to side-step the current threading/GIL implementation.
This module lets users easily exploit their "$N core" machines, in a
fashion they are already familiar with.
I would suggest placing the module at the top level, next to the
threading module; however, there is also the thought that both this
module and the threading module should be moved into a concurrent.*
namespace (i.e. concurrent.threading, concurrent.processing) to allow
for additional library inclusions at a later date.
IMHO, this is simply a "first step" in the evolution of the Python
stdlib to support these sorts of things - but I also believe it is an
excellent first step.
Please feel free to ask questions, offer comments, etc. The more
feedback, the better the PEP will be!
-jesse