Multiprocessing, shared memory vs. pickled copies

Mon Apr 4 16:20:16 EDT 2011

Hi folks,

I'm developing some custom neural network code.  I'm using Python 2.6,
Numpy 1.5, and Ubuntu Linux 10.10.  I have an AMD 1090T six-core CPU,
and I want to take full advantage of it.  I love to hear my CPU fan
running, and watch my results come back faster.

When I'm training a neural network, I pass two numpy.ndarray objects
to a function called evaluate.  One array contains the weights for the
neural network, and the other array contains the input data.  The
evaluate function returns an array of output data.

I have been playing with multiprocessing for a while now, and I have
some familiarity with Pool.  Apparently, arguments passed to a Pool
subprocess must be picklable.  Pickling is still a pretty
vague process to me, but I can see that you have to write custom
__reduce__ and __setstate__ methods for your objects, at least for an
ndarray subclass that carries extra attributes.  An example of
code which creates a pickle-friendly ndarray subclass is here:

http://www.mail-archive.com/numpy-discussion@scipy.org/msg02446.html
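
To make the pickling question concrete, here is a minimal sketch of
what I understand the round trip to look like.  It uses only the
standard library; the Network class, its name attribute, and the
round_trip helper are hypothetical stand-ins (a plain list replaces
the ndarray), not the code from the link above.

```python
import pickle


class Network(object):
    """Hypothetical stand-in for a network: weights plus metadata."""

    def __init__(self, weights, name=None):
        self.weights = list(weights)
        self.name = name

    def __reduce__(self):
        # Tell pickle how to rebuild the object:
        # (callable, args for the callable, state for __setstate__).
        return (Network, (self.weights,), {"name": self.name})

    def __setstate__(self, state):
        # Called on the freshly rebuilt object to restore the metadata.
        self.name = state["name"]


def round_trip(obj):
    # Roughly what happens to a Pool argument: it is serialized in the
    # parent and rebuilt from the bytes in the worker.
    return pickle.loads(pickle.dumps(obj, pickle.HIGHEST_PROTOCOL))


if __name__ == "__main__":
    net = Network([0.1, 0.2, 0.3], name="demo")
    clone = round_trip(net)
    clone.weights[0] = 99.0  # changes the copy only
    print("{0} {1} {2}".format(clone.name, net.weights[0], clone.weights[0]))
```

If this is right, every argument I hand to a Pool worker is rebuilt
as an independent copy, which is the cost I would like to avoid for
large read-only arrays.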

Now, I don't know that I actually HAVE to pass my neural network and
input data as copies -- they're both READ-ONLY objects for the
duration of an evaluate function (which can go on for quite a while).
So, I have also started to investigate shared-memory approaches.  I
don't know how a shared-memory object is referenced by a subprocess
yet, but presumably you pass a reference to the object, rather than
the whole object.  Also, it appears that subprocesses acquire a
temporary lock on a shared-memory object, and thus one process may
well spend time waiting for another (might individual CPU caches
sidestep this problem?).  Anyway, an implementation of a shared-memory
ndarray is here:

https://bitbucket.org/cleemesser/numpy-sharedmem/src/3fa526d11578/shmarray.py
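
For comparison, here is a minimal sketch of the same idea using only
the standard library's multiprocessing module rather than shmarray.
It assumes a newer Python (3.4+, for get_context) and a Unix system
where the "fork" start method is available; evaluate here is a
hypothetical dot product, not my real function.  As far as I can
tell, multiprocessing.Array wraps its buffer in a lock by default,
while RawArray has no lock at all, which seems the better fit for
read-only data.

```python
import multiprocessing as mp


def evaluate(weights, inputs, out):
    # Hypothetical stand-in for evaluate(): a dot product written into
    # a shared one-element output buffer. Runs in the child process.
    out[0] = sum(w * x for w, x in zip(weights, inputs))


def run_demo():
    # Assumption: Unix "fork" start method, so the child inherits the
    # shared buffers instead of receiving pickled copies.
    ctx = mp.get_context("fork")

    # 'd' means C double. RawArray allocates shared memory with no lock.
    weights = ctx.RawArray("d", [0.5, -1.0, 2.0])
    inputs = ctx.RawArray("d", [1.0, 2.0, 3.0])
    out = ctx.RawArray("d", 1)

    # The arrays are passed as handles to the Process, not via a Pool task.
    worker = ctx.Process(target=evaluate, args=(weights, inputs, out))
    worker.start()
    worker.join()
    return out[0]


if __name__ == "__main__":
    print(run_demo())
```

The parent reads the result straight out of the shared output
buffer, so nothing array-sized is pickled in either direction.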

I've added a few lines to this code which allow subclassing the
shared memory array, which I need (because my neural net objects are
more than just the array, they also contain metadata).  But I've run
into some trouble doing the actual sharing part.  The shmarray class
CANNOT be pickled.  I think that my understanding of multiprocessing
needs to evolve beyond the use of Pool, but I'm not sure yet.  This
post suggests as much:

http://mail.scipy.org/pipermail/scipy-user/2009-February/019696.html
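
One pattern that might let me keep Pool is to hand the shared
buffers to each worker once, at startup, and send only small
picklable values (such as a row index) as tasks.  The sketch below
assumes Python 3.4+ and the Unix "fork" start method; init_worker,
evaluate_row, and the flat row-major layout are all hypothetical,
and standard-library RawArray stands in for shmarray.

```python
import multiprocessing as mp

# Per-process storage, filled in once by the initializer in each worker.
_shared = {}


def init_worker(weights, inputs):
    # Runs once per worker at startup. The shared buffers arrive through
    # process creation, not through the pickled task queue.
    _shared["weights"] = weights
    _shared["inputs"] = inputs


def evaluate_row(row):
    # Hypothetical evaluate for one input row. Only the integer row
    # index is pickled and sent to the worker.
    w = _shared["weights"]
    x = _shared["inputs"]
    n = len(w)
    return sum(w[j] * x[row * n + j] for j in range(n))


def run_pool_demo():
    ctx = mp.get_context("fork")  # assumption: Unix fork is available
    weights = ctx.RawArray("d", [1.0, 2.0])
    # Three input rows of two values each, stored flat in row-major order.
    inputs = ctx.RawArray("d", [1.0, 1.0, 2.0, 0.5, 0.0, 3.0])
    pool = ctx.Pool(processes=2, initializer=init_worker,
                    initargs=(weights, inputs))
    try:
        return pool.map(evaluate_row, range(3))
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    print(run_pool_demo())
```

Whether this beats plain pickled copies presumably depends on how
large the arrays are compared with the work done per task.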

I don't believe that my questions are specific to numpy, which is why
I'm posting here, in a more general Python forum.

When should one pickle and copy?  When to implement an object in
shared memory?  Why is pickling apparently such a non-trivial process
anyway?  And, given that multi-core CPUs are apparently here to stay,
should it be so difficult to make use of them?