Multiprocessing, shared memory vs. pickled copies

sturlamolden sturlamolden at yahoo.no
Sun Apr 10 11:01:41 EDT 2011


On 9 apr, 22:18, John Ladasky <lada... at my-deja.com> wrote:

> So, there are limited advantages to trying to parallelize the
> evaluation of ONE cascade network's weights against ONE input vector.
> However, evaluating several copies of one cascade network's output,
> against several different test inputs simultaneously, should scale up
> nicely.  

You get a 1-by-n Jacobian for the errors, for which a parallel vector
math library can help. You will also need to train the input layer
(errors with an m-by-n Jacobian) with QuickProp or LM, where an
optimized LAPACK gives you a parallel QR or SVD. So all the
inherent parallelism in this case can be handled by libraries.
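To make the LAPACK point concrete, here is a minimal sketch (the sizes
and random data are my own, not from your network): a Gauss-Newton
style step for an m-by-n error Jacobian reduces to a linear
least-squares solve, and NumPy's lstsq dispatches that to LAPACK's
SVD-based driver, which a threaded LAPACK can parallelize for you.

```python
import numpy as np

# Hypothetical sizes: m training cases, n weights in the layer being trained.
m, n = 200, 10
rng = np.random.default_rng(0)
J = rng.standard_normal((m, n))   # m-by-n Jacobian of the errors
e = rng.standard_normal(m)        # error vector

# Undamped Gauss-Newton / LM step: solve J @ dw ~ e in the least-squares
# sense.  np.linalg.lstsq calls LAPACK's SVD-based driver (gelsd), so a
# multithreaded LAPACK parallelizes this without any work on your part.
dw, residuals, rank, sv = np.linalg.lstsq(J, e, rcond=None)
print(dw.shape)   # (10,)
```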


> Well, I thought that NUMPY was that fast library...

NumPy is a convenient container (a data buffer object) and often an
acceptably fast library (comparable to Matlab). But compared to raw C
or Fortran it can be slow (NumPy is memory bound, not compute bound),
and compared to many 'performance libraries' NumPy code will run like
a turtle.
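A small illustration of the memory-bound part (my own example): every
intermediate in an elementwise expression is a full temporary array, so
the data gets streamed through memory several times. Doing the update
in place cuts the traffic, which is exactly the overhead C or Fortran
avoids by fusing the loop.

```python
import numpy as np

n = 1_000_000
a = np.ones(n)
b = np.full(n, 2.0)
c = np.full(n, 3.0)

# One-liner: NumPy allocates a temporary for a*b, then another array
# for the sum -- two extra passes over memory.
d = a * b + c

# In-place variant: one temporary, then accumulate into it with no
# further allocation.  Same result, less memory traffic.
e = a * b
e += c

print(np.array_equal(d, e))   # True
```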


> My single-CPU neural net training program had two threads, one for the
> GUI and one for the neural network computations.  Correct me if I'm
> wrong here, but -- since the two threads share a single Python
> interpreter, this means that only a single CPU is used, right?  I'm
> looking at multiprocessing for this reason.

Only access to the Python interpreter is serialized. It is a common
misunderstanding that everything is serialized.

Two important points:

1. The functions you call can spawn threads in C or Fortran. These
threads can run freely even while one of them is holding the GIL. This
is e.g. what happens when you use OpenMP with Fortran or call a
performance library. This is also how Twisted can implement
asynchronous i/o on Windows using i/o completion ports (Windows will
set up a pool of background threads that run freely.)

2. The GIL can be released while calling into C or Fortran, and
reacquired afterwards, so multiple Python threads can execute in
parallel. If you call a function you know to be re-entrant, you can
release the GIL when calling it. This is also how Python threads can
be used to implement non-blocking i/o with functions like file.read
(they release the GIL before blocking on i/o).
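You can see point 2 from pure Python, since time.sleep releases the
GIL while blocking just like file.read does: four "blocking" threads
finish in roughly the time of one, because they wait in parallel. (The
timings below are my own illustration, not from the original post.)

```python
import threading
import time

def blocking_call():
    # time.sleep drops the GIL while it blocks, the same way a C
    # extension drops it around a long computation or i/o call.
    time.sleep(0.2)

t0 = time.perf_counter()
threads = [threading.Thread(target=blocking_call) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - t0

# Roughly 0.2 s, not 0.8 s: the four sleeps overlapped because none of
# them held the GIL while blocking.
print(elapsed < 0.6)
```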

The extent to which these two situations can happen depends on the
libraries you use, and how you decide to call them.

Also note that NumPy/SciPy can be compiled against different BLAS/
LAPACK libraries, some of which use multithreading internally.
Note that NumPy and SciPy do not release the GIL as often as they
could, which is why I often prefer to call libraries like LAPACK
directly.
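If you are unsure which BLAS/LAPACK your NumPy was built against (and
hence whether it multithreads internally), NumPy can report it:

```python
import numpy as np

# Prints the BLAS/LAPACK build configuration; MKL or OpenBLAS entries
# here mean the linear algebra calls may already use multiple threads.
np.show_config()
```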

In your setup you should release the GIL around computations to prevent
the GUI from freezing.
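The layout would look roughly like this sketch (names and the
simulated workload are my own): the numerics run in a worker thread
that releases the GIL, here stood in for by time.sleep, while the main
"GUI" thread keeps servicing events the whole time.

```python
import threading
import time

done = threading.Event()

def train_network():
    # Stand-in for a C/Fortran routine called with the GIL released;
    # while it runs, the GUI thread below is free to execute.
    time.sleep(0.3)
    done.set()

threading.Thread(target=train_network).start()

ticks = 0
while not done.is_set():
    ticks += 1        # the "event loop" keeps running during training
    time.sleep(0.01)

print(ticks > 0)      # the GUI never froze while the worker computed
```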


Sturla


