Several problems here:

(1) I am sorry I didn't mention this earlier, but looking over your original email, it appears that your single-process code might be very inefficient: it seems to perturb each particle individually in a for-loop rather than working on an array of all the particles. Perhaps you should try to fix that before adding multiprocessing? Basically, you should hopefully be able to write random_fork to work on a number of particles at once using numpy broadcasting, etc. This way, the for-loop that steps through the elements is implemented in compiled C, rather than interpreted python. Check out various numpy tutorials for details, but here's the general gist:

points = numpy.arange(6000).reshape((3000,2))  # 3000 x,y points
perturbations = numpy.random.normal(size=(3000,2))

def perturb_bad(points, perturbations):
    for point, perturbation in zip(points, perturbations):
        point += perturbation

def perturb_good(points, perturbations):
    points += perturbations

timeit perturb_bad(points, perturbations)
# 10 loops, best of 3: 18.7 milliseconds per loop

timeit perturb_good(points, perturbations)
# 10000 loops, best of 3: 161 microseconds per loop

Compare this orders-of-magnitude gain to the at-best-8-fold gain you'd get from multiprocessing the bad code. Also note that "map" is basically just an interpreted for-loop under the hood:

import operator
timeit map(operator.add, points, perturbations)
# 10 loops, best of 3: 18.7 milliseconds per loop

The moral here is to avoid looping constructs in python when working with sets of numbers and instead use numpy operations that operate on lots of numbers with one python command.

(2) From the slowdowns you report, it looks like overhead costs are completely dominating. For each job, the code and data need to be serialized (pickled, I think, is how the multiprocessing library handles it), written to a pipe, unpickled, and executed, and then the results need to be pickled, sent back, and unpickled. Perhaps using memmap to share state might be better? Or you can make sure that the function parameters and results can be very rapidly pickled and unpickled (single numpy arrays, e.g., not lists-of-sub-arrays or something).

Still, tune the single-processor code first. Perhaps you can send more detailed code samples and folks on the list can offer some advice about how to make it numpy-friendly and fast.
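For instance, if the particle states can be kept in a single (N_particles, state_dim) array, the whole perturbation step might look something like the sketch below. This is only a rough illustration under that assumption; the names state_dim, A, states, and random_fork_all are made up, since I haven't seen the real random_fork.

import numpy

N_particles, state_dim = 3000, 2                # made-up sizes, for illustration only
A = numpy.eye(state_dim)                        # stand-in for the real dynamics matrix
states = numpy.zeros((N_particles, state_dim))  # hypothetical array of all particle states

def random_fork_all(states, noise):
    # Perturb every particle at once: one matrix multiply and one add
    # replace N_particles separate Python-level calls.
    return states.dot(A.T) + noise

noise = numpy.random.standard_normal((N_particles, state_dim))
states = random_fork_all(states, noise)

The loop over particles then happens inside numpy's compiled code, so the per-particle Python overhead disappears entirely.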
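And if, once the code is vectorized, multiprocessing still seems worth pursuing, one way to keep the pickling cost down is to hand each worker one contiguous block of the arrays instead of one (particle, noise) pair per job. Again just a sketch with made-up names; perturb_block stands in for whatever the real per-block update would be:

import numpy
from multiprocessing import Pool

def perturb_block(args):
    # Each job carries two large arrays, so only a handful of objects
    # get pickled per map() call instead of one per particle.
    states_block, noise_block = args
    return states_block + noise_block           # stand-in for the real update

def parallel_update(states, noise, n_workers=4):
    # Split the work into n_workers contiguous slices and reassemble the result.
    state_blocks = numpy.array_split(states, n_workers)
    noise_blocks = numpy.array_split(noise, n_workers)
    pool = Pool(n_workers)
    results = pool.map(perturb_block, zip(state_blocks, noise_blocks))
    pool.close()
    pool.join()
    return numpy.concatenate(results)

Whether that ever beats the vectorized single-process version is a separate question: the serialization and process start-up still have to be paid for on every call.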
Zach

On May 27, 2010, at 5:37 PM, Andy Fraser wrote:

Thanks for the replies and pointers. I got multiprocessing.Pool to work, but it eats up memory and time. I append two implementation segments below. The multiprocessing version is about 33 times _slower_ than the single processor version. Unless I use a small number of processors, memory fills up and I kill the job to make the computer usable again. The following segments of code are inside a loop that steps over 115 lines of pixels.
# Wrapper so pool.map can apply random_fork to one (particle, noise) pair.
def func(job):
    return job[0].random_fork(job[1])
. . . . . .
#Multiprocessing version:
noise = numpy.random.standard_normal((N_particles,noise_df))
jobs = zip(self.particles,noise)
self.particles = self.pool.map(func, jobs, self.chunk_size)
return (m,v)
. . . . . .
#Single processing version
noise = numpy.random.standard_normal((N_particles,noise_df))
jobs = zip(self.particles,noise)
self.particles = map(func, jobs)
return (m,v)
--
Andy Fraser         ISR-2 (MS:B244)
afraser@lanl.gov    Los Alamos National Laboratory
505 665 9448        Los Alamos, NM 87545