On Thu, May 27, 2010 at 10:37 PM, Andy Fraser <afraser@lanl.gov> wrote:
> #Multiprocessing version:
> noise = numpy.random.standard_normal((N_particles,noise_df))
> jobs = zip(self.particles,noise)
> self.particles = self.pool.map(func, jobs, self.chunk_size)
> return (m,v)
What platform are you on? I often forget that multiprocessing works quite differently on Windows than on Unix platforms (and is much less useful there). On Unix platforms the child processes are created with fork(), which means they share all the memory state of the parent process, with copy-on-write if they make changes. On Windows separate processes are spawned and all the state has to be passed through the serialiser (I think). So on Unix you can share large quantities of (read-only) data very cheaply by making it accessible before the fork.

So if you are on Mac/Linux and the slowdown is caused by passing the large noise array, you could get around this by making it a global somehow before the fork, when you initiate the pool, i.e.

    import mymodule
    mymodule.noise = numpy.random.standard_normal((N_particles,noise_df))

then use this in func; don't pass the noise array in the map call.

But I agree with Zachary about using arrays of object parameters rather than lists of objects each with their own parameter variables.

Cheers

Robin
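
A minimal sketch of the "global before the fork" idea described above, not the original poster's code. The names make_noise and run are hypothetical, the particles are plain floats, and the standard library's random module stands in for numpy.random.standard_normal so the example has no dependencies. It asks for the "fork" start method explicitly, so it should only be expected to work on Unix-like platforms. Each job carries just an index and a particle; the worker looks up its noise row in the global it inherited from the parent.

```python
import multiprocessing as mp
import random

# Module-level global playing the role of `mymodule.noise`.
# It must be assigned BEFORE the pool is created so forked children inherit it.
noise = None


def make_noise(n_particles, noise_df, seed=0):
    """Hypothetical stand-in for numpy.random.standard_normal((n, df))."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(noise_df)]
            for _ in range(n_particles)]


def func(job):
    """Update one particle using the noise row read from the inherited global."""
    index, particle = job
    return particle + sum(noise[index])


def run(particles, noise_df, chunk_size=2, seed=0):
    """Set the global, then fork the pool; only (index, particle) pairs are pickled."""
    global noise
    noise = make_noise(len(particles), noise_df, seed)
    ctx = mp.get_context("fork")  # assumption: Unix-like platform
    with ctx.Pool(processes=2) as pool:
        return pool.map(func, list(enumerate(particles)), chunk_size)


if __name__ == "__main__":
    print(run([0.0, 1.0, 2.0, 3.0], noise_df=3))
```

Note that the pool has to be created after the global is assigned. A pool created earlier would have forked its workers before the data existed, and they would still see noise as None.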