[Pythonmac-SIG] Is this a reasonable way to do multiprocessing?

Lou Pecora lou_boog2000 at yahoo.com
Tue Apr 19 19:11:18 CEST 2011


I have a calculation which runs via a method of a class.  It is trivially 
parallelizable: the method is just called over and over with different 
arguments, and no communication, synchronization, or cooperation is needed 
between the calls. But it is computationally heavy and I want to speed it all 
up.  I discovered the multiprocessing module and its Pool class, so my thought 
was to do something like this:

import multiprocessing as MP

class mainclass(object):
    def aMethod(self, x):
        # do stuff with x
        return result

myclass = mainclass()
pool = MP.Pool(processes=3)  # 3 CPU workers, for example
theargs = [x1, x2, x3]
resultlist = pool.map(myclass.aMethod, theargs)

But, as many of you probably know, that raises a pickle error.  I did a little 
research and don't fully understand it, but I thought of a workaround that is 
pedestrian but works on my toy example: I make (deep) copies of my object and 
let a helper function call mainclass's aMethod. So I change the code (after my 
class definition) to:

import copy

def helperFunc(a):
    # a is an [object, argument] pair
    return a[0].aMethod(a[1])

myclass = mainclass()
pool = MP.Pool(processes=3)  # 3 CPU workers
theargs = [[copy.deepcopy(myclass), x1],
           [copy.deepcopy(myclass), x2],
           [copy.deepcopy(myclass), x3]]
resultlist = pool.map(helperFunc, theargs)

My question (the subject of this post):  Is this reasonable?  I am not asking 
whether it is the best or most efficient approach, only whether it gets me most 
of the available speedup in this situation.
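
For reference, here is a minimal, self-contained sketch of the helper-function idea. It assumes that the explicit deep copies can be dropped, because pool.map already pickles each task's arguments when it sends them to a worker, so each worker receives its own copy of the object. The class body, the scale attribute, and the is_picklable check are hypothetical stand-ins for illustration, not the real calculation.

```python
import multiprocessing as MP
import pickle


class MainClass(object):
    """Hypothetical stand-in for the post's mainclass."""

    def __init__(self, scale):
        self.scale = scale  # illustrative state, not from the post

    def aMethod(self, x):
        # Placeholder for the expensive calculation.
        return self.scale * x * x


def helperFunc(pair):
    # Module-level function: it is pickled by reference (module + name),
    # which is what lets Pool.map send it to the workers.
    obj, x = pair
    return obj.aMethod(x)


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


if __name__ == "__main__":
    myclass = MainClass(scale=2)
    # The same object is paired with every argument; no deepcopy here.
    theargs = [(myclass, x) for x in (1, 2, 3)]
    pool = MP.Pool(processes=3)
    try:
        resultlist = pool.map(helperFunc, theargs)
    finally:
        pool.close()
        pool.join()
    print(resultlist)
```

A quick call such as is_picklable(myclass.aMethod) is one way to see whether a given callable can be handed to pool.map directly on a particular Python version. If the object carries a lot of state, pickling it once per task may cost noticeable time, which is worth measuring before relying on this pattern.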

A further question: on an 8-CPU MacBook Pro I get a 4X speedup (with 8 CPU 
workers), not 8X, even though Activity Monitor shows all 8 CPUs at ~100% 
utilization.  Is there something about dual-core processors that I should know 
about (since 4 = 8/2)?
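
One way to investigate is to time the same workload with different pool sizes and see where the speedup stops growing. A possible explanation, which nothing in this post confirms, is that the 8 CPUs are logical CPUs (hardware threads) sharing fewer physical cores; multiprocessing.cpu_count() reports the logical count. In the sketch below, burn, time_pool, speedup, and the task sizes are hypothetical stand-ins for the real calculation.

```python
import multiprocessing as MP
import time


def burn(n):
    # Hypothetical CPU-bound stand-in for the real calculation.
    total = 0
    for i in range(n):
        total += i * i
    return total


def time_pool(workers, tasks):
    """Return wall-clock seconds for pool.map(burn, tasks) with this many workers."""
    pool = MP.Pool(processes=workers)
    try:
        start = time.time()
        pool.map(burn, tasks)
        return time.time() - start
    finally:
        pool.close()
        pool.join()


def speedup(t_serial, t_parallel):
    # Ratio of the one-worker time to the multi-worker time.
    return t_serial / t_parallel


if __name__ == "__main__":
    tasks = [500000] * 16          # illustrative workload size
    logical = MP.cpu_count()       # number of logical CPUs
    baseline = time_pool(1, tasks)
    for workers in sorted(set([1, 2, 4, logical])):
        elapsed = time_pool(workers, tasks)
        print("%d workers: %.2fx speedup" % (workers, speedup(baseline, elapsed)))
```

If the speedup flattens at about half the reported CPU count, that would be consistent with the logical-versus-physical explanation, though process start-up and pickling overhead can also hold the numbers down.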

Thanks for any pointers or info. -- Lou Pecora,   my views are my own.


