Multiprocessing Pool and functions with many arguments
Piet van Oostrum
piet at cs.uu.nl
Fri May 1 12:46:18 EDT 2009
>>>>> "psaffrey at googlemail.com" <psaffrey at googlemail.com> (P) wrote:
>P> I'm trying to get to grips with the multiprocessing module, having
>P> only used ParallelPython before.
>P> based on this example:
>P> http://docs.python.org/library/multiprocessing.html#using-a-pool-of-workers
>P> what happens if I want my "f" to take more than one argument? I want
>P> to have a list of tuples of arguments and have these correspond the
>P> arguments in f, but it keeps complaining that I only have one argument
>P> (the tuple). Do I have to pass in a tuple and break it up inside f? I
>P> can't use multiple input lists, as I would with regular map.
You give the tuple of arguments for the function:

from multiprocessing import Pool

def f(a, b, c):
    return a + b * c

pool = Pool(processes=4)          # start 4 worker processes
result = pool.apply(f, (2, 3, 4)) # evaluate "f(2, 3, 4)"
Or if you have a list:
args = [(2, 3, 4),  # arguments for call 1
        (5, 6, 7)]  # arguments for call 2

print [pool.apply(f, a) for a in args]
However, as each call to apply waits for its result, this will execute
sequentially instead of in parallel.
You can't use map directly, as it works only with single-argument functions:
>>> print pool.map(f, args)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py", line 148, in map
return self.map_async(func, iterable, chunksize).get()
File "/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py", line 422, in get
raise self._value
TypeError: f() takes exactly 3 arguments (1 given)
Is that what you mean?
But you can use a wrapper function:
def wrapf(abc):
    return f(*abc)
[later...]
print pool.map(wrapf, args)
This is covered in the examples section of the multiprocessing
documentation (see calculate and calculatestar, for example).
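For readers on newer Pythons: since Python 3.3, Pool.starmap unpacks the
argument tuples itself, so no wrapper function is needed. A minimal sketch
(Python 3 syntax, not available in the 2.6 of this thread):

```python
from multiprocessing import Pool

def f(a, b, c):
    return a + b * c

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # starmap unpacks each tuple into f's positional arguments
        print(pool.starmap(f, [(2, 3, 4), (5, 6, 7)]))  # [14, 47]
```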
Or you can use apply_async and later wait for the results:
results = [pool.apply_async(f, a) for a in args]
print [r.get() for r in results]
Now the calls to f are done in parallel, which you can check by putting
a sleep inside f.
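A minimal sketch of that check (written here with Python 3 syntax; the same
idea works in 2.6): give f a one-second sleep and time the run. With four
workers and four tasks, the elapsed time is roughly one second rather than
four, showing the calls overlapped.

```python
import time
from multiprocessing import Pool

def slow_f(a, b, c):
    time.sleep(1)  # simulate real work so the overlap is visible
    return a + b * c

if __name__ == '__main__':
    args = [(2, 3, 4), (5, 6, 7), (8, 9, 10), (1, 1, 1)]
    with Pool(processes=4) as pool:
        start = time.time()
        results = [pool.apply_async(slow_f, a) for a in args]
        print([r.get() for r in results])     # [14, 47, 98, 2]
        # roughly 1 second with 4 workers, not 4 seconds
        print("elapsed: %.1f s" % (time.time() - start))
```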
--
Piet van Oostrum <piet at cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: piet at vanoostrum.org