Scheduling used in the multiprocessing.Pool.map() function
rpg.314 at gmail.com
Sat Oct 31 16:12:02 CET 2009
I have been using the map() function in the multiprocessing module to
parallelize my tasks on a dual core CPU. My tasks are embarrassingly
parallel, shared nothing tasks. In one of my runs, I found that the
this function interleaves execution of two processes over a single
So far so good. But the problem is that the last remnant job is
executed serially. I mean that it seems that the job scheduling is
essentially static, and the last piece does not execute in parallel.
Why can't there be a task-stealing scheduler in multiprocessing? Each
of my individual function call in map takes over half hour (Each
function call internally calls out to c++ code). This could be a very
useful addition to multiprocessing's utility.
More information about the Python-list