[Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

Antoine Pitrou solipsis at pitrou.net
Fri Oct 12 08:48:47 EDT 2018

On Fri, 12 Oct 2018 08:33:32 -0400
Sean Harrington <seanharr11 at gmail.com> wrote:
> Hi Nathaniel - if this solution can be made performant, then I would
> be more than satisfied.
> I think this would require removing "func" from the "task tuple", and
> storing the "func" "once per worker" somewhere globally (maybe a class
> attribute set post-fork?).
> This also has the beneficial outcome of increasing general performance of
> Pool.map and friends. I've seen MANY folks across the interwebs doing
> things like passing instance methods to map, resulting in "big" tasks, and
> slower-than-sequential parallelized code. Parallelizing "instance methods"
> by passing them to map, w/o needing to wrangle with staticmethods and
> globals, would be a GREAT feature! It'd just be as easy as:
>     Pool.map(self.func, ls)
> What do you think about this idea? This is something I'd be able to take
> on, assuming I get a few core dev blessings...
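
[Illustration of the "big tasks" point above: a bound method pickles
together with its instance, so if the callable travels with each task, the
instance state is serialized again each time. This is only a rough sketch
using pickle directly rather than a Pool; the class Model and its "table"
attribute are hypothetical stand-ins, not anything from the bug report.]

```python
import pickle


class Model:
    """Hypothetical class whose instances carry a lot of state."""

    def __init__(self, n):
        self.table = list(range(n))  # stand-in for large instance state

    def func(self, x):
        return x + len(self.table)


small = Model(10)
big = Model(100_000)

# Pickling a bound method serializes the instance it is bound to,
# so the payload grows with the size of the instance.
small_size = len(pickle.dumps(small.func))
big_size = len(pickle.dumps(big.func))

print("small instance:", small_size, "bytes")
print("big instance:  ", big_size, "bytes")
```

[The exact byte counts depend on the pickle protocol; the point is only
that the second payload is much larger than the first.]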

Well, I'm not sure how it would work, so it's difficult to give an
opinion.  How do you plan to avoid passing "self"?  By caching (by
equality? by identity?)?  Something else?  But what happens if "self"
changed value (in the case of a mutable object) in the parent?  Do you
keep using the stale version in the child?  That would break
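
[Illustration of the staleness question: the sketch below imitates a
worker-side cache by unpickling a bound method once and reusing it, then
mutating the original object in the "parent". It uses pickle directly
instead of real worker processes, and the class Counter and the variable
names are hypothetical.]

```python
import pickle


class Counter:
    """Hypothetical mutable object whose method is passed to workers."""

    def __init__(self):
        self.offset = 0

    def func(self, x):
        return x + self.offset


parent = Counter()

# What a worker would hold if "func" were sent once and then cached:
# an independent copy of the instance, frozen at pickling time.
cached = pickle.loads(pickle.dumps(parent.func))

# The parent mutates the object after the first send.
parent.offset = 100

fresh = parent.func(1)  # sees the new state
stale = cached(1)       # still sees the state at pickling time

print("parent result:", fresh)
print("cached result:", stale)
```

[Caching by identity would keep returning the second result; caching by
equality would require the class to define a meaningful __eq__ and would
still need the current state to be compared on every call.]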


