[Python-Dev] bpo-34837: Multiprocessing.Pool API Extension - Pass Data to Workers w/o Globals

Sean Harrington seanharr11 at gmail.com
Fri Oct 12 09:17:59 EDT 2018


The implementation details need to be fleshed out, but setting those aside,
do you believe this is a valid solution to the initial problem? Do you also
see it as a beneficial optimization to Pool, given that we would no longer
need to store funcs/bound-methods/partials on the tasks themselves?
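
To make the size argument concrete, here is a minimal sketch using only
pickle, the serializer Pool relies on to ship work to its workers. It is not
Pool's actual code path. The names `big_state`, `task_with_func` and
`task_args_only` are made up for illustration, and `big_state.count` merely
stands in for a bound method like `self.func` whose instance carries a lot
of data.

```python
import pickle

# Hypothetical stand-in for an instance with heavy state; `big_state.count`
# is a bound method, so pickling it also pickles the whole list.
big_state = list(range(100_000))
func = big_state.count

# Roughly today's shape: the callable travels together with the arguments.
task_with_func = pickle.dumps((func, (3,)))

# Shape under the proposal: only the arguments travel, on the assumption
# that each worker already holds its own copy of the callable.
task_args_only = pickle.dumps((3,))

print(len(task_with_func), len(task_args_only))
```

The first payload grows with the size of the bound instance, while the
second depends only on the arguments.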

The latter concern about "what happens if `self` changed value in the
parent" is the same as "what happens if `func` changes in the parent?"
under the current implementation. Pool.map_async(func, ls) already makes
this assumption: if "func" changes in the parent after the call, nothing is
communicated to the child. So one just needs to be aware that calling
"map_async(self.func, ls)" while the state of "self" is changing will not
propagate those changes to the workers. The state is frozen when Pool.map
is called, just as is the case now.
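
A small sketch of that "frozen state" behaviour, again using pickle directly
rather than a real Pool so that it stays deterministic. The names `prices`,
`payload` and `worker_func` are hypothetical, and `prices.get` stands in for
`self.func`: it is a bound method whose `self` is the dict.

```python
import pickle

# Hypothetical parent-side state; `prices.get` plays the role of `self.func`.
prices = {"apple": 1, "pear": 2}
func = prices.get

# Serializing the bound method also serializes a copy of its `self`.
payload = pickle.dumps(func)

# The parent mutates its state after the payload was created.
prices["apple"] = 100

# What a worker would reconstruct: a method bound to the old copy.
worker_func = pickle.loads(payload)
stale = [worker_func(key) for key in ["apple", "pear"]]
fresh = [func(key) for key in ["apple", "pear"]]

print(stale, fresh)
```

The copy rebuilt from the payload never sees the later mutation, which is
the behaviour a caller of map_async(self.func, ls) has to keep in mind.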

On Fri, Oct 12, 2018 at 9:07 AM Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Fri, 12 Oct 2018 08:33:32 -0400
> Sean Harrington <seanharr11 at gmail.com> wrote:
> > Hi Nathaniel - if this solution can be made performant, then I would
> > be more than satisfied.
> >
> > I think this would require removing "func" from the "task tuple", and
> > storing the "func" "once per worker" somewhere globally (maybe a class
> > attribute set post-fork?).
> >
> > This also has the beneficial outcome of increasing general performance of
> > Pool.map and friends. I've seen MANY folks across the interwebs doing
> > things like passing instance methods to map, resulting in "big" tasks, and
> > slower-than-sequential parallelized code. Parallelizing "instance methods"
> > by passing them to map, w/o needing to wrangle with staticmethods and
> > globals, would be a GREAT feature! It'd just be as easy as:
> >
> >     Pool.map(self.func, ls)
> >
> > What do you think about this idea? This is something I'd be able to take
> > on, assuming I get a few core dev blessings...
>
> Well, I'm not sure how it would work, so it's difficult to give an
> opinion.  How do you plan to avoid passing "self"?  By caching (by
> equality? by identity?)?  Something else?  But what happens if "self"
> changed value (in the case of a mutable object) in the parent?  Do you
> keep using the stale version in the child?  That would break
> compatibility...
>
> Regards
>
> Antoine.