@Nataniel this is what I am suggesting as well. No caching - just storing the `fn` on each worker, rather than pickling it for each item in our iterable.
As long as we store the `fn` post-fork on the worker process (perhaps as a global), subsequent calls to `Pool.map` shouldn't be affected (referencing Antoine's & Michael's points that "multiprocessing encapsulates each subprocesses globals in a separate namespace").
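For illustration, the per-worker storage I have in mind can already be approximated in user code with `Pool`'s existing `initializer` hook - the function crosses the process boundary once at pool startup instead of once per item. This is only a sketch of the idea, not the proposed implementation; the names (`map_with_cached_fn`, `_worker_fn`) are hypothetical:

```python
import multiprocessing

# Hypothetical sketch: each worker process gets its own copy of this
# module-level global after fork, so workers don't collide.
_worker_fn = None

def _init_worker(fn):
    # Runs once in each worker at pool startup; the pickled `fn`
    # crosses the process boundary a single time here.
    global _worker_fn
    _worker_fn = fn

def _call_worker_fn(item):
    # Per task, only `item` is pickled; `fn` is already stored locally.
    return _worker_fn(item)

def map_with_cached_fn(fn, iterable, processes=None):
    with multiprocessing.Pool(processes,
                              initializer=_init_worker,
                              initargs=(fn,)) as pool:
        return pool.map(_call_worker_fn, iterable)

if __name__ == "__main__":
    print(map_with_cached_fn(abs, [-1, -2, 3]))
```

The optimization being proposed would effectively fold this pattern into `Pool.map` itself, so end-users get it without the boilerplate.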
@Antoine - I'm making an effort to take everything you've said into consideration here. My initial PR
and talk were intended to shed light on a couple of pitfalls that I often see Python end-users encounter with `Pool`. Moving beyond my naive first attempt, and the onslaught of deserved criticism, it seems that we have an opportunity here: no changes to the interface, just an optimization to reduce the frequency of pickling.
Raymond Hettinger may also be interested in this optimization, as he speaks (with great analogies) about
the different ways you can misuse concurrency in Python. This would address one of the pitfalls that he outlines: the "size of the serialized/deserialized data".
Is this an optimization that either of you would be willing to review and accept, if I find there is a *reasonable way* to implement it?