You have summarized my intentions correctly, and I agree with your reasoning and concern. However, there is a fairly reasonable explanation for why this optimization has never been implemented:

In Pool, each `task` tuple has the form `(result_job, func, (x,), {})`. This is the object that is serialized and deserialized between processes. Confusingly, the part we care about here is not `func` but the args tuple `(x,)`: `func` is ACTUALLY either mapstar() or starmapstar(), which is called with `(x,)` as its *args, and `x` is itself a tuple of (func, iterable), where the inner func is the user's function and iterable is one chunk of the input.

Because we need to limit the size of the `iterable` bundled into each task, to avoid slow serialization and deserialization, we usually end up with multiple tasks per worker, and thus multiple copies of `func` sent to each worker. Caching `func` is therefore only a real optimization for really big functions/closures/partials (or REALLY big iterables with an unreasonably small chunksize passed to map()). The most common case comes up when passing instance methods (of really big objects!) to Pool.map().
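
To make the cost concrete, here is a rough sketch of the batching described above. It does not call into Pool; `chunked_tasks` is a hypothetical stand-in for the internal batching, and the 1 MB partial is an assumed example of a "really big" func. It only shows that every (func, chunk) pair carries its own pickled copy of func.

```python
import functools
import itertools
import operator
import pickle

# Assumed example of a "really big" callable: a partial holding ~1 MB of state.
payload = bytes(1_000_000)
big_func = functools.partial(operator.getitem, payload)

def chunked_tasks(func, iterable, chunksize):
    """Hypothetical stand-in for Pool's batching: yield (func, chunk) pairs."""
    it = iter(iterable)
    while True:
        chunk = tuple(itertools.islice(it, chunksize))
        if not chunk:
            return
        yield (func, chunk)

tasks = list(chunked_tasks(big_func, range(100), chunksize=10))

# Each task is pickled on its own, so the big payload is serialized once per task.
func_bytes = len(pickle.dumps(big_func))
per_task_bytes = [len(pickle.dumps(task)) for task in tasks]
total_bytes = sum(per_task_bytes)

print(f"{len(tasks)} tasks, func alone is {func_bytes} bytes, "
      f"all tasks together are {total_bytes} bytes")
```

With a small func the repeated copy is negligible next to the chunk itself, which is why the saving only shows up for large closures, partials, or bound methods.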

This post may color in the above with more details.

Further, let me pivot on my idea of __qualname__: we can use the `id` of `func` as the cache key to address your concern, and store this `id` on the `task` tuple (i.e. an integer in lieu of the `func` previously stored there).
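
A minimal sketch of what I mean, on the parent side only. Nothing here exists in multiprocessing today: `sent_funcs` and `make_task` are hypothetical names, and the 3-tuple layout is just an assumption for illustration. One caveat I am sure of: `id` values are only unique among objects that are alive at the same time, so the cache has to hold a reference to `func`.

```python
import functools

# Two distinct callables that a name-based key could not tell apart reliably.
f = functools.partial(pow, 2)
g = functools.partial(pow, 3)

# Hypothetical parent-side record of functions already shipped: id(func) -> func.
sent_funcs = {}

def make_task(func, chunk):
    """Return (func_id, func_or_None, chunk); func is attached only on first use."""
    key = id(func)
    first_time = key not in sent_funcs
    # Holding a reference keeps the object alive, so its id cannot be reused.
    sent_funcs.setdefault(key, func)
    return (key, func if first_time else None, chunk)

t1 = make_task(f, (1, 2))   # first task for f: carries f itself
t2 = make_task(f, (3, 4))   # later task for f: carries only the integer id
t3 = make_task(g, (5, 6))   # g has a different id, so it is sent in full
```

The worker would need a matching dict from id to the unpickled function, which is not shown here.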


On Thu, Oct 18, 2018 at 12:49 AM Michael Selik <michael.selik@gmail.com> wrote:
If imap_unordered is currently re-pickling and sending func each time it's called on the worker, I have to suspect there was some reason to do that and not cache it after the first call. Rather than assuming that's an opportunity for an optimization, I'd want to be certain it won't have edge case negative effects.


On Tue, Oct 16, 2018 at 2:53 PM Sean Harrington <seanharr11@gmail.com> wrote:
Is your concern something like the following?

from multiprocessing import Pool

with Pool(8) as p:
    gen = p.imap_unordered(func, ls)
    first_elem = next(gen)
    p.apply_async(long_func, (x,))  # args must be a tuple of positional arguments
    remaining_elems = list(gen)

My concern was passing the same function (or a function with the same qualname). You're suggesting caching functions and identifying them by qualname to avoid re-pickling a large stateful object that's shoved into the function's defaults or closure. Is that a correct summary?

If so, how would the function cache distinguish between two functions with the same name? Would it need to examine the defaults and closure as well? If so, that means it's pickling the second one anyway, so there's no efficiency gain.

In [1]: def foo(a):
   ...:     def bar():
   ...:         print(a)
   ...:     return bar
In [2]: f = foo(1)
In [3]: g = foo(2)
In [4]: f
Out[4]: <function __main__.foo.<locals>.bar()>
In [5]: g
Out[5]: <function __main__.foo.<locals>.bar()>

If we say pool.apply_async(f) and pool.apply_async(g), would you want the latter one to avoid serialization, letting the worker make a second call with the first function object?
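
To spell out the failure mode, here is a small sketch with no Pool involved. `worker_cache` and `lookup_by_qualname` are hypothetical names for a cache keyed on `__qualname__`; the point is only that the second closure resolves to the first one.

```python
def foo(a):
    def bar():
        return a
    return bar

f = foo(1)
g = foo(2)

# Hypothetical worker-side cache keyed by qualname.
worker_cache = {}

def lookup_by_qualname(func):
    """Return the cached function for this qualname, storing func if it is new."""
    return worker_cache.setdefault(func.__qualname__, func)

first = lookup_by_qualname(f)
second = lookup_by_qualname(g)  # same key as f, so the cache hands back f

print(f.__qualname__ == g.__qualname__)  # the two closures share one qualname
print(second is f)                       # the lookup for g returned f
print(second(), g())                     # the cached call and the real g disagree
```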