You have correctly identified the summary of my intentions, and I agree with your reasoning & concern - however there is a somewhat reasonable answer as to why this optimization has never been implemented:
In Pool, the `task` tuple consists of (result_job, func, (x,), {}) . This is the object that is serialized/deserialized b/t processes. The only thing we really care about here is the tuple `(x,)`, confusingly, not `func` (func is ACTUALLY either mapstar() or starmapstar(), which is called with (x,) as its *args). Our element of interest is `(x,)` - a tuple of (func, iterable). Because we need to temper the size of the `iterable` bundled in each task, to avoid de/serialization slowness, we usually end up with multiple tasks per worker, and thus multiple `func`s per worker. Thus, this is really only an optimization in the case of really big functions/closures/partials (or REALLY big iterables with an unreasonably small chunksize passed to map()). The most common use case comes up when passing instance methods (of really big objects!) to Pool.map().
This post may color in the above with more details.
Further, let me pivot on my idea of __qualname__...we can use the `id` of `func` as the cache key to address your concern, and store this `id` on the `task` tuple (i.e. an integer in-lieu of the `func` previously stored there).