<div dir="ltr"><div dir="ltr"><div dir="ltr"><div><div>If imap_unordered currently re-pickles and sends func each time it's called on the worker, I have to suspect there was some reason for doing that instead of caching it after the first call. Rather than assuming that's an opportunity for an optimization, I'd want to be certain that caching won't have negative effects in edge cases.</div></div><div><br></div><div class="gmail_quote"><div dir="ltr">On Tue, Oct 16, 2018 at 2:53 PM Sean Harrington &lt;<a href="mailto:seanharr11@gmail.com" target="_blank">seanharr11@gmail.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Is your concern something like the following?</div><div><br></div><div>with Pool(8) as p:</div><div>    gen = p.imap_unordered(func, ls)</div><div>    first_elem = next(gen)</div><div>    p.apply_async(long_func, x)</div><div>    remaining_elems = [elem for elem in gen]</div></div></blockquote><div><br></div><div>My concern was about passing the same function (or a different function with the same qualname). You're suggesting caching functions and identifying them by qualname, to avoid re-pickling a large stateful object that's shoved into the function's defaults or closure. Is that a correct summary?</div><div><br></div><div>If so, how would the function cache distinguish between two functions with the same name? Would it need to examine the defaults and closure as well? 
In that case, it's pickling the second one anyway, so there's no efficiency gain.</div><div><br></div><div><div><span style="white-space:pre">    </span>In [1]: def foo(a):</div><div><span style="white-space:pre">    </span>   ...:     def bar():</div><div><span style="white-space:pre">    </span>   ...:         print(a)</div><div><span style="white-space:pre">    </span>   ...:     return bar</div><div><span style="white-space:pre">    </span>In [2]: f = foo(1)</div><div><span style="white-space:pre">    </span>In [3]: g = foo(2)</div><div><span style="white-space:pre">    </span>In [4]: f</div><div><span style="white-space:pre">    </span>Out[4]: &lt;function __main__.foo.&lt;locals&gt;.bar()&gt;</div><div><span style="white-space:pre">    </span>In [5]: g</div><div><span style="white-space:pre">    </span>Out[5]: &lt;function __main__.foo.&lt;locals&gt;.bar()&gt;</div></div><div><br></div><div>Here f and g have the same qualname but different closures. If we call pool.apply_async(f) and then pool.apply_async(g), would you want the second call to skip serialization and let the worker call the first function object again?</div></div></div></div></div>
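<div><br></div><div>To make the concern concrete, here is a minimal sketch of a cache keyed only by qualname. The names cache and lookup are hypothetical and not part of multiprocessing, and the sketch skips pickling entirely; it only models the lookup a worker might do under that assumption.</div>

```python
def foo(a):
    def bar():
        return a
    return bar


f = foo(1)
g = foo(2)

# Hypothetical worker-side cache, keyed only by the function's qualified name.
cache = {}


def lookup(func):
    # Store the function the first time its qualname is seen;
    # afterwards, return whatever was stored under that qualname.
    return cache.setdefault(func.__qualname__, func)


same_name = f.__qualname__ == g.__qualname__  # both are "foo.<locals>.bar"
first = lookup(f)()   # cache miss: stores f, then calls it
second = lookup(g)()  # cache hit on the shared qualname: calls f, not g

print(same_name, first, second, g())  # True 1 1 2
```

<div>Under that assumption, the second lookup silently returns the first closure, so the caller gets 1 where g() would have given 2.</div>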