
On Sat, Jan 23, 2016 at 7:22 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
[...] For a practical example of this, consider the ThreadPoolExecutor example from the concurrent.futures docs: https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor...
A scope-per-iteration construct makes it much easier to use a closure to define the operation submitted to the executor for each URL:
with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor: # Start the load operations and mark each future with its URL future_to_site = {} for new site_url in sites_to_load: def load_site(): with urllib.request.urlopen(site_url, timeout=60) as conn: return conn.read() future_to_site[executor.submit(load_site)] = site_url # Report results as they become available for future in concurrent.futures.as_completed(future_to_site): site_url = future_to_site[future] try: data = future.result() except Exception as exc: print('%r generated an exception: %s' % (site_url, exc)) else: print('%r page is %d bytes' % (site_url, len(data)))
If you try to write that code that way today (i.e. without the "new" on the first for loop), you'll end up with a race condition between the main thread changing the value of "site_url" and the executor issuing the URL open request.
I wonder if kids today aren't too much in love with local function definitions. :-) There's a reason why executor.submit() takes a function *and arguments*. If you move the function out of the for loop and pass the url as a parameter to submit(), problem solved, and you waste fewer resources on function objects and cells to hold nonlocals. A generation ago most people would have naturally used such a solution (since most languages didn't support the alternative :-). -- --Guido van Rossum (python.org/~guido)