
Hi there. Occasionally you want to do something on a worker thread from Python. Some operating systems, such as Windows, maintain a thread pool to which you can queue a user work item for execution. I was thinking that it could be a good idea to expose such functionality through the thread module, with something like:

    def queue_call(callable, args, kwargs):
        # invoke the OS thread-pool API and have callable called

Then in threading.py:

    try:
        from thread import queue_call
    except ImportError:
        def queue_call(callable, args, kwargs):
            # default implementation
            Thread(target=callable, args=args, kwargs=kwargs).start()

Having operating-system support for this would free us from creating a new real thread every time such a pattern is invoked. It would also improve the latency of such a call, since thread creation isn't free.

K
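As a rough sketch, the pure-Python fallback described above could look like this (the name `queue_call` and its signature come from the proposal; the body is only the suggested default implementation, not an OS-backed one):

```python
import threading

def queue_call(func, args=(), kwargs=None):
    # Default implementation: no OS thread pool available, so spawn
    # a fresh real thread per call, as the proposal's fallback does.
    t = threading.Thread(target=func, args=args, kwargs=kwargs or {})
    t.start()
    return t

# Usage: fire-and-forget a callable on a worker thread.
results = []
done = threading.Event()

def work(x):
    results.append(x * 2)
    done.set()

queue_call(work, args=(21,))
done.wait(timeout=5)
# results == [42]
```

An OS-backed version would replace the `Thread(...).start()` call with a submission to the system pool (e.g. Windows' QueueUserWorkItem), avoiding the per-call thread-creation cost the message mentions.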

Ah, now that you mention it, I stumbled across this last PyCon :) Looks nice, particularly if we can have classes such as ThreadPoolExecutor rely on OS support to do their stuff. As such, it might be good not to make any promises about max_workers and so on.

Well, on Windows one can have the OS create private thread pools to your specification, but I have found it useful to just rely on the default one... Anyway, these were just idle thoughts. Let's carry on...

K

-----Original Message-----
From: gvanrossum@gmail.com [mailto:gvanrossum@gmail.com] On Behalf Of Guido van Rossum
Sent: Wednesday, November 10, 2010 12:08
To: Kristján Valur Jónsson
Cc: python-ideas@python.org
Subject: Re: [Python-ideas] pool threads

Have you looked at PEP 3148?
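For reference, the PEP 3148 interface under discussion looks roughly like this (shown with the `concurrent.futures` module as it eventually shipped; `max_workers` is the parameter the message suggests not making promises about):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def square(n):
    return n * n

# max_workers is a sizing hint for the pool; an executor backed by an
# OS thread pool might treat it loosely, as suggested above.
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(square, n) for n in range(5)]
    results = sorted(f.result() for f in as_completed(futures))
# results == [0, 1, 4, 9, 16]
```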

On 11/9/2010 11:07 PM, Guido van Rossum wrote:
Have you looked at PEP 3148?
I have been rewriting a web interface for a cluster of servers on which I wanted to do batch queries in parallel, and the first thing that came to mind was to try out this futures module (I'm using the version from PyPI w/Python 2.6). I had some difficulty getting started with the module due to a lack of examples of all the different ways of using pools (in particular, there is no example showing something like what Kristján wants -- a set-it-and-forget-it pool). However, once I got past the learning curve, I made some code that does what I want.

Despite the code working, it wasn't as fast as I thought it should be, so I started profiling, and I noticed some craziness. Looking closer at it, I can hardly believe anyone else has used this for anything but toy examples. For clarity, the code I am profiling is basically this:

    executor = futures.ThreadPoolExecutor(len(servers))
    futures_list = []
    for address in servers:
        def call(address=address):
            client = None
            load = None
            try:
                client = Client(address)
                load = client.load()
            finally:
                if client:
                    client.close()
            return (address, load)
        futures_list.append(executor.submit(call))
    c.servers = []
    for future in futures.as_completed(futures_list):
        result = future.result()
        if result:
            address, load = result
            c.servers.append({'server': address, 'load': load})

Running this yields an insane number of wait()s on an Event() object:

    .../futures/_base.py:149  as_completed  x36      2673.00ms
    .../threading.py:391      wait          x421111  1720.10ms

Looking closer at as_completed(), the call to wait is wrong:

    waiter.event.wait(timeout)

should be:

    waiter.event.wait(wait_timeout)

However, that isn't my problem. More importantly, the Event() itself represents the wrong synchronization primitive:

    waiter = _create_and_install_waiters(fs, FIRST_COMPLETED)

creates an Event() that is set() when the first future in fs completes, which then makes as_completed() spin, as the event will stay set forever.
Obviously, I will file a bug report (w/patch) about this, but I post my experience here because it pretty much exemplifies why I believe even small libraries should see use in the wild before being tossed into the stdlib -- I'm glad 3.2 is only alpha. Might be nice if the release page mentioned this module to encourage more use of it.

-- Scott Dial scott@scottdial.com scodial@cs.indiana.edu
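The spin Scott describes can be reproduced with a minimal model of the waiter (this is an illustration of the synchronization mistake, not the module's actual internals): once a single Event is set by the first completion, every subsequent wait() returns immediately instead of blocking, so the polling loop degenerates into a busy loop.

```python
import threading
import time

event = threading.Event()

# The first future "completes": a FIRST_COMPLETED-style waiter sets
# the event -- and nothing ever clears it again.
event.set()

# A loop that keeps wait()ing on the same event no longer blocks:
# every call returns immediately because the flag stays set.
start = time.monotonic()
calls = 0
for _ in range(1000):
    event.wait(timeout=2.0)  # would block up to 2s if the flag were clear
    calls += 1
elapsed = time.monotonic() - start
# All 1000 wait() calls complete almost instantly -- the "spin".

# A primitive that can be re-armed per iteration avoids this,
# e.g. clearing the event once its notification has been consumed:
event.clear()
blocked = not event.wait(timeout=0.01)  # now it genuinely blocks again
```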

participants (5)
-
Antoine Pitrou
-
Guido van Rossum
-
Jason Orendorff
-
Kristján Valur Jónsson
-
Scott Dial