[Python-ideas] Composability and concurrent.futures

Matt Joiner anacrolix at gmail.com
Mon May 21 18:17:06 CEST 2012


On Thu, May 17, 2012 at 4:43 AM, Adrian Sampson
<asampson at cs.washington.edu> wrote:

> The concurrent.futures module in the Python standard library has problems
> with composability. If I start a ThreadPoolExecutor to run some library
> functions that internally use ThreadPoolExecutor, I will end up with many
> more worker threads on my system than I expect. For example, if each parallel
> execution wants to take full advantage of an 8-core machine, I could end up
> with as many as 8*8=64 competing worker threads, which could significantly
> hurt performance.
>
> This is because each instance of ThreadPoolExecutor (or
> ProcessPoolExecutor) maintains its own independent worker pool. Especially
> in situations where the goal is to exploit multiple CPUs, it's essential
> for any thread pool implementation to globally manage contention between
> multiple concurrent job schedulers.
>
> I'm not sure about the best way to address this problem, but here's one
> proposal: Add additional executors to the futures library.
> ComposableThreadPoolExecutor and ComposableProcessPoolExecutor would each
> use a *shared* thread-pool model. When created, these composable executors
> will check to see if they are being created within a future worker
> thread/process initiated by another composable executor. If so, the "child"
> executor will forward all submitted jobs to the executor in the parent
> thread/process. Otherwise, it will behave normally, starting up its own
> worker pool.
>
> Has anyone else dealt with composition problems in parallel programs? What
> do you think of this solution -- is there a better way to tackle this
> deficiency?
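
For concreteness, the oversubscription you're describing looks roughly like
this sketch (library_work is a hypothetical library function; the pool sizes
just mirror your 8-core example):

    from concurrent.futures import ThreadPoolExecutor

    def library_work(x):
        # The library internally parallelises with its own 8-worker pool.
        with ThreadPoolExecutor(max_workers=8) as inner:
            return sum(inner.map(lambda i: i * x, range(8)))

    with ThreadPoolExecutor(max_workers=8) as outer:
        # Each of the 8 outer workers spins up its own 8-thread inner pool,
        # so up to 8*8 = 64 worker threads can be competing at once.
        print(list(outer.map(library_work, range(8))))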


It's my understanding that this is a known flaw with concurrency *in
general*. Most multi-threaded and multi-process applications currently assume
they're the only ones running on the system, and the likely implementation of
the composable pools you've proposed would make the same assumption. Handling
this well really requires a proper interprocess scheduler. (See GCD, and
runtime implementations that provide at least some userspace scheduling, such
as Go's, however poor that may be.)

Secondly, composable pools don't handle recursive relationships well. If a
task running in a pool has to wait for other tasks submitted to that same
pool before it can itself complete, you'll have deadlock: the waiting tasks
occupy every worker, so the tasks they're waiting on can never be scheduled.
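
Here's a minimal sketch of that deadlock with a single shared pool (the
two-worker size is just to make it trigger immediately):

    from concurrent.futures import ThreadPoolExecutor

    pool = ThreadPoolExecutor(max_workers=2)

    def parent():
        # parent() occupies a worker while waiting on its children. With both
        # workers busy running parent(), the child tasks are never scheduled,
        # so every result() call below blocks forever.
        children = [pool.submit(lambda: 42) for _ in range(2)]
        return sum(f.result() for f in children)

    futures = [pool.submit(parent) for _ in range(2)]
    print([f.result() for f in futures])  # never returns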

Personally, if I implemented a composable thread pool I'd make it global;
creation and submission of tasks would be proxied to it via some composable
executor class.
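
Something along these lines, give or take (the names are made up, and it
doesn't attempt to solve the recursive-dependency problem above):

    from concurrent.futures import ThreadPoolExecutor

    # One process-wide pool; every "composable" executor forwards work to it.
    _global_pool = ThreadPoolExecutor(max_workers=8)

    class ComposableExecutor:
        """Owns no threads of its own; just proxies submissions to the
        shared global pool."""

        def submit(self, fn, *args, **kwargs):
            return _global_pool.submit(fn, *args, **kwargs)

        def map(self, fn, *iterables):
            return _global_pool.map(fn, *iterables)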

As it stands, thread pools are better suited to task-oriented concurrency
than to parallelism anyway, especially in CPython, where the GIL lets only
one thread execute Python bytecode at a time.
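
If it's CPU-bound parallelism you're after, a process pool sidesteps the GIL,
e.g.:

    from concurrent.futures import ProcessPoolExecutor

    def cpu_bound(n):
        # Pure-Python number crunching; threads would serialise on the GIL,
        # but separate processes can use all cores.
        return sum(i * i for i in range(n))

    if __name__ == "__main__":
        with ProcessPoolExecutor() as pool:
            print(list(pool.map(cpu_bound, [10**6] * 8)))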

In short, I think composable thread pools are a hack at best and won't gain
you anything except a slightly reduced threading overhead. If you want
optimal utilization, threading isn't the right place to be looking.