Efficient Threading

Marko Rauhamaa marko at pacujo.net
Sun Nov 16 10:16:14 CET 2014

Dan Stromberg <drsalists at gmail.com>:

> On Fri, Nov 14, 2014 at 10:42 AM, Empty Account <emptya45 at gmail.com> wrote:
>> I am thinking about writing a load test tool in Python, so I am
>> interested in how I can create the most concurrent threads/processes
>> with the fewest OS resources. I would imagine that I/O would need to
>> be non-blocking.
> If you need a large amount of concurrency, you might look at Jython
> with threads.  Jython threads well.
> If you don't intend to do more than a few hundred concurrent things,
> you might just go with CPython and multiprocessing.

It is very rare to need a "large amount of concurrency," and most
hardware can't deliver it anyway. The number of CPU cores (plus
hyperthreads) is a hard limit on how many things actually execute at the
same time. Also, the I/O throughput will almost certainly be more
limiting than the CPU.

What I'm getting at is that it is generally not a good idea to represent
a large number of simultaneous operations with an equal number of
threads or processes. While the simplicity of that idea is enticing, it
often leads to an expensive refactoring years down the road (been there,
done that).

Instead, address the true concurrency needs with a group/pool of
processes or threads, and represent your simultaneous contexts with
objects that you map onto the processes or threads. If your application
does not involve obnoxious, blocking library calls (e.g., database
access), you might achieve top throughput with a single process (no
threads).
Java had to reinvent their whole stdlib I/O paradigm to address the
scalability problems of the naive zillion-thread approach (NIO). Python
is undergoing a similar transformation (asyncio), although it has always
provided low-level facilities for "doing the right thing."

To summarize, this is how I implement these kinds of applications:

 1. If all I/O is nonblocking (and Linux's blocking file I/O doesn't get
    in the way), I implement the application single-threaded. In Python,
    I use select.epoll with edge-triggered registration (EPOLLET) and
    callbacks. Python's new asyncio framework is a portable, funky way
    to implement the same idea.

 2. If I must deal with blocking I/O calls, I set up a pool of
    processes. The size of the pool is calculated from several factors:
    the number of CPU cores, network latencies and the server

 3. I generally much prefer processes over threads because
    they provide for better fault-tolerance.
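
As a rough illustration of point 1, the following sketch runs a
single-threaded event loop that dispatches to callbacks. It is not the
code I actually use: it substitutes selectors.DefaultSelector for
select.epoll so that it runs on any platform (on Linux that selector is
backed by epoll, but level-triggered rather than EPOLLET), and it uses
local socket pairs in place of real network connections. The names
serve_with_callbacks() and on_readable() are made up for the example.

```python
import selectors
import socket


def serve_with_callbacks(messages):
    """Deliver each (non-empty) message over its own socket pair, in one thread."""
    sel = selectors.DefaultSelector()
    received = {}
    writers = []

    def on_readable(sock, index):
        # Invoked by the loop only once data is ready, so recv() won't block.
        received[index] = sock.recv(4096)
        sel.unregister(sock)
        sock.close()

    for index, message in enumerate(messages):
        writer, reader = socket.socketpair()
        reader.setblocking(False)
        # Attach the callback and its context to the registration.
        sel.register(reader, selectors.EVENT_READ, (on_readable, index))
        writer.sendall(message)  # small messages fit in the socket buffer
        writers.append(writer)

    # The event loop: wait for readiness, then call the registered callback.
    while len(received) < len(messages):
        for key, _events in sel.select(timeout=1.0):
            callback, index = key.data
            callback(key.fileobj, index)

    for writer in writers:
        writer.close()
    sel.close()
    return [received[i] for i in range(len(messages))]


if __name__ == "__main__":
    print(serve_with_callbacks([b"ping", b"pong"]))
```

Every connection here is just a registration plus a callback, so the
number of simultaneous contexts is bounded by file descriptors rather
than by threads.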

