Multi-threading on Multi-CPU machines

Alex Martelli aleax at aleax.it
Tue Jul 9 11:28:47 EDT 2002


anton wilson wrote:

> 
>> With a multithreaded approach you might keep the array in memory
>> and have the main thread farm out work requests to worker threads
>> via a bounded queue.  You want the queue a bit larger than the
>> number of worker threads, and you can determine the optimal size
>> for a work request (could be one item, or maybe two, or, say, 4)
>> via some benchmarking.  Upon receiving a work request from the
>> Queue, a worker thread would:
>>         -- get a local copy of the relevant points from the
>>            large array,
>>         -- enter the C-coded computation function which
>>            -- releases the GIL,
>>            -- does the computations getting the new points,
>>            -- acquires the GIL again,
> 
> 
> If the bounded queue were declared in a C extension module, would a thread
> doing the calculations really have to reacquire the GIL every time that
> thread accessed this C data structure? Could mutexes be used instead?

C code talking to other C code, with Python *nowhere* in the picture,
does not need the GIL but may make its own arrangements.  However, it's
hard to see how the Python data placed in the queue would get turned
into C-usable data WITHOUT using some of the Python API -- whenever ANY
use of the Python API is made, the thread making such use must hold the
GIL (of course Python can't _guarantee_ that EVERY such GIL-less use
will crash the program, burn the CPU AND raze the machine room to the
ground, unfortunately, but you should still program AS IF that was
the case).

Given that a C-coded function is called from Python, it IS holding the
GIL when it starts executing -- what it must do is RELEASE the GIL
as soon as it's finished doing calls to the Python API in order to let
other threads use the Python interpreter, then acquire the GIL again
before it can return control to the Python that called it.  There is
no benefit that I can see in duplicating the Queue module in C with
all the attendant locking headaches &c -- moving the loop itself into
C seems to be a tiny, irrelevant speedup anyway.
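
For concreteness, the arrangement quoted at the top (bounded queue, a
handful of worker threads, the main thread farming out requests) might be
sketched roughly as follows.  This is only a sketch under assumptions: it
uses the Python 3 spelling of the module (queue rather than Queue), the
names run, worker and compute and the sizes NUM_WORKERS, QUEUE_SIZE and
CHUNK are made up for illustration, and compute is a pure-Python stand-in
for the C-coded function that would release the GIL around its number
crunching.

```python
import queue
import threading

NUM_WORKERS = 4                 # assumption: tune via benchmarking
QUEUE_SIZE = NUM_WORKERS + 2    # "a bit larger than the number of workers"
CHUNK = 2                       # items per work request, also a guess


def compute(points):
    # Hypothetical stand-in for the C-coded function.  The real one would
    # release the GIL, compute the new points, then reacquire the GIL
    # before returning to Python.
    return [(x * x, y * y) for x, y in points]


def worker(requests, array, results, results_lock):
    while True:
        request = requests.get()
        if request is None:             # sentinel: no more work
            break
        start, stop = request
        local = array[start:stop]       # local copy of the relevant points
        new_points = compute(local)
        with results_lock:              # store results under a lock
            results[start] = new_points


def run(array):
    requests = queue.Queue(maxsize=QUEUE_SIZE)   # bounded queue
    results = {}
    results_lock = threading.Lock()
    threads = [
        threading.Thread(target=worker,
                         args=(requests, array, results, results_lock))
        for _ in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    # Main thread farms out work; put() blocks while the queue is full.
    for start in range(0, len(array), CHUNK):
        requests.put((start, start + CHUNK))
    for _ in threads:                   # one sentinel per worker
        requests.put(None)
    for t in threads:
        t.join()
    # Reassemble the chunks in their original order.
    out = []
    for start in sorted(results):
        out.extend(results[start])
    return out
```

With a pure-Python compute like this one the threads take turns on the
GIL, so nothing runs in parallel; the multi-CPU benefit only shows up once
compute is a C function that drops the GIL while it works.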


Alex



