Will multithreading make python less popular?

sturlamolden sturlamolden at yahoo.no
Tue Feb 17 10:50:33 EST 2009


On 16 Feb, 10:34, rushen... at gmail.com wrote:

> And the story begins here. As i search on the net,  I have found that
> because of the natural characteristics of python such as GIL, we are
> not able to write multi threaded programs. Oooops, in a kind of time
> with lots of cpu cores and we are not able to write multi threaded
> programs.

The GIL does not prevent multithreaded programs. If it did, why does
Python have a "threading" module?

The GIL prevents one use of threads: running pure Python code in
parallel on several cores. You can still do parallel processing using
processes. Just import "multiprocessing" instead of "threading". The
two modules have fairly similar APIs. And you can still use threads to
run tasks in the background, for example while waiting on blocking
I/O.
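
A minimal sketch of how similar the two APIs are. The worker
count_down and the helper run_workers are hypothetical names made up
for this illustration; only threading.Thread and
multiprocessing.Process come from the standard library. Both classes
accept target= and args= and provide start() and join(), so the same
helper can drive either one:

```python
import multiprocessing
import threading


def count_down(n):
    """CPU-bound busy work in pure Python (hypothetical workload)."""
    while n > 0:
        n -= 1


def run_workers(worker_cls, n_workers=2, n=100_000):
    """Start n_workers workers of the given class and wait for them all.

    worker_cls may be threading.Thread or multiprocessing.Process.
    Returns the number of workers that were started.
    """
    workers = [worker_cls(target=count_down, args=(n,))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return len(workers)


if __name__ == "__main__":
    # Threads: the pure-Python loops take turns holding the GIL.
    run_workers(threading.Thread)
    # Processes: each has its own interpreter and its own GIL.
    run_workers(multiprocessing.Process)
```

The __main__ guard matters for the process variant on platforms that
start child processes by launching a fresh interpreter (Windows, for
one), because the child re-imports the main module.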

The GIL, by the way, is an implementation detail of CPython. Nobody
likes it very much. But it makes it easier to extend Python with C
libraries (Python's raison d'etre), since not all C libraries are
thread-safe. The GIL is also used to synchronize access to reference
counts. In fact, Ruby is finally going to get a GIL as well. So it
can't be that bad.

As for parallel processing and multicore processors:

1. Even if a Python script can only exploit one core, we are always
running more than one process on the computer. For some reason this
obvious fact has to be repeated.

2. Parallel processing implies "need for speed". We already pay a
speed penalty of up to 200x from Python alone. The "need for speed" is
better served by moving computational bottlenecks to C or Fortran. And
in this case, the GIL does not prevent us from doing parallel
processing, because an extension can release it while it runs. The GIL
only affects the Python portion of the code.

3. Threads were not designed to be an abstraction for parallel
processing. For this they are awkward, tedious, and error prone.
Current threading APIs were designed for asynchronous tasks. Back in
the 1990s when multithreading became popular, SMPs were a luxury only
a few could afford, and multicore processors were unheard of.

4. The easy way to do parallel processing is not threads but OpenMP,
MPI, or HPF. Threads are used internally by OpenMP and HPF, but those
implementation details are taken care of by the compiler. Parallel
computers have been used by scientists and engineers for three decades
now, and threads have never been found to be a useful abstraction for
manual coding. Unfortunately, this knowledge has not been passed on from
physicists and engineers to the majority of computer programmers.
Today, there is a whole generation of misguided computer scientists
thinking that threads must be the way to use the new multicore
processors. Take a lesson from those actually experienced with
parallel computers and learn OpenMP!

5. If you still insist on parallel processing with Python threads --
ignoring what you can do with multiprocessing and native C/Fortran
extensions -- you can still do that as well. Just compile your script
with Cython or Pyrex and release the GIL manually. The drawback is
that you cannot touch any Python objects (only C objects) in your GIL-
free blocks. But after all, the GIL is used to synchronize reference
counting, so you would have to synchronize access to the Python
objects anyway.


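# Cython source (.pyx): "with nogil" is Cython syntax, not plain Python.
import threading

def _threadproc():
    with nogil:
        # We do not hold the GIL here, so only C-level operations
        # (no Python objects) are allowed in this block.
        pass
    # Now we have got the GIL back.
    return

def foobar():
    # Run _threadproc in a worker thread and wait for it to finish.
    t = threading.Thread(target=_threadproc)
    t.start()
    t.join()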
That's it.
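
Point 2 above can also be seen without writing any C yourself. A
rough sketch, assuming the behaviour described in the hashlib
documentation (the GIL is released while hashing data larger than
2047 bytes); the helper hash_buffers and the buffer sizes are made up
for this illustration. The threads are plain Python threads, but the
heavy work happens inside C code that has let go of the GIL:

```python
import hashlib
import threading


def hash_buffers(buffers):
    """Hash each buffer in its own thread; return hex digests in order."""
    digests = [None] * len(buffers)

    def worker(i, data):
        # The C implementation behind hashlib releases the GIL for
        # large inputs, so these calls may overlap on separate cores.
        # Each thread writes to its own slot, so no lock is needed.
        digests[i] = hashlib.sha256(data).hexdigest()

    threads = [threading.Thread(target=worker, args=(i, buf))
               for i, buf in enumerate(buffers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return digests


if __name__ == "__main__":
    # Hypothetical workload: four 8 MB buffers of repeated bytes.
    data = [bytes([i]) * (8 * 1024 * 1024) for i in range(4)]
    for digest in hash_buffers(data):
        print(digest)
```

Whether this actually runs faster than a sequential loop depends on
the machine and the buffer sizes; the point is only that the GIL does
not serialize the C portion of the work.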

Sturla Molden