[CentralOH] Fwd: Re: Stackless Python
brian.costlow at gmail.com
Mon Nov 2 15:37:03 CET 2009
Just for my own education, I did a little further digging this weekend to
get a better understanding of why the GIL prevents Python from running
across multi-core/multi-processor systems.
I had a pretty accurate, but high-level, understanding of how the GIL
works. I also thought that, aside from the GIL, Python's thread
implementation was pretty much: let the kernel/OS deal with scheduling,
priorities, context switches, etc. So although I read a brief comment on a
board from a well-known Pythonista that it 'won't' run across multi-cores,
that didn't make sense to me. If it just uses the OS's underlying thread
model, why won't it run across multi-cores when you have I/O-bound threads,
or when your thread is C extension code that releases the lock while it does
stuff outside the interpreter?
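To make the I/O-bound case concrete, here is a minimal sketch. It assumes time.sleep() is a fair stand-in for a blocking I/O call (it releases the GIL while it waits), and the name run_sleepers is made up for illustration. If the sleeping threads overlap, the total wall time should be close to one delay rather than the sum of all of them.

```python
import threading
import time


def run_sleepers(n=4, delay=0.2):
    """Start n threads that each block for `delay` seconds; return wall time."""

    def fake_io():
        # time.sleep() releases the GIL while blocked, much like a
        # blocking read or recv would.
        time.sleep(delay)

    threads = [threading.Thread(target=fake_io) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_sleepers()
    print("4 sleepers of 0.2s each finished in %.2fs" % elapsed)
```

This only shows that blocked threads don't hold each other up. It says nothing yet about what happens once one of the threads is CPU-bound.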
So before I stuck my foot in my mouth again, I did some more searching.
As it turns out, Python does try to use more than one core/processor, but
the GIL implementation can cause a terrible race condition.
A quick, simplified summary:
The thread holding the GIL releases it every 100 ticks (a tick can be
thought of loosely as 1-6 Python bytecode instructions).
The OS then gets the opportunity to context switch to another thread. However,
the scheduler usually doesn't switch at every opportunity, so the running
thread will simply reacquire the GIL. On a single processor, when the OS
does context switch, the running thread is stopped, and another one is woken
up and given control.
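For anyone who wants to look at that interval on their own interpreter, here is a small sketch. Older interpreters (the ones this summary describes) expose the tick count as sys.getcheckinterval(), which defaults to 100. Python 3.2 and later replaced it with a time-based sys.getswitchinterval(). The helper name gil_release_setting is hypothetical, just for illustration.

```python
import sys


def gil_release_setting():
    """Return (unit, value) for how often the running thread offers up the GIL."""
    if hasattr(sys, "getswitchinterval"):
        # Python 3.2+: the interval is a duration in seconds.
        return ("seconds", sys.getswitchinterval())
    # Older interpreters: the interval is counted in ticks.
    return ("ticks", sys.getcheckinterval())


if __name__ == "__main__":
    unit, value = gil_release_setting()
    print("GIL release interval: %s %s" % (value, unit))
```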
On a multiple core/processor system, say we have threads 1 and 2 running
on different processors. Thread 1 holds the GIL and is processor-bound,
while thread 2 is doing some I/O. Now thread 2 needs access to the Python
interpreter again, so it waits until it can acquire the GIL. Thread 1
releases the GIL. The OS starts to wake up thread 2, but thread 1 is still
running and also tries to reacquire the GIL. Thread 1 usually wins, because
of the overhead of waking up thread 2. (In the links below, in one case,
thread 2 attempted to get the lock 1400 times before it was successful.)
All those failed wakeups are the extra work that results in the slowdown.
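Here is a toy sketch of that handoff. It uses an ordinary threading.Lock as a stand-in for the GIL (it is not the real GIL, and it does not reproduce the interpreter's signalling), and the names handoff_race and fake_gil are made up. The "CPU-bound" thread releases the lock and immediately grabs it again, while a second thread sits waiting for it. The function counts how many times the first thread wins before the waiter finally gets in. The count depends on the machine, the OS scheduler, and the Python version, so treat it as an illustration only.

```python
import threading
import time


def handoff_race(max_tries=2000):
    """Count how often the releasing thread re-takes the lock before a waiter gets it."""
    fake_gil = threading.Lock()        # stand-in for the GIL, not the real thing
    waiter_ready = threading.Event()
    waiter_got_lock = threading.Event()

    def waiter():
        # Plays thread 2: back from I/O and needing the lock.
        waiter_ready.set()
        with fake_gil:
            waiter_got_lock.set()

    fake_gil.acquire()                 # thread 1 starts out holding the lock
    t = threading.Thread(target=waiter)
    t.start()
    waiter_ready.wait()
    time.sleep(0.01)                   # give the waiter time to block on the lock

    wins = 0
    for _ in range(max_tries):
        fake_gil.release()             # "every 100 ticks": offer the lock
        fake_gil.acquire()             # ...and immediately try to take it back
        if waiter_got_lock.is_set():
            break                      # the waiter got in during that window
        wins += 1

    fake_gil.release()
    t.join()
    return wins


if __name__ == "__main__":
    print("thread 1 re-took the lock %d times first" % handoff_race())
```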
See this great talk by David Beazley:
Accompanying slides (PDF):
Long but interesting thread on the Python concurrency SIG. It gets really
interesting and overlaps/references Dave's talk when the subject line
changes to "Inside the Python GIL."