global interpreter lock not working as it should

Fri Aug 2 08:23:09 EDT 2002

On 2/8/2002 10:55, in article
ddc19db7.0208020155.637f446e at posting.google.com, "Armin Steinhoff"
<a-steinhoff at web.de> wrote:

> No ... when I follow up the discussion, it seems to be obviously that
> most Python experts are NOT operating system experts!

Even if this were true, it would hardly matter. We aren't discussing the OS,
we're discussing thread behaviour within Python.

[I'd warn you, though, that you're treading on dangerously thin ice.]

> No ... _Python_ in common _isn't_ threadsafe.
> 
> The GIL is used to create a critical section for the byte code
> processing of the interpreter, because every interpreter thread
> creates its own interpreter instance. These instances are working
> concurrently.  (The critical section takes 10 byte codes ... even if
> there are critical actions or not, but this number is adjustable and
> should be minimized if we talk about real-time behavior)
> 
> The documentation is a little bit confusing regarding blocking I/Os.
> It makes no sense to use e.g. time.sleep() in a critical section
> spawned by the GIL ... this will kill the real-time behavior of the
> other interpreter threads.

To quote from the Python source of 'time.sleep()':

[...]
        t.tv_usec = (long)(frac*1000000.0);
        Py_BEGIN_ALLOW_THREADS
        if (select(0, (fd_set *)0, (fd_set *)0, (fd_set *)0, &t) != 0) {
#ifdef EINTR
                if (errno != EINTR) {
#else
                if (1) {
#endif
                        Py_BLOCK_THREADS
                        PyErr_SetFromErrno(PyExc_IOError);
                        return -1;
                }
        }
        Py_END_ALLOW_THREADS
#elif defined(macintosh)
[...]

The thing to note is the Py_*_THREADS macros, which release and then
reacquire the GIL around the sleep (which actually uses select() for
subsecond sleeps). All threadsafe blocking calls in the interpreter are
similarly bracketed.
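
A quick way to see this from the Python side is to start several threads that
all sleep at once. This is only a minimal sketch written for a current
Python 3 interpreter, not for the 2002 interpreter under discussion; the names
sleeper and run_sleepers are mine and purely illustrative. If time.sleep()
held the GIL, the sleeps would have to run one after another.

```python
import threading
import time


def sleeper(seconds):
    # time.sleep() releases the GIL while blocked, so other threads can run.
    time.sleep(seconds)


def run_sleepers(count=4, seconds=0.5):
    """Start `count` sleeping threads and return the total wall-clock time."""
    threads = [threading.Thread(target=sleeper, args=(seconds,))
               for _ in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_sleepers()
# Serialized sleeps would need about 4 * 0.5 = 2.0 seconds in total.
print("4 threads sleeping 0.5s each took %.2fs" % elapsed)
```

The elapsed time should come out close to a single sleep rather than the sum
of all of them.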

The important thing in Python multi-threading is that you do plenty of
nothing. Python guarantees an upper bound on the number of bytecodes executed
before the threading system has the opportunity to reschedule, but a single
bytecode has unbounded execution cost. So your latency is pretty much an
unknown if the interpreter is on the CPU when you need to respond to I/O.
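
The latency effect can be measured roughly as follows. Again this is a hedged
sketch for a current Python 3, where (as far as I know) the interpreter
switches threads on a time interval, see sys.getswitchinterval(), rather than
on a bytecode count; measure_wakeup_delay and its parameters are hypothetical
names chosen for the example. One thread tries to wake up every millisecond
while the main thread keeps the interpreter busy, and we record how late each
wake-up was.

```python
import threading
import time


def measure_wakeup_delay(busy_seconds=0.3, tick=0.001):
    """Return how late each `tick`-second sleep woke up while the main
    thread was busy running Python code."""
    delays = []
    stop = threading.Event()

    def ticker():
        while True:
            before = time.perf_counter()
            time.sleep(tick)  # GIL is released while sleeping
            # Lateness = actual elapsed time minus the requested sleep.
            delays.append(time.perf_counter() - before - tick)
            if stop.is_set():
                break

    worker = threading.Thread(target=ticker)
    worker.start()

    # Keep the interpreter on the CPU; each sum() call is one long
    # operation during which the GIL is not handed over.
    deadline = time.perf_counter() + busy_seconds
    total = 0
    while time.perf_counter() < deadline:
        total += sum(range(1000))

    stop.set()
    worker.join()
    return delays


delays = measure_wakeup_delay()
print("wake-ups: %d, worst lateness: %.4fs" % (len(delays), max(delays)))
```

The worst-case figure varies from run to run and from machine to machine,
which is rather the point: there is no guaranteed bound.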

Hence the standard response that if you have serious CPU work to do, you
should do it in a C extension with the GIL released (and hope that the
underlying threading system has some kind of realtime behaviour).
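
You can see the benefit without writing any C by borrowing an extension that
already does this. The sketch below assumes a current Python 3, whose hashlib
documentation says the GIL is released when hashing data larger than 2047
bytes; PAYLOAD, digest, hash_in_threads and hash_serially are illustrative
names only. It hashes the same large buffer several times, once in threads
and once serially, and prints both timings.

```python
import hashlib
import threading
import time

# 16 MB of data: far above the size at which hashlib releases the GIL.
PAYLOAD = b"x" * (16 * 1024 * 1024)


def digest(results, index):
    # The hashing runs in C with the GIL released.
    results[index] = hashlib.sha256(PAYLOAD).hexdigest()


def hash_in_threads(count=4):
    results = [None] * count
    threads = [threading.Thread(target=digest, args=(results, i))
               for i in range(count)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start


def hash_serially(count=4):
    start = time.perf_counter()
    results = [hashlib.sha256(PAYLOAD).hexdigest() for _ in range(count)]
    return results, time.perf_counter() - start


threaded, t_threads = hash_in_threads()
serial, t_serial = hash_serially()
print("threads: %.2fs  serial: %.2fs" % (t_threads, t_serial))
```

On a multi-core machine the threaded run will usually be faster, but how much
depends on the hardware and on the underlying threading system, so no
particular speedup is promised here.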

However, except for non-threadsafe system(/library) calls, the interpreter
will *not* block while holding the GIL.

>> so the GIL isn't released.  That doesn't mean that
>> each Python thread isn't an OS-level thread.
> 
> This doesn't say nothing ...
> 
> Oh yes ... I have to correct me: Python threads _are_ system level
> threads!!
> (Every new interpreter thread does its own evaluation of a Python
> object by PyEval_CallObjectWith....)

It said exactly what it said (and confusingly you appear to have
acknowledged the point): Python threads are system level threads. Indeed if
Python threads weren't system level threads then the GIL would be
unnecessary, as the interpreter would then have complete control over
context switching. As it doesn't, and is at the mercy of the threading
system, the GIL is required to protect the non-threadsafe interpreter
objects.

The only alternative to a big lock is a lot of little ones. This improves
latency, but reduces overall performance - it's a well known tradeoff. Since
Python is mostly not used in realtime systems, the correct decision has been
made.

Stackless Python (as I think someone else mentioned) is probably a much
better solution for highly threaded soft-realtime. Amusingly, Stackless is
probably a better solution precisely because its context switching
*doesn't* use OS threads ;-)

Jonathan