global interpreter lock not working as it should

brueckd at tbye.com
Tue Jul 30 19:18:11 EDT 2002


On Tue, 30 Jul 2002, anton wilson wrote:

> Maybe I am not being clear enough. I am concerned with a multi-threaded 
> program that does not do any form of blocking on a Linux/Unix box.

Quick tangential question: if there's no blocking of any kind, why are you
using threads, anyway? Off the cuff, this seems like a misuse of them.

> I have proven this by even examining what happens within the interpreter with 
> this code in ceval:
> 
> 
>                                 oldt = tstate;    /*<---- My code*/
> 
>                                 if (PyThreadState_Swap(NULL) != tstate)
>                                         Py_FatalError("ceval: tstate mix-up");
> 
> 
>                                 PyThread_release_lock(interpreter_lock);
> 
>                                 /* Other threads may run now */
>                                 /*sched_yield();*/
> 
>                                 PyThread_acquire_lock(interpreter_lock, 1);

Note that control does not necessarily return to this thread at this 
point! An attempt is made to acquire the lock - sometimes the same thread 
will get it, sometimes not, and whether or not it does is entirely up to 
the platform's thread library.
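
If it helps, here's a minimal standalone sketch of that situation using a
plain pthreads mutex. This is illustrative only - nothing below is taken
from ceval.c, and the names (waiter, other_got_it) are made up:

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
    static long other_got_it = 0;

    /* This thread only makes progress when it wins the mutex. */
    static void *waiter(void *arg)
    {
        (void)arg;
        for (;;) {
            pthread_mutex_lock(&lock);
            other_got_it++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t;
        long i;

        pthread_mutex_lock(&lock);
        pthread_create(&t, NULL, waiter, NULL);

        for (i = 0; i < 100000; i++) {
            pthread_mutex_unlock(&lock);  /* "other threads may run now" */
            pthread_mutex_lock(&lock);    /* ...but we usually win it right back */
        }

        printf("waiter got the lock %ld times out of 100000\n", other_got_it);
        return 0;
    }

On most platforms the main thread wins the lock back nearly every time,
because unlocking a mutex doesn't force a context switch - which is exactly
what you're seeing with the GIL.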

> The great majority of the time, my print statement will be printed, meaning 
> the GIL was not released.

No! This "proves" that the GIL was released and _the same thread was
allowed to reacquire it immediately_. The difference between my statement 
and yours is where your confusion lies.

> >  but the put-it-up-for-grabs-every-10-instructions functionality
> > works just fine too. Consider:

[snip my program that shows lack of thread starvation]

> There are several reasons why your program seems to work.
> The first obvious reason is that the main thread sleeps. If you remove the 
> sleep, you will see output that looks like this
> 
> [0, 0, 0]
> 
> ....(100+ times in all)...
> 
> [35499, 16419, 0]
> 
> ...(100+ times in all) ....
> 
> [35499, 16419, 11556]
> 
> This proves that the GIL does not block very often, and definitely not every 
> 10 byte codes. Think about this for a while.

<sigh> No, it doesn't "prove" anything about the GIL. Without the print
statements, the main thread is spinning like crazy - the other threads
probably aren't even getting _started_ right away. There are (at least)  
two different mechanisms at work here: the GIL-releasing code and your
OS's (or thread library's) thread scheduler. Even though the GIL gets
released, your OS/thread library may choose NOT to switch thread contexts. 
IOW, your main thread still may have lots of time left in its timeslice, 
and so the thread library is saying, "hey, keep running!". 

Note that if you try the above experiment in C (start 3 threads and 
immediately have main thread spin and print out wildly) you'll get very 
similar output (lots of [0,0,0]'s printed at the beginning, etc). In fact, 
I'd really recommend that you do this, because once you're familiar with 
how it all works in C then there's really no surprises with how it works 
in Python - IMO you've confused yourself by zeroing in on the GIL when in 
reality it's quite transparent.
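
For what it's worth, here's a rough sketch of that C experiment (the names
and loop counts are just made up for illustration):

    #include <pthread.h>
    #include <stdio.h>

    static volatile long counts[3];

    /* Each worker just increments its own counter as fast as it can. */
    static void *worker(void *arg)
    {
        volatile long *count = arg;
        for (;;)
            (*count)++;
        return NULL;
    }

    int main(void)
    {
        pthread_t threads[3];
        int i;

        for (i = 0; i < 3; i++)
            pthread_create(&threads[i], NULL, worker, (void *)&counts[i]);

        /* Spin and print "wildly" right away, just like the Python version. */
        for (i = 0; i < 500; i++)
            printf("[%ld, %ld, %ld]\n", counts[0], counts[1], counts[2]);

        return 0;
    }

Expect a burst of [0, 0, 0] lines at the start before the counters move,
simply because the scheduler hasn't switched away from the main thread yet -
and there's no GIL anywhere in sight.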

If anything, the program output is "proving" that your threading library
is behaving efficiently - because there is no reason to force a context
switch (e.g. blocking I/O), it is being sensible and avoiding the
expensive switch by letting each thread use as much of its timeslice as
possible. During each timeslice the GIL is probably being released and
reacquired oodles of times, and this is correct behavior.

> So, the GIL does not block as intended, and this probably needs to be looked 
> into.

Just curious: how do you explain all the multithreaded Python programs 
that currently work just fine? Are these all flukes?

-Dave




