global interpreter lock not working as it should

Fri Aug 2 11:05:38 EDT 2002

anton wilson <anton.wilson at camotion.com> wrote in message 
[ clip ..]
> 
> Maybe I am not being clear enough. I am concerned with a multi-threaded 
> program that does not do any form of blocking on a Linux/Unix box. I DO 
> expect a thread to block on the GIL every 10 byte codes. However, I have 
> proved with my results that this does NOT happen. Any thread that is 
> completely CPU bound will never give up the CPU for as long as 
> 1) it can run
> 2) it has work to do
> 
> I have proven this by even examining what happens within the interpreter
> with this code in ceval:
> 
>                                 oldt = tstate;    /*<---- My code*/
> 
>                                 if (PyThreadState_Swap(NULL) != tstate)
>                                   Py_FatalError("ceval: tstate mix-up");
>                                 PyThread_release_lock(interpreter_lock);
> 
>                                 /* Other threads may run now */
>                                 /*sched_yield();*/

I see your point. The release_lock / acquire_lock pair only makes sense if
the threads have different priorities. But all Python threads have the
same priority by default ... so it makes perfect sense to include
the sched_yield for fair scheduling.

Without the sched_yield ... a Python thread can hog the CPU
forever (if not preempted by a task with a higher priority).

Well .. the threads don't care about the comment
         /* Other threads may run now */
in ceval.c  :-)
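
The release / re-acquire pattern can be tried out in plain Python. This is only a rough sketch: it uses an ordinary threading.Lock as a stand-in for interpreter_lock, not the interpreter's real lock, and the names run, worker and yield_cpu are made up for the illustration. The counts depend entirely on the OS scheduler and the Python version, so treat them as a hint rather than a measurement.

```python
import os
import threading
import time

# Fall back to a zero-length sleep where os.sched_yield is unavailable.
yield_cpu = getattr(os, "sched_yield", lambda: time.sleep(0))


def run(use_yield, workers=2, duration=0.2):
    """Let several threads compete for one lock; return per-thread acquire counts."""
    lock = threading.Lock()          # stand-in for interpreter_lock (assumption)
    counts = [0] * workers
    stop = threading.Event()

    def worker(i):
        while not stop.is_set():
            lock.acquire()
            counts[i] += 1           # "run some byte codes" while holding the lock
            lock.release()
            # Other threads may run now ... but only if we actually step aside.
            if use_yield:
                yield_cpu()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    time.sleep(duration)
    stop.set()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print("no yield:  ", run(False))
    print("with yield:", run(True))
```

If one thread dominates the "no yield" counts and the "with yield" counts are closer together, that matches the behaviour described above; if not, the scheduler on that machine is already handing the lock around.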

Armin

> 
>                                 PyThread_acquire_lock(interpreter_lock, 1);
> 
>                                 if (PyThreadState_Swap(tstate) != NULL)
>                                         Py_FatalError("ceval: orphan tstate");
> 
>                                 if(tstate == oldt)    /*< ------- my code*/
>                                    printf("bad things have happened\n");
> 
> 
> The great majority of the time, my print statement will be printed,
> meaning the GIL was not released.
> 
> > but the put-it-up-for-grabs-every-10-instructions functionality
> > works just fine too. Consider:
> >
> > import threading, time
> >
> > COUNT = 3
> > counters = [0] * COUNT
> >
> > def Worker(i):
> >     while 1:
> >         counters[i] += 1
> >
> > for i in range(COUNT):
> >     threading.Thread(target=Worker, args=(i,)).start()
> >
> > while 1:
> >     time.sleep(1.0)
> >     print counters
> >
> > Here's some output:
> > [162565, 176016, 165796]
> > [329009, 327856, 333183]
> > [497881, 496857, 498133]
> > [665567, 679094, 643678]
> > [810255, 845521, 811988]
> > [968056, 1008142, 974790]
> >
> > Lo and behold, each thread is getting execution time, and nearly equal
> > execution time at that!
> 
> 
> There are several reasons why your program seems to work.
> The first obvious reason is that the main thread sleeps. If you remove the
> sleep, you will see output that looks like this:
> 
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> [0, 0, 0]
> 
> ....(100+ times in all)...
> 
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> [35499, 16419, 0]
> 
> 
> ...(100+ times in all) ....
> 
> [35499, 16419, 11556]
> [35499, 16419, 11556]
> [35499, 16419, 11556]
> [35499, 16419, 11556]
> [35499, 16419, 11556]
> [35499, 16419, 11556]
> [35499, 16419, 11556]
> 
> 
> .....etc ....
> 
> 
> This proves that the GIL does not block very often, and definitely not every
> 10 byte codes. Think about this for a while.
> 
> I built a Python interpreter with a sched_yield() call between the release
> and acquire calls, and my results looked like this with your sleep removed:
> 
> [1886, 1887, 0]
> [1886, 1887, 0]
> [1886, 1888, 0]
> [1886, 1889, 0]
> [1886, 1890, 0]
> [1886, 1890, 0]
> [1886, 1890, 0]
> [1886, 1891, 0]
> [1886, 1892, 0]
> [1886, 1893, 0]
> [1886, 1893, 0]
> 
> .....etc.......
> 
> Here, you will notice that there are constant changes. The GIL releasing is
> working as intended.
> 
> 
> This brings me to the second reason that your program seems to work.
> The Linux OS gives threads time-slices, and when a time-slice is used up,
> every 150 or so milliseconds, the thread is forcibly removed from the CPU.
> I presume that the reason your program seems to work is that, in the window
> between when a thread releases the GIL and when it tries to reacquire the
> GIL, it is forcibly removed from the CPU, and another thread can now run.
> This would not be a rare occurrence, given the high frequency at which the
> lock is released.
> 
> To prove this, I ran the program using SCHED_RR threads and changed the
> kernel so that round-robin threads had no timeslice. In this case I saw this
> output once per second:
> 
> [0, 0, 0]
> [0, 586126, 0]
> [0, 1194859, 0]
> [0, 1802596, 0]
> [0, 2414027, 0]
> 
> Because the thread is never forced by the OS to relinquish the CPU, the
> thread will never ever lose the GIL. If the GIL were actually working correctly
> and blocking, the second thread would not retain the CPU past 10 byte codes.
> 
> 
> So, the GIL does not block as intended, and this probably needs to be looked
> into.
> 
> Anton
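
The no-sleep variant of the counter test can be rerun as a bounded script. This is a sketch only: sample_counters and changes are names invented here, the main thread takes a fixed number of snapshots instead of printing forever, and the result depends on the interpreter version and the OS scheduler, so the numbers quoted above should not be expected to come back.

```python
import threading


def sample_counters(workers=3, samples=200):
    """Busy main thread snapshots the per-thread counters without ever sleeping."""
    counters = [0] * workers
    stop = threading.Event()

    def worker(i):
        while not stop.is_set():
            counters[i] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    # No sleep between snapshots: the main thread is CPU bound like the workers.
    snapshots = [list(counters) for _ in range(samples)]
    stop.set()
    for t in threads:
        t.join()
    return snapshots


def changes(snapshots):
    """Count how many snapshots differ from the snapshot just before them."""
    return sum(1 for a, b in zip(snapshots, snapshots[1:]) if a != b)


if __name__ == "__main__":
    snaps = sample_counters()
    print(changes(snaps), "of", len(snaps) - 1, "consecutive snapshots differ")
```

Long runs of identical snapshots mean the main thread kept running without any worker getting a turn in between, which is the pattern in the repeated [0, 0, 0] lines.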