Python threading (was: Re: global interpreter lock not working as it should)

Sun Aug 4 09:16:12 EDT 2002

On 4/8/2002 11:14, in article m31y9fj5hp.fsf at mira.informatik.hu-berlin.de,
"Martin v. Loewis" <martin at v.loewis.de> wrote:

> Jonathan Hogg <jonathan at onegoodidea.com> writes:
> 
>> I'm sorry, but that's just plain rubbish. If all the threads have the same
>> priority then the thread scheduler will schedule them according to
>> timeslice. I'm really not sure where you get the idea that a thread can only
>> be pre-empted by a higher priority thread. That's just not true.
> 
> I think it is true. The chance that the time slice expires within the
> few instructions when the GIL is released is so low that you'd
> practically never see a thread switch - yet you do, on many systems.

I've been thinking about this some more and I think I'm ready to eat humble
pie ;-)

Ignoring SCHED_FIFO and SCHED_OTHER, which clearly rely on differing
priorities (static or dynamic) to function, the interesting case is
SCHED_RR. I sat down and thought about this a bit more and I think I figured
out why it might not work with same-priority threads.

Let's say we have two pthreads, A and B, with equal priority, both
CPU-bound, with A on the CPU and B waiting on the GIL-condition. Thread A
executes bytecodes until it is forced to release and reacquire the GIL. In
doing so it signals the GIL-condition and unblocks B. This invokes the
scheduler, but A continues execution because it is still within its
timeslice. When the timeslice expires the timer interrupt fires invoking the
scheduler. The scheduler pre-empts A and starts B running. B returns from
waiting on the GIL-condition and attempts to set the GIL-locked variable.
Unfortunately the GIL is still locked by A so it goes back into the
condition wait and blocks. This will re-invoke the scheduler which will
switch back to A with a *new* timeslice allocation. GOTO 10.
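
To make the shape of that lock concrete, here is a minimal sketch in Python of a "locked" flag guarded by a mutex and a condition, which is the structure the scenario above assumes. The names ToyGIL, run_worker and demo are made up for illustration and this is not the interpreter's actual code. It only shows that a waiter woken by the signal has to re-check the flag and goes straight back to waiting if the other thread has already reacquired; how the two workers actually interleave depends on the platform scheduler.

```python
import threading


class ToyGIL:
    """Hypothetical stand-in for a lock built from a flag plus a condition."""

    def __init__(self):
        self._cond = threading.Condition()  # mutex + condition variable
        self._locked = False                # the "GIL-locked variable"

    def acquire(self):
        with self._cond:
            # A woken waiter must re-check the flag: the signal only says the
            # lock was released at some point, not that it is free right now.
            while self._locked:
                self._cond.wait()
            self._locked = True

    def release(self):
        with self._cond:
            self._locked = False
            self._cond.notify()  # unblock one waiter, if any


def run_worker(gil, name, steps, check_interval, counts):
    """Do `steps` units of work, releasing and reacquiring periodically."""
    gil.acquire()
    for i in range(1, steps + 1):
        counts[name] += 1  # one "bytecode"
        if i % check_interval == 0:
            gil.release()  # a waiter becomes runnable here...
            gil.acquire()  # ...but we usually win the lock back first
    gil.release()


def demo(steps=20000, check_interval=10):
    gil = ToyGIL()
    counts = {"A": 0, "B": 0}
    threads = [
        threading.Thread(target=run_worker,
                         args=(gil, name, steps, check_interval, counts))
        for name in ("A", "B")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    print(demo())
```

Both workers always finish, since the final release lets the waiter through, but nothing in the lock itself forces them to alternate.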

I had it in my head that, when the timer interrupt for the timeslice fires,
thread B wouldn't be runnable because it's waiting on the GIL. But of course
it *is* runnable because it's not waiting on the GIL, it's waiting on a
condition saying that the GIL has been released sometime in the past. If it
was waiting on the GIL then the timer interrupt would do nothing and when A
hit the next release-reacquire point the scheduler would see that A had gone
past the end of its timeslice and could switch to the now-runnable B.

I'm not sure if this is what is happening or not, but it seems plausible. In
which case the problem is, in some respect, that B has been made runnable
before the timeslice for A has expired but can't actually do any work as it
will immediately block again. The thread scheduler has no way of knowing
that B hasn't been given a fair go and so presumes that B has decided to
block voluntarily and that A can get back to work.

Inserting an explicit yield between the release and re-acquisition of the
lock would be a terrible idea as it would defeat the scheduler and hammer
performance. My thought would be that the trick is to not release the GIL
while the running thread still has timeslice. Then when you do release the
GIL, the scheduler would (hopefully) immediately switch to the next waiting
thread.

God knows I'm no pthreads expert, but I know you can query the scheduling
policy of the current thread. So if it's SCHED_RR, can one ask how much
timeslice is left? If you could do that then you could just skip over the
release-reacquire.
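
As a rough sketch of what can be queried, assuming a platform where Python's os module exposes the POSIX scheduling calls (Linux does, others may not): os.sched_getscheduler reports the policy and os.sched_rr_get_interval reports the round-robin quantum. As far as I know the latter gives the length of a full quantum, not how much of the current one is left, so the remaining time would have to be estimated, for example by timing how long the thread has been running since it last acquired the lock. The helpers describe_scheduling and should_release are hypothetical names for illustration only.

```python
import os


def describe_scheduling(pid=0):
    """Return (policy_name, quantum_seconds_or_None); pid 0 means the caller."""
    if not hasattr(os, "sched_getscheduler"):
        return ("unavailable", None)
    try:
        policy = os.sched_getscheduler(pid)
    except OSError:
        return ("unavailable", None)

    # Build a number -> name table from whichever constants this platform has.
    names = {}
    for name in ("SCHED_OTHER", "SCHED_FIFO", "SCHED_RR",
                 "SCHED_BATCH", "SCHED_IDLE"):
        if hasattr(os, name):
            names[getattr(os, name)] = name
    policy_name = names.get(policy, "unknown(%d)" % policy)

    try:
        # Length of one full quantum in seconds, NOT the time remaining.
        quantum = os.sched_rr_get_interval(pid)
    except (AttributeError, OSError):
        quantum = None
    return (policy_name, quantum)


def should_release(elapsed, quantum):
    """Assumed heuristic: skip the release-reacquire until a quantum has passed.

    `elapsed` is the time in seconds since the lock was last acquired, as
    measured by the caller. With no usable quantum, always release.
    """
    if quantum is None or quantum <= 0:
        return True
    return elapsed >= quantum


if __name__ == "__main__":
    print(describe_scheduling())
```

Elapsed time since acquisition is only an approximation of consumed timeslice, so this is a guess at when the scheduler would be willing to switch, not a guarantee.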

Hmmm...

Oh! Apologies to Armin for being overly dismissive before.

Jonathan