[Python-Dev] "Fixing" the new GIL

Peter Portante peter.a.portante at gmail.com
Mon Apr 12 14:48:25 CEST 2010


And why the for(;;) loop in bfs_schedule()? I don't see a code path that
would loop there. Perhaps I am missing it ...

-peter


On 4/12/10 8:37 AM, "Peter Portante" <peter.a.portante at gmail.com> wrote:

> Hmm, so I see in bfs_yield():
> 
> +    if (tstate != NULL && bfs_thread_switch == tstate) {
> +        COND_RESET(tstate->bfs_cond);
> +        COND_WAIT(tstate->bfs_cond, bfs_mutex);
> +    }
> 
> So, to me, the above code says, "Wait for the condition that tstate is either
> NULL, or bfs_thread_switch does not equal tstate". So the predicate is:
> "(tstate != NULL && bfs_thread_switch == tstate)".
> 
> If the predicate can be True before you call COND_WAIT() and still True after
> COND_WAIT() returns, then either you don't need to call COND_WAIT() at all, or
> you need to loop until the predicate is False. There is no guarantee that a
> condition wait actually did anything at all. Yes, there will be spurious
> wakeups, but you can't tell whether the thread actually blocked and then woke
> up, or never blocked at all. If it never actually blocks, then that code path
> is not helpful.
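> 
> For reference, a minimal sketch of the canonical shape, written with raw
> pthreads and a generic predicate rather than the patch's COND_* macros
> (should_block() stands in for the predicate, so this shows the pattern only,
> not a drop-in fix for bfs_yield()):
> 
>     pthread_mutex_lock(&mutex);
>     /* loop while the blocking condition holds, e.g.
>        (tstate != NULL && bfs_thread_switch == tstate) */
>     while (should_block()) {
>         /* atomically releases the mutex, blocks, and reacquires the
>            mutex before returning; may also return spuriously */
>         pthread_cond_wait(&cond, &mutex);
>     }
>     /* here the blocking condition is False and the mutex is still held */
>     pthread_mutex_unlock(&mutex);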
> 
> On Windows, before this loop in bfs_schedule():
> 
> +        COND_RESET(tstate->bfs_cond);
> +        while (bfs_thread != tstate) {
> +            _bfs_timed_wait(tstate, timestamp);
> +            timestamp = get_timestamp();
> +        }
> 
> You might want to avoid the call to reset the condition variable if the
> predicate is already False.
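> 
> Something along these lines, purely as a sketch against the patch's existing
> macros and helpers (COND_RESET, _bfs_timed_wait and get_timestamp are assumed
> to keep their current signatures):
> 
>     if (bfs_thread != tstate) {
>         /* only reset the condition if we may actually have to block */
>         COND_RESET(tstate->bfs_cond);
>         while (bfs_thread != tstate) {
>             _bfs_timed_wait(tstate, timestamp);
>             timestamp = get_timestamp();
>         }
>     }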
> 
> -peter
> 
> 
> On 4/12/10 8:12 AM, "Nir Aides" <nir at winpdb.org> wrote:
> 
>> Hi Peter,
>> 
>> There is no need for a loop in bfs_yield(). 
>> 
>> 
>> On Mon, Apr 12, 2010 at 4:26 AM, Peter Portante <peter.a.portante at gmail.com>
>> wrote:
>>> Nir,
>>> 
>>> Per the POSIX standard, both pthread_cond_wait() and
>>> pthread_cond_timedwait() need to be performed in a loop.  See the fourth
>>> paragraph of the description from:
>>> 
>>>> http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_timedwait.html
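>>> 
>>> The pattern that page describes, condensed into a sketch with raw pthreads
>>> (needs <pthread.h>, <time.h> and <errno.h>; predicate_holds() and the
>>> one-second deadline are only illustrative):
>>> 
>>>     struct timespec deadline;
>>>     clock_gettime(CLOCK_REALTIME, &deadline); /* timedwait takes an absolute time */
>>>     deadline.tv_sec += 1;
>>> 
>>>     pthread_mutex_lock(&mutex);
>>>     int err = 0;
>>>     while (!predicate_holds() && err != ETIMEDOUT) {
>>>         /* may return 0 on a spurious wakeup, hence the predicate re-check */
>>>         err = pthread_cond_timedwait(&cond, &mutex, &deadline);
>>>     }
>>>     /* either the predicate became true or the deadline expired;
>>>        in both cases the mutex is held again here */
>>>     pthread_mutex_unlock(&mutex);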
>>> 
>>> For the Windows side, I think you have a similar problem. Condition
>>> variables are signaling mechanisms, and so they have a separate boolean
>>> predicate associated with them. If you release the mutex that protects the
>>> predicate, then after you reacquire the mutex, you have to reevaluate the
>>> predicate to ensure that the condition has actually been met.
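>>> 
>>> For what it's worth, on Windows Vista and later the native condition
>>> variables make the same predicate loop straightforward; this is only an
>>> illustration of the Win32 pattern, not a claim about what the patch does
>>> or should rely on:
>>> 
>>>     CRITICAL_SECTION cs;    /* InitializeCriticalSection(&cs) done elsewhere   */
>>>     CONDITION_VARIABLE cv;  /* InitializeConditionVariable(&cv) done elsewhere */
>>> 
>>>     EnterCriticalSection(&cs);
>>>     while (!predicate_holds()) {
>>>         /* atomically leaves cs, waits, and re-enters cs; can wake spuriously */
>>>         SleepConditionVariableCS(&cv, &cs, INFINITE);
>>>     }
>>>     /* predicate holds and cs is held */
>>>     LeaveCriticalSection(&cs);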
>>> 
>>> You might want to look at the following for a discussion (not sure how good
>>> it is, as I just googled it quickly) of how to implement POSIX semantics on
>>> Windows:
>>> 
>>>> http://www.cs.wustl.edu/~schmidt/win32-cv-1.html
>>> 
>>> Before you can evaluate the effectiveness of any of the proposed scheduling
>>> schemes, the fundamental uses of mutexes and condition variables, and their
>>> implementations, must be sound.
>>> 
>>> -peter
>>> 
>>> 
>>> 
>>> On 4/11/10 6:50 PM, "Nir Aides" <nir at winpdb.org> wrote:
>>> wrote:
>>> 
>>>> Hello all,
>>>> 
>>>> I would like to kick this discussion back to life with a simplified
>>>> implementation of the BFS scheduler, designed by the Linux kernel hacker
>>>> Con Kolivas: http://ck.kolivas.org/patches/bfs/sched-BFS.txt
>>>> 
>>>> I submitted bfs.patch at http://bugs.python.org/issue7946. It is work in
>>>> progress but is ready for some opinions.
>>>> 
>>>> On my machine BFS gives comparable performance to gilinter, and seems to
>>>> schedule threads more fairly, predictably, and with a lower rate of context
>>>> switching. Its basic design is very simple, but it was nevertheless designed
>>>> by an expert in this field; those two characteristics combine to make it
>>>> attractive for this case.
>>>> 
>>>> The problem addressed by the GIL has always been *scheduling* threads to
>>>> the interpreter, not just controlling access to it, and therefore the GIL,
>>>> a lock implemented as a simple semaphore, was the wrong solution.
>>>> 
>>>> The patches by Antoine and David essentially evolve the GIL into a
>>>> scheduler; however, both cause thread starvation or a high rate of context
>>>> switching in some scenarios:
>>>> 
>>>> With Florent's write test ( http://bugs.python.org/issue7946#msg101120 ):
>>>> 2 bg threads, 2 cores set to performance, karmic, PyCon patch: context
>>>> switching shoots up to 200K/s.
>>>> 2 bg threads, 1 core set to on-demand, karmic, idle machine: the gilinter
>>>> patch starves one of the bg threads.
>>>> 4 bg threads, 4x1 core Xeon, CentOS 5.3, gilinter patch: all bg threads
>>>> starved, context switching shoots up to 250K/s.
>>>> 
>>>> With the UDP test ( http://bugs.python.org/file16316/udp-iotest.py ) and
>>>> zlib.compress(b'GIL') added to the workload, both the gilinter and PyCon
>>>> patches starve the IO thread.
>>>> 
>>>> The BFS patch currently involves more overhead because it reads the
>>>> timestamp on each yield and schedule operation. In addition, some
>>>> timestamp-related issues remain to be addressed, such as getting different
>>>> readings on different cores on some (older) multi-core systems.
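>>>> 
>>>> As an illustration of one way around the per-core drift, the timestamp
>>>> could come from the OS monotonic clock rather than a raw cycle counter;
>>>> the helper below is hypothetical and does not match the patch's actual
>>>> get_timestamp() (needs <time.h>, and librt on older glibc):
>>>> 
>>>>     static double bfs_monotonic_seconds(void)
>>>>     {
>>>>         struct timespec ts;
>>>>         clock_gettime(CLOCK_MONOTONIC, &ts);
>>>>         return (double)ts.tv_sec + ts.tv_nsec * 1e-9;
>>>>     }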
>>>> 
>>>> Any thoughts?
>>>> 
>>>> Nir
>>>> 
>>>> 
>>>> 
>>>> On Sun, Mar 14, 2010 at 12:46 AM, Antoine Pitrou <solipsis at pitrou.net>
>>>> wrote:
>>>>> 
>>>>> Hello,
>>>>> 
>>>>> As some of you may know, Dave Beazley recently exhibited a situation
>>>>> where the new GIL shows quite poor behaviour (the old GIL isn't very
>>>>> good either, but still a little better). This issue is tracked at
>>>>> http://bugs.python.org/issue7946
>>>>> 
>>>>> This situation arises when an IO-bound thread wants to process a lot of
>>>>> incoming packets while one (or several) CPU-bound threads are also
>>>>> running. Each time the IO-bound thread releases the GIL, the CPU-bound
>>>>> thread gets it and keeps holding it for at least 5 milliseconds
>>>>> (the default setting), which limits the number of individual packets
>>>>> that can be recv()'ed and processed per second.
>>>>> 
>>>>> I have proposed two mechanisms, based on the same idea: IO-bound
>>>>> threads should be able to steal the GIL very quickly, rather than
>>>>> having to wait for the whole "thread switching interval" (again, 5 ms
>>>>> by default). They differ in how they detect an "IO-bound thread":
>>>>> 
>>>>> - the first mechanism is actually the same mechanism which was
>>>>>   embodied in the original new GIL patch before being removed. In this
>>>>>   approach, IO methods (such as socket.read() in socketmodule.c)
>>>>>   releasing the GIL must use a separate C macro when trying to get the
>>>>>   GIL back again.
>>>>> 
>>>>> - the second mechanism dynamically computes the "interactiveness" of a
>>>>>   thread and allows interactive threads to steal the GIL quickly. In
>>>>>   this approach, IO methods don't have to be modified at all.
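>>>>> 
>>>>> As a purely illustrative sketch of the second idea (none of these names
>>>>> or constants come from the actual patch): score a thread's
>>>>> "interactiveness" as an exponentially decayed average of how often it
>>>>> gives the GIL up well before the switch interval expires, and let
>>>>> high-scoring threads preempt the holder.
>>>>> 
>>>>>     #define INTERACTIVE_DECAY     0.9   /* weight given to history        */
>>>>>     #define INTERACTIVE_THRESHOLD 0.5   /* score needed to steal the GIL  */
>>>>> 
>>>>>     typedef struct {
>>>>>         double interactiveness;         /* decayed score in [0.0, 1.0]    */
>>>>>     } sched_state;
>>>>> 
>>>>>     /* called whenever the thread releases the GIL */
>>>>>     static void note_gil_release(sched_state *s, double held_seconds,
>>>>>                                  double switch_interval)
>>>>>     {
>>>>>         double early = (held_seconds < switch_interval / 2) ? 1.0 : 0.0;
>>>>>         s->interactiveness = INTERACTIVE_DECAY * s->interactiveness
>>>>>                              + (1.0 - INTERACTIVE_DECAY) * early;
>>>>>     }
>>>>> 
>>>>>     static int may_steal_gil(const sched_state *s)
>>>>>     {
>>>>>         return s->interactiveness > INTERACTIVE_THRESHOLD;
>>>>>     }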
>>>>> 
>>>>> Both approaches show similar benchmark results (for the benchmarks
>>>>> that I know of) and basically fix the issue put forward by Dave Beazley.
>>>>> 
>>>>> Any thoughts?
>>>>> 
>>>>> Regards
>>>>> 
>>>>> Antoine.
>>>>> 
>>>>> 
