[Python-Dev] "Fixing" the new GIL

Nir Aides nir at winpdb.org
Mon Apr 12 14:12:37 CEST 2010


Hi Peter,

There is no need for a loop in bfs_yield().


On Mon, Apr 12, 2010 at 4:26 AM, Peter Portante
<peter.a.portante at gmail.com>wrote:

>  Nir,
>
> Per the POSIX standard, both pthread_cond_wait() and
> pthread_cond_timedwait() need to be performed in a loop.  See the fourth
> paragraph of the description from:
>
>
> http://www.opengroup.org/onlinepubs/000095399/functions/pthread_cond_timedwait.html
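>
> For concreteness, here is a minimal sketch of the canonical pattern the
> standard describes (names here are illustrative, not taken from any of
> the patches): the predicate must be re-checked after every wakeup,
> because both calls may return spuriously.
>
>     #include <errno.h>
>     #include <pthread.h>
>     #include <time.h>
>
>     static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
>     static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
>     static int ready = 0;           /* boolean predicate protected by mutex */
>
>     void wait_until_ready(const struct timespec *deadline)
>     {
>         pthread_mutex_lock(&mutex);
>         while (!ready) {            /* re-evaluate the predicate after each wakeup */
>             int err = pthread_cond_timedwait(&cond, &mutex, deadline);
>             if (err == ETIMEDOUT)
>                 break;              /* gave up waiting; predicate may still be false */
>         }
>         pthread_mutex_unlock(&mutex);
>     }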
>
>
> For the Windows side, I think you have a similar problem. Condition
> variables are signaling mechanisms, and so they have a separate boolean
> predicate associated with them. If you release the mutex that protects the
> predicate, then after you reacquire the mutex, you have to reevaluate the
> predicate to ensure that the condition has actually been met.
>
> You might want to look at the following for a discussion (not sure how good
> it is, as I just google’d it quickly) of how to implement POSIX semantics on
> Windows:
>
> http://www.cs.wustl.edu/~schmidt/win32-cv-1.html
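>
> For Windows versions that provide native condition variables (Vista and
> later; older versions need an emulation like the one described in that
> article), the same loop-and-recheck discipline applies. A rough sketch,
> again with illustrative names only:
>
>     #include <windows.h>
>
>     static CRITICAL_SECTION   cs;   /* InitializeCriticalSection(&cs) and     */
>     static CONDITION_VARIABLE cv;   /* InitializeConditionVariable(&cv) first */
>     static int ready = 0;           /* boolean predicate protected by cs      */
>
>     void wait_until_ready(DWORD timeout_ms)
>     {
>         EnterCriticalSection(&cs);
>         while (!ready) {            /* re-evaluate the predicate after each wakeup */
>             if (!SleepConditionVariableCS(&cv, &cs, timeout_ms))
>                 break;              /* timed out (or failed); predicate may still be false */
>         }
>         LeaveCriticalSection(&cs);
>     }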
>
>
> Before you can evaluate the effectiveness of any of the proposed scheduling
> schemes, the fundamental uses of mutexes and condition variables, and their
> implementations, must be sound.
>
> -peter
>
>
>
> On 4/11/10 6:50 PM, "Nir Aides" <nir at winpdb.org> wrote:
>
> Hello all,
>
> I would like to kick this discussion back to life with a simplified
> implementation of the BFS scheduler, designed by the Linux kernel hacker Con
> Kolivas: http://ck.kolivas.org/patches/bfs/sched-BFS.txt
>
> I submitted bfs.patch at http://bugs.python.org/issue7946. It is a work in
> progress but is ready for some feedback.
>
> On my machine BFS gives performance comparable to gilinter, and seems to
> schedule threads more fairly, predictably, and with a lower rate of context
> switching. Its basic design is very simple, yet it was designed by an expert
> in this field; these two characteristics combine to make it attractive for
> this case.
>
> The problem addressed by the GIL has always been *scheduling* threads onto
> the interpreter, not just controlling access to it, and therefore the GIL, a
> lock implemented as a simple semaphore, was the wrong solution.
>
> The patches by Antoine and David essentially evolve the GIL into a
> scheduler; however, both cause thread starvation or a high rate of context
> switching in some scenarios:
>
> With Florent's write test (http://bugs.python.org/issue7946#msg101120):
> - 2 bg threads, 2 cores set to performance, karmic, PyCon patch: context
>   switching shoots up to 200K/s.
> - 2 bg threads, 1 core set to on-demand, karmic, idle machine: the gilinter
>   patch starves one of the bg threads.
> - 4 bg threads, 4x1 core Xeon, CentOS 5.3, gilinter patch: all bg threads
>   starved, context switching shoots up to 250K/s.
>
> With the UDP test (http://bugs.python.org/file16316/udp-iotest.py), with
> zlib.compress(b'GIL') added to the workload, both the gilinter and PyCon
> patches starve the IO thread.
>
> The BFS patch currently involves more overhead because it reads a timestamp
> on each yield and schedule operation. In addition, some timestamp-related
> issues remain to be addressed, such as getting different timestamp readings
> on different cores on some (older) multi-core systems.
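>
> One way to side-step such drift (sketched here only as an illustration
> of the general idea, not as what bfs.patch actually does) is to read a
> single monotonic clock instead of a per-core counter:
>
>     #include <time.h>
>
>     /* A monotonic clock is not affected by per-core counter drift or by
>        wall-clock adjustments. */
>     static double now_seconds(void)
>     {
>         struct timespec ts;
>         clock_gettime(CLOCK_MONOTONIC, &ts);
>         return ts.tv_sec + ts.tv_nsec * 1e-9;
>     }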
>
> Any thoughts?
>
> Nir
>
>
>
> On Sun, Mar 14, 2010 at 12:46 AM, Antoine Pitrou <solipsis at pitrou.net>
> wrote:
>
>
> Hello,
>
> As some of you may know, Dave Beazley recently exhibited a situation
> where the new GIL shows quite poor behaviour (the old GIL isn't very
> good either, but still a little better). This issue is tracked at
> http://bugs.python.org/issue7946
>
> This situation is when an IO-bound thread wants to process a lot of
> incoming packets, while one (or several) CPU-bound threads are also
> running. Each time the IO-bound thread releases the GIL, a CPU-bound
> thread gets it and keeps holding it for at least 5 milliseconds (the
> default setting), which limits the number of individual packets that
> can be recv()'ed and processed per second.
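>
> At the C level the pattern looks roughly like the following (a
> simplified sketch, not the actual socketmodule.c code): every packet
> costs one release/re-acquire cycle, and it is the re-acquire that can
> wait out the whole interval.
>
>     #include <Python.h>
>     #include <sys/socket.h>
>
>     /* Called with the GIL held. */
>     static void handle_packets(int sockfd)
>     {
>         char buf[4096];
>         for (;;) {
>             ssize_t n;
>             Py_BEGIN_ALLOW_THREADS      /* GIL released: a CPU-bound thread grabs it */
>             n = recv(sockfd, buf, sizeof(buf), 0);
>             Py_END_ALLOW_THREADS        /* may block ~5 ms waiting to get the GIL back */
>             if (n <= 0)
>                 break;
>             /* ... process the packet while holding the GIL ... */
>         }
>     }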
>
> I have proposed two mechanisms, based on the same idea: IO-bound
> threads should be able to steal the GIL very quickly, rather than
> having to wait for the whole "thread switching interval" (again, 5 ms
> by default). They differ in how they detect an IO-bound thread:
>
> - the first mechanism is actually the same mechanism which was
>   embodied in the original new GIL patch before being removed. In this
>   approach, IO methods (such as socket.read() in socketmodule.c)
>   releasing the GIL must use a separate C macro when trying to get the
>   GIL back again.
>
> - the second mechanism dynamically computes the "interactiveness" of a
>   thread and allows interactive threads to steal the GIL quickly. In
>   this approach, IO methods don't have to be modified at all (a rough
>   sketch of this kind of bookkeeping follows below).
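>
> As a very rough sketch of the kind of bookkeeping the second mechanism
> involves (the names, formula and threshold here are illustrative only
> and differ from the actual patch):
>
>     /* Per-thread scheduling info (illustrative). */
>     typedef struct {
>         double interactiveness;     /* decayed ratio of voluntary GIL releases */
>     } thread_sched_info;
>
>     /* Called whenever a thread releases the GIL: voluntary (blocking on IO)
>        releases raise the score, forced (timeout) releases lower it. */
>     static void note_gil_release(thread_sched_info *t, int voluntary)
>     {
>         t->interactiveness = 0.9 * t->interactiveness
>                              + 0.1 * (voluntary ? 1.0 : 0.0);
>     }
>
>     /* A sufficiently interactive thread may take the GIL right away instead
>        of waiting out the full switching interval. */
>     static int may_steal_gil(const thread_sched_info *t)
>     {
>         return t->interactiveness > 0.8;
>     }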
>
> Both approaches show similar benchmark results (for the benchmarks
> that I know of) and basically fix the issue put forward by Dave Beazley.
>
> Any thoughts?
>
> Regards
>
> Antoine.
>
>

