[Python-Dev] Fwd: Removal of GIL through refcounting removal.

Jesse Noller jnoller at gmail.com
Thu Oct 30 18:02:18 CET 2008

On Thu, Oct 30, 2008 at 12:13 PM, Sigurd Torkel Meldgaard
<stm at daimi.au.dk> wrote:
> Hi to all Python developers,
> For a student project in a course on virtual machines, we are
> evaluating the possibility of experimenting with removing the GIL
> from CPython.
> We have read the arguments against doing this at
> http://www.python.org/doc/faq/library/#can-t-we-get-rid-of-the-global-interpreter-lock.
> But we think it might be possible to do this with a different approach
> than what has been tried so far.
> The main reason for the necessity of the GIL is reference counting.
> We believe that most of the slowdown in the free threading
> implementation of Greg Stein was due to the need for atomic
> refcounting, as this mail seems to confirm:
> http://mail.python.org/pipermail/python-ideas/2007-April/000414.html
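
[Editorial illustration] The point about refcounting can be made concrete with a small sketch. It assumes CPython (sys.getrefcount is not available on every implementation), and the helper name refcount_deltas is made up for this example. It only shows that merely sharing an object writes to that object's reference count, which is the field that would need atomic updates or a lock if threads ran without the GIL.

```python
import sys


def refcount_deltas(n):
    """Return how the refcount of one object moves when n references
    to it are created and then dropped (hypothetical helper, CPython only)."""
    obj = object()                      # fresh object, not an immortal singleton
    before = sys.getrefcount(obj)

    holders = [obj] * n                 # the list stores n new references to obj
    after_share = sys.getrefcount(obj)

    del holders                         # dropping the list releases all n of them
    after_drop = sys.getrefcount(obj)

    return after_share - before, after_drop - before


if __name__ == "__main__":
    up, down = refcount_deltas(5)
    print("change after sharing:", up)
    print("change after dropping:", down)
```

No value inside the object is modified here, yet its header is written on every share and release. That write traffic is what the proposal hopes to eliminate.
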
> So we want to change CPython to use a "real" garbage collector,
> removing all reference counting, so that the need for locks (or
> atomic inc/dec ops) should be greatly reduced.
> Preferably the GC should be a high-performance one, for instance a
> generational one.
> We believe that it can run quite a lot faster than ref-counting.
> Shared data structures would, of course, each get their own lock.
> Immutable objects (especially shared global objects, like True, False, None)
> would not.
> Most of the interpreter structure would be per-thread, at that point.
> We do not know how Greg Stein did his locking in the free threading
> patch, but as a part of the course we learned that there exist much faster
> ways of locking than using OS locks (faster for the uncontended case),
> which are used in e.g. the HotSpot Java VM. This might make
> "free threading" in Python more attractive than some pessimists think.
> (http://blogs.sun.com/dave/entry/biased_locking_in_hotspot)
> In particular, we are talking about making the uncontended case go fast,
> not about the independent part of stack-allocating the mutex
> structure, which can only be done and is only needed in Java.
> These ideas are similar to the ones used by Linux fast mutexes
> (futexes), the implementation of mutexes in NPTL.
> We have read this mail thread, so it seems that our idea has surfaced
> before, but Greg didn't completely love it (he wanted to optimize
> refcounting instead):
> http://mail.python.org/pipermail/python-ideas/2007-April/000436.html
> He was not totally negative, however. His main objections are about:
> - cache locality (he is in our opinion partially right, as shown in a
> paper some time ago: any GC, a copying GC in particular, doubles the
> amount of memory used, so it is less cache-friendly). But GCs are still
> overall competitive with or faster than explicit management, and surely
> much faster than refcounting.
> We know it is the plan for PyPy to work in this way, and also that
> Jython and IronPython work like that (using the host VM's GC), so it
> seems to be reasonably compatible with Python semantics (perhaps not
> really with __del__, but those are not really nice anyway).
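
[Editorial illustration] The remark about __del__ can be illustrated with a minimal sketch, again assuming CPython. The class Tracked and the function finalization_demo are hypothetical names introduced only for this example. It shows the behaviour that code may silently depend on today: with reference counting, a finalizer runs as soon as the last reference disappears, while objects in a reference cycle wait for the cycle collector. Under a purely tracing GC, every object would behave like the second case.

```python
import gc


class Tracked:
    """Records its name in a shared log when it is finalized."""

    def __init__(self, log, name):
        self.log = log
        self.name = name
        self.partner = None

    def __del__(self):
        self.log.append(self.name)


def finalization_demo():
    log = []
    was_enabled = gc.isenabled()
    gc.disable()                  # keep the automatic cycle collector out of the way
    try:
        a = Tracked(log, "plain")
        del a                     # count reaches zero: CPython runs __del__ right here
        after_del = list(log)

        x = Tracked(log, "cycle-x")
        y = Tracked(log, "cycle-y")
        x.partner, y.partner = y, x
        del x, y                  # the cycle keeps both counts above zero
        after_cycle_del = list(log)

        gc.collect()              # the cycle collector finds and finalizes the pair
        after_collect = sorted(log)
    finally:
        if was_enabled:
            gc.enable()
    return after_del, after_cycle_del, after_collect


if __name__ == "__main__":
    for step in finalization_demo():
        print(step)
```

Jython, IronPython and PyPy make no promise about when the first finalizer runs, so programs that rely on that immediate call are already not portable across implementations.
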
> Was this ever tried for CPython?
> Any other comments, encouragements or warnings on the project-idea?
> Best regards: Paolo, Sigurd
