[Python-3000] Delayed reference counting idea

Adam Olsen rhamph at gmail.com
Mon Sep 18 17:48:56 CEST 2006


I think all the attempts to expose GIL-less semantics to Python code
miss the point.  Reference counting turns every reference, even a
plain read, into a modification of the object's refcount.  You can't
avoid the GIL without first changing reference counting.
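
A quick way to see the "references are modifications" point from
Python itself is the small sketch below.  It assumes a regular CPython
build; sys.getrefcount is a real function, the variable names are only
for illustration.  Binding a second name looks like a read, yet the
refcount stored in the object changes:

```python
import sys

obj = object()

# getrefcount() reports one extra reference for its own argument,
# so only the difference between two calls is meaningful.
before = sys.getrefcount(obj)

alias = obj          # looks like a read, but INCREFs the object

after = sys.getrefcount(obj)
delta = after - before
print(delta)
```

On a regular CPython build delta should come out as 1: the object's
memory was written to even though nothing "changed" at the Python
level.  With several threads doing this to the same object, that write
is what needs protecting.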

There are a few ways to approach this:
* Atomic INCREF/DECREF using CPU instructions.  This would be very
expensive, considering how often we do it.
* Bolt-on tracing GC such as Boehm-Demers-Weiser.  Its conservative
scanning is totally unsupported by the C standards, and it changes the
cache characteristics CPython has been designed around for years,
likely with a very large performance penalty.
* Tracing GC within C.  Would require rewriting every API in CPython,
as well as the code that uses them.  Alternative implementations
(PyPy, et al) can try this, but I think it's clear that it's not worth
the effort for CPython, especially given the performance risks.
* Delayed reference counting (save 10 or 20 INCREF/DECREF ops to a
buffer, then flush them all at once).  In theory, it would retain the
cache locality while amortizing the locking needed on SMP machines.

For the most part delayed reference counting should require no
changes to existing code, since it would use the existing
INCREF/DECREF API.  Some code does circumvent that API, and would need
to be changed.
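
To make the buffering idea concrete, here is a toy model in Python
rather than C.  It is only a sketch under stated assumptions, not how
CPython does or would do it: Obj, incref, decref, flush and
FLUSH_THRESHOLD are all hypothetical names, and a single global lock
stands in for whatever synchronization a real implementation would
use.  Each thread appends ops to its own buffer and only takes the
lock when the buffer fills up:

```python
import threading

FLUSH_THRESHOLD = 16  # hypothetical batch size, in the "10 or 20" range


class Obj:
    """Stand-in for a PyObject: just a shared refcount and a freed flag."""

    def __init__(self):
        self.refcnt = 1      # the creator holds one reference
        self.freed = False


_refcount_lock = threading.Lock()  # guards every Obj.refcnt
_local = threading.local()         # holds each thread's pending ops


def _pending():
    """Return the calling thread's op buffer, creating it on first use."""
    buf = getattr(_local, "ops", None)
    if buf is None:
        buf = _local.ops = []
    return buf


def flush():
    """Apply this thread's buffered ops under a single lock acquisition."""
    buf = _pending()
    with _refcount_lock:
        for obj, delta in buf:
            obj.refcnt += delta
        # Check for zero only after the whole batch has been applied.
        for obj, _ in buf:
            if obj.refcnt == 0:
                obj.freed = True
    buf.clear()


def _record(obj, delta):
    buf = _pending()
    buf.append((obj, delta))     # no lock taken on the common path
    if len(buf) >= FLUSH_THRESHOLD:
        flush()


def incref(obj):
    _record(obj, +1)


def decref(obj):
    _record(obj, -1)


def worker(obj, n):
    for _ in range(n):
        incref(obj)
        decref(obj)
    flush()  # drain whatever is left before the thread exits


shared = Obj()
threads = [threading.Thread(target=worker, args=(shared, 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_refcnt = shared.refcnt   # back to the creator's single reference

decref(shared)                 # the creator drops its reference
flush()
freed_after_release = shared.freed
```

The lock is taken once per FLUSH_THRESHOLD ops instead of once per
op, which is the amortization.  The sketch deliberately ignores the
hard part: a thread's buffered INCREF is invisible to other threads
until it flushes, so a real implementation has to make sure nobody
frees an object while such an op is still pending.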

Anyway, my point is that, for those of you out there who want to
remove the GIL, here is something you really can experiment with.
Even if there were a 20% performance drop on real-world tests you could
still make it a configure option, enabled only for people who need
many CPUs.  (I've tried it myself, but never got past the weird
crashes.  Probably missed something silly).

-- 
Adam Olsen, aka Rhamphoryncus

