[Python-3000] Is reference counting still needed?

Tim Peters tim.peters at gmail.com
Wed Apr 19 21:07:50 CEST 2006


[Greg Ewing]
>> Even if it [mark-sweepish gc] did become available, we might not
>> want to use it. In recent times I've come round to the view that,
>> on modern architectures where caching is all-important,
>> refcounting + cyclic garbage collection may well be
>> *better* than mark-and-sweep or some variation thereof.

[Guido]
> Not that I disagree -- do you have specific data or reasoning to back
> this up? I'd love to hear that we were right all the time! :-)

It's always :-) been very easy to see why in simple examples, like:

    s = 0
    for i in xrange(1000000):
        s += i

Current CPython (re)uses just a few int objects, which are almost
certain to remain in L1 cache for the duration.  Pure mark-sweep
waits "until RAM is exhausted", and then crawls over that entire giant
blob of address space at least once more to release it.
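
A minimal sketch of the refcounting half of that argument, assuming
CPython (other implementations may free objects later).  The Node class
and the function name are made up for illustration.  With the cycle
collector switched off, the object is still released the moment its
last reference goes away, so its memory can be handed straight back to
the next allocation:

```python
import gc
import weakref


class Node:
    """Hypothetical stand-in for any short-lived heap object."""


def freed_immediately():
    """Return True if dropping the last reference frees the object at once."""
    gc.disable()  # rule out the cycle collector; only refcounting is left
    try:
        obj = Node()
        ref = weakref.ref(obj)  # a weak reference does not keep obj alive
        del obj                 # refcount reaches zero here (CPython behaviour)
        return ref() is None    # a dead weakref returns None when called
    finally:
        gc.enable()


if __name__ == "__main__":
    print(freed_immediately())
```

This only shows the prompt release; it does not measure cache behaviour.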

CPython puts enormous pressure on dynamic-memory throughput because
_everything_ comes from dynamic memory, even microscopically
short-lived integers and stack frames.  Many other languages don't,
and it matters.  I noted this in public <wink> in 1999, after Neil
Schemenauer tried replacing all of Python's gc with BDW mark-sweep,
and reported:

    I started with Sam Rushing's patch and modified it for Python
    1.5.2c.  To my surprise, removing the reference counting
    (Py_INCREF, Py_DECREF) actually slowed down Python a lot (over
    two times for pystone).

That thread should still be required reading for anyone thinking of
changing CPython's gc strategy:

    http://mail.python.org/pipermail/python-list/1999-July/0073.html

Note that he got much better performance by leaving refcounting in but
using BDW for cycle collection (and also note the pain required to get
BDW not to erroneously free memory allocated in Tkinter.c).
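
The division of labour in that hybrid can be sketched as follows.  This
is an illustration only, assuming CPython and using its standard gc
module as the cycle collector rather than BDW; the Node class and the
function name are hypothetical.  Refcounting alone never frees two
objects that point at each other, and a separate cycle pass does:

```python
import gc
import weakref


class Node:
    """Hypothetical object that can refer to another Node."""


def cycle_needs_collector():
    """Return (alive after del, alive after an explicit cycle collection)."""
    gc.disable()  # no automatic cycle collection during the experiment
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a      # build a two-object reference cycle
        ref = weakref.ref(a)
        del a, b                     # refcounts stay at 1 because of the cycle
        alive_after_del = ref() is not None
        gc.collect()                 # explicit pass by the cycle collector
        alive_after_collect = ref() is not None
        return alive_after_del, alive_after_collect
    finally:
        gc.enable()


if __name__ == "__main__":
    print(cycle_needs_collector())
```

On CPython this is expected to report that the pair survives the del
and is gone after the explicit collection.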

OTOH, I haven't had bandwidth in years to pay any attention to what
other language implementations are doing for gc, and working from
ignorance of non-Python practice isn't quite as crushingly convincing
as I'd like :-)

