Reference counting garbage collection

Delaney, Timothy tdelaney at
Thu Aug 23 02:32:13 CEST 2001

>     Paul> Is there any particular reason Python uses 
> reference counting
>     Paul> garbage collection (which leaks memory if there's circular
>     Paul> structure) instead of Lisp-style garbage 
> collection?  As Python

> needs dictate).  I believe the main reason Python doesn't use garbage
> collection for the whole kit-n-kaboodle is that garbage collection
> introduces non-determinism in the actual time an object gets 
> reclaimed.  A

I believe another major reason is that adding "real" GC would cause problems
for C extensions. All Python's cycle collector does is break the links
forming the cycle - the reference counting semantics then handle the rest.

Java has a "real" GC - and this often leads to problems - in particular,
programs being relatively fast until a memory threshold is reached - and
then the GC starts kicking in and things really start slowing down in a
non-deterministic way. Python never suffers from that problem - whilst a
large data structure may need to be collected all at once, you at least
*know* that that is what is going to happen, and it will happen the same way
with every run of the program (assuming the same inputs).

In tests people have done by replacing Python's reference counting by
Boehm's collector (reported to this newsgroup) there have been marked speed
advantages to reference counting in most cases. This is empirical evidence
that reference counting works well *for Python* in most cases.

There will always be tradeoffs. For example, compare the two equivalent

for i in xrange(10000):
	s = str(i) + str(i)

for (var i = 0; i < 10000; i++)
	var s = Integer.toString(i) + Integer.toString(i);

In the case of the Python program, the memory use is very low - the same
memory locations will be reused many times.

For the Java program, lots of garbage will build up, probably eventually
being released in one big go at the end.

The Python loop may (possibly) run slower, but the cost is amortised over

Oh - and if you want "real" GC in Python, simply use Jython. I'm using it
right now for prototyping a Swing application - it's truly wonderful.

Tim Delaney

More information about the Python-list mailing list