Reference counting garbage collection

Tim Peters tim.one at home.com
Thu Aug 23 03:25:11 CEST 2001


[Paul Rubin]
> Is there any particular reason Python uses reference counting garbage
> collection

+ 100% portable out of the box, even to bizarre platforms.

+ Simple M&S is slower given Python's extremely high dynamic memory
throughput (*everything* in Python is heap-allocated, from execution frames
to ints, and RC is cache-friendly by nature, eager to reuse a piece of
memory ASAP; just picture "sum += i" in a loop, while keeping
"*everything's* heap-allocated" in mind).

+ Sharing data with random C, Fortran, Java etc programs is a godawful mess
when GC decides to move things around.  Python never moves objects, and
that's a major blessing when integrating with foreign systems.

> (which leaks memory if there's circular structure)

Recent Pythons collect those.

> instead of Lisp-style garbage collection?

Afraid that doesn't mean anything specific without reference to a specific
Lisp implementation; you seem to have compacting M&S in mind, perhaps
generational or perhaps not, but, regardless, the seminal role Lisp played
in driving GC algorithms means there's virtually nothing that isn't "Lisp-
style" for *some* value of "Lisp".

> As Python gets used for more large application development, it gets
> troublesome to make the programmer worry about whether the data has
> cycles.

Cycles are irrelevant now.  While it *is* a PITA to worry about refcounts,
it would be a bigger PITA to worry about memory moving.

> Also, using a compacting GC helps localize memory references, speeding
> up the program by improving cache performance.

If you dig for solid evidence that this is true in real programs, you may be
surprised at how little you find.  Try this for a starting point on why
deeper analysis suggests life is more complicated than picturing easy worst
cases (notwithstanding that I invited you to picture one above <wink>):

    http://www.hpl.hp.com/personal/Hans_Boehm/gc/complexity.html

> ...
> So maybe the GC implementation should be revisited at some point.

It's been revisited several times by several people.  The only thing the
implemented alternatives to date have consistently had in common is worse
performance.

ain't-nothin'-obvious-about-gc-strategies-ly y'rs  - tim





More information about the Python-list mailing list