Reference counting garbage collection

Skip Montanaro skip at pobox.com
Wed Aug 22 06:55:16 EDT 2001


    Paul> Is there any particular reason Python uses reference counting
    Paul> garbage collection (which leaks memory if there's circular
    Paul> structure) instead of Lisp-style garbage collection?  As Python
    Paul> gets used for more large application development, it gets
    Paul> troublesome to make the programmer worry about whether the data
    Paul> has cycles.

As Simon Brunning pointed out already, Python since 2.0 uses reference
counting for the usual stuff and has a garbage collector just for the cyclic
stuff (lists, tuples, dicts and maybe a couple other object types -
extension module programmers can hook into the gc system or not, as their
needs dictate).  I believe the main reason Python doesn't use garbage
collection for the whole kit-n-kaboodle is that garbage collection
introduces non-determinism in the actual time an object gets reclaimed.  A
common argument used is that of memory vs. much more limited resources like
file descriptors.  Common usage like

    for f in glob.glob("*.c"):
        file = open(f)
        do_fun_stuff(file)

could easily run out of file descriptors for a large directory before a gc
pass was required.  Obviously, the programmer can do something like
"file.close()" or "del file" to force file closure, but that just shifts the
problem of reclamation from the system to the programmer and may actually
introduce bugs.  If I changed the above loop to

    for f in glob.glob("*.c"):
        file = open(f)
        do_fun_stuff(file)
        file.close()

there's no guarantee that the program is actually done with the file object
(do_fun_stuff may sometimes tuck away a reference to the file in another data
structure based upon what it reads), so closing it before its reference
count reaches zero would be a bug.
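
To make that failure concrete, here is a minimal sketch of the scenario just
described. Everything in it is made up for illustration: the "stash" list
stands in for whatever data structure do_fun_stuff might tuck the file into,
and the temporary file exists only so the snippet runs on its own.

```python
import os
import tempfile

stash = []  # hypothetical place where do_fun_stuff tucks references away

def do_fun_stuff(f):
    # Keep the file object around for later, as the text imagines.
    stash.append(f)
    return f.readline()

# Create a small throwaway file to read.
fd, path = tempfile.mkstemp(suffix=".c")
with os.fdopen(fd, "w") as out:
    out.write("int main(void) { return 0; }\n/* second line */\n")

f = open(path)
first_line = do_fun_stuff(f)
f.close()  # the caller thinks it is done with the file

try:
    stash[0].readline()  # the stashed reference is still in use elsewhere
    error = None
except ValueError as exc:  # reading a closed file raises ValueError
    error = exc

os.remove(path)
print(first_line.strip(), "|", error)
```

The stashed reference outlives the explicit close(), so the later read fails
even though the object itself is still very much alive.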

Of course, programming in a manner that relied on such obscure semantics
would itself be error-prone, so this example is perhaps only really useful
to demonstrate why gc is a tool of the devil and should be avoided whenever
possible. ;-)
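
For anyone who wants to see the split mentioned at the top (reference
counting for ordinary objects, a separate collector for cycles), the
following is a rough sketch, assuming CPython and its standard gc and weakref
modules. The Node class is hypothetical, and the automatic collector is
switched off only so it cannot run in the middle of the demonstration.

```python
import gc
import weakref

class Node(object):
    """Hypothetical container; instances can refer to each other."""
    def __init__(self):
        self.other = None

gc.disable()  # keep the cycle collector from running on its own

# No cycle: dropping the last reference reclaims the object right away.
a = Node()
a_alive = weakref.ref(a)
del a
freed_by_refcount = a_alive() is None

# Cycle: the two objects keep each other's reference counts above zero.
b, c = Node(), Node()
b.other, c.other = c, b
b_alive = weakref.ref(b)
del b, c
still_alive_after_del = b_alive() is not None

gc.collect()  # an explicit pass finds and frees the unreachable pair
freed_by_gc = b_alive() is None

gc.enable()
print(freed_by_refcount, still_alive_after_del, freed_by_gc)
```

On CPython all three flags should come out true: the lone object goes away
the moment its count hits zero, while the cyclic pair waits for a collector
pass.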

choose-your-poison-ly y'rs,

-- 
Skip Montanaro (skip at pobox.com)
http://www.mojam.com/
http://www.musi-cal.com/



