[Python-Dev] Some dull gc stats

Tim Peters tim.one@comcast.net
Mon, 01 Jul 2002 01:45:20 -0400


I checked in a surprisingly large patch to change the way we collect a
generation.

Before:  Things directly reachable from outside the young generation were
moved into a 'reachable' set in one pass.  Things indirectly reachable from
outside the young generation were moved into 'reachable' in a second pass.
The 'young' list then contained the unreachable objects.

After:  Things unreachable from outside the young generation are moved into
an 'unreachable' set in one pass.  The 'young' list contains the reachable
objects when that's done.

The point was that almost everything is reachable in the end, and moving an
object between lists costs six pointer stores (updating prev and next
pointers in the object, and in each of the two lists).  So if most stuff is
doomed to be reachable in the end, better to move the unreachable stuff than
to move the reachable stuff.

This seems to be a nice little win.  If you want to know more, read the
comments.  An instrumented version showed this over a run of the Python test
suite:

scanned   7437363
moved       36854
movedback   34389

where

    scanned
        # of times the loop in move_unreachable() went around == # of
        objects
    moved
        # of times "if (gc->gc.gc_refs == 0)" in move_unreachable()
        triggered == # of objects moved into an unreachable set
    movedback
        # of times "if (gc_refs == GC_TENTATIVELY_UNREACHABLE)"
        in visit_reachable() triggered == the number of times
        move_unreachable() guessed wrong and an object had to be moved
        back into a reachable set

So the change saved about 7e6 object moves here, for 6x as many pointer
stores, and gc is finding very little that's unreachable in the end.

Surprisingly, the worst (for some technical meaning of "worst" I'll leave to
your imagination) stats I've seen came out of running Zope3's test suite:

scanned    649444
moved       56124
movedback   43576

It's surprising for lots of reasons, including how relatively little work gc
is doing in total, and how relatively much was found to be unreachable
(about 12 thousand objects).  The latter is surprising because Zope code has
traditionally tried like heck not to create cycles.

The Python test suite *almost* gave "the best" (most favorable to the
change) stats I've seen.  Only a variant of Kevin Jacobs's little test case
looked better so far:

scanned    12322200
moved           244
movedback       244

It would be nicer if we could drive scanned there down to 0 <wink>.