[Python-Dev] Re: [Patches] Reference cycle collection for Python

Wed, 01 Mar 2000 00:44:10 -0500

[I don't like to cross-post to patches and python-dev, but I think
this belongs in patches because it's a followup to Neil's post there
and also in -dev because of its longer-term importance.]

Thanks for the new patches, Neil!

We had a visitor here at CNRI today, Eric Tiedemann
<est@hyperreal.org>, who had a look at your patches before.  Eric
knows his way around the Scheme, Lisp and GC literature, and presented
a variant on your approach which takes the bite out of the recursive
passes.

Eric had commented earlier on Neil's previous code, and I had used the
morning to make myself familiar with Neil's code.  This was relatively
easy because Neil's code is very clear.

Today, Eric proposed to do away with Neil's hash table altogether --
as long as we're wasting memory, we might as well add 3 fields to each
container object rather than allocating the same amount in a separate
hash table.  Eric expects that this will run faster, although this
obviously needs to be tried.

Container types are: dict, list, tuple, class, instance; plus
potentially user-defined container types such as kjbuckets.  I have a
feeling that function objects should also be considered container
types, because of the cycle involving globals.

Eric's algorithm, then, consists of the following parts.

Each container object has three new fields: gc_next, gc_prev, and
gc_refs.  (Eric calls the gc_refs "refcount-zero".)

We color objects white (initial), gray (root), black (scanned root).
(The terms are explained later; we believe we don't actually need bits
in the objects to store the color; see later.)

All container objects are chained together in a doubly-linked list --
this is the same as Neil's code except Neil does it only for dicts.
(Eric postulates that you need a list header.)

When GC is activated, all objects are colored white; we make a pass
over the entire list and set gc_refs equal to the refcount for each
object.

Next, we make another pass over the list to collect the internal
references.  Internal references are (just like in Neil's version)
references from other container types.  In Neil's version, this was
recursive; in Eric's version, we don't need recursion, since the list
already contains all containers.  So we simple visit the containers in
the list in turn, and for each one we go over all the objects it
references and subtract one from *its* gc_refs field.  (Eric left out
the little detail that we ened to be able to distinguish between
container and non-container objects amongst those references; this can
be a flag bit in the type field.)

Now, similar to Neil's version, all objects for which gc_refs == 0
have only internal references, and are potential garbage; all objects
for which gc_refs > 0 are "roots".  These have references to them from
other places, e.g. from globals or stack frames in the Python virtual
machine.

We now start a second list, to which we will move all roots.  The way
to do this is to go over the first list again and to move each object
that has gc_refs > 0 to the second list.  Objects placed on the second
list in this phase are considered colored gray (roots).

Of course, some roots will reference some non-roots, which keeps those
non-roots alive.  We now make a pass over the second list, where for
each object on the second list, we look at every object it references.
If a referenced object is a container and is still in the first list
(colored white) we *append* it to the second list (colored gray).
Because we append, objects thus added to the second list will
eventually be considered by this same pass; when we stop finding
objects that sre still white, we stop appending to the second list,
and we will eventually terminate this pass.  Conceptually, objects on
the second list that have been scanned in this pass are colored black
(scanned root); but there is no need to to actually make the
distinction.

(How do we know whether an object pointed to is white (in the first
list) or gray or black (in the second)?  We could use an extra
bitfield, but that's a waste of space.  Better: we could set gc_refs
to a magic value (e.g. 0xffffffff) when we move the object to the
second list.  During the meeting, I proposed to set the back pointer
to NULL; that might work too but I think the gc_refs field is more
elegant.  We could even just test for a non-zero gc_refs field; the
roots moved to the second list initially all have a non-zero gc_refs
field already, and for the objects with a zero gc_refs field we could
indeed set it to something arbitrary.)

Once we reach the end of the second list, all objects still left in
the first list are garbage.  We can destroy them in a similar to the
way Neil does this in his code.  Neil calls PyDict_Clear on the
dictionaries, and ignores the rest.  Under Neils assumption that all
cycles (that he detects) involve dictionaries, that is sufficient.  In
our case, we may need a type-specific "clear" function for containers
in the type object.

We discussed more things, but not as thoroughly.  Eric & Eric stressed
the importance of making excellent statistics available about the rate
of garbage collection -- probably as data structures that Python code
can read rather than debugging print statements.  Eric T also sketched
an incremental version of the algorithm, usable for real-time
applications.  This involved keeping the gc_refs field ("external"
reference counts) up-to-date at all times, which would require two
different versions of the INCREF/DECREF macros: one for
adding/deleting a reference from a container, and another for
adding/deleting a root reference.  Also, a 4th color (red) was added,
to distinguish between scanned roots and scanned non-roots.  We
decided not to work this out in more detail because the overhead cost
appeared to be much higher than for the previous algorithm; instead,
we recommed that for real-time requirements the whole GC is disabled
(there should be run-time controls for this, not just compile-time).
We also briefly discussed possibilities for generational schemes.

The general opinion was that we should first implement and test the
algorithm as sketched above, and then changes or extensions could be
made.

I was pleasantly surprised to find Neil's code in my inbox when we
came out of the meeting; I think it would be worthwhile to compare and
contrast the two approaches.  (Hm, maybe there's a paper in it?)

The rest of the afternoon was spent discussing continuations,
coroutines and generators, and the fundamental reason why
continuations are so hard (the C stack getting in the way everywhere).
But that's a topic for another mail, maybe.

--Guido van Rossum (home page: http://www.python.org/~guido/)