[Python-Dev] Cyclic GC issues

Jason Evans jasone at canonware.com
Mon Oct 11 08:59:52 CEST 2004


On Sun, Oct 10, 2004 at 11:36:11PM +0200, "Martin v. L?wis" wrote:
> >1) I originally had lone rings refer to themselves (ob_refcnt started out
> >   at 3; 2 self-references and one reference held by the associated edge).
> 
> I would have thought the refcount should have been 4: two self
> references, plus one from E, plus one from N (maybe I don't understand
> what make a ring "lone", though).

You understand correctly, except that the R-E-R construct starts out not
being attached to any nodes.  The "lone" ring term was my attempt to
differentiate this:

  R

from these:

  R-R  R---R  R-R
        \ /   | |
         R    R-R

> >   Working around this in Crux requires a lot of extra reference counting
> >   complexity, because there are three different cases for reference
> >   counts, depending on how many members there are in a ring (1, 2, or 3+
> >   members).
> 
> Why is that? Just DECREF every PyObject* in "self", then set the
> PyObject* to NULL. No need for any additional reference counts.

The low level tree code is implemented completely separately from the
Python object wrapper code.  This means that, for example, the Python
"Node" object does not actually store PyObject* pointers to Ring objects;
instead it uses the low level tree code to find out what Ring objects are
attached to it.  Crux was designed this way in order to be able to
implement various low level algorithms such that they never have to call
interpreter-related code.  As such, reference breaking code must actually
tear down the low level tree in such a way that it is always "consistent"
(simply setting pointers to NULL doesn't fit well with this way of doing
things).

> Hmmm. Even if a ring member holds only a single reference to another
> ring member, would not calling visit() on this member in turn invoke
> visit() on the other one, which would, in turn, invoke visit() on
> the next one, and so on?

Yes, this is true.  The problem I ran into had to do with the node
logically holding references to all the ring objects, but only reporting
one of them.  That is, the node's tp_traverse function was not reporting
all references, though it was reporting enough of them to guarantee that
all reachable objects were visited.

At this point, it is clear to me that Python's cyclic GC does not do
mark/sweep GC, and that this won't change, so there's probably not much
point in discussing this issue further.

> >3) A strange thing can happen when tp_clear() is called.  Suppose that an
> >   edge object is being cleared, and it drops its references to the
> >   associated rings.  If ob_refcnt of one of the rings drops to 0 as a
> >   result, Python will tp_dealloc() the ring *right* *now*, without ever
> >   calling tp_clear() on the ring.  That means that I have to keep track of
> >   whether tp_clear() has been called on objects, if it is at all important
> >   that tp_clear() be called, so that I can manually do so in tp_dealloc(),
> >   if necessary.  It is in my opinion reasonable to have cleanup code in
> >   tp_clear(), with the assumption that it will be called precisely once,
> >   but Python makes me do extra work to make sure that this happens.
> 
> Your expectation is wrong. tp_clear is not the place to perform cleanup.
> It is the place to break cycles. tp_dealloc is the place to perform
> cleanup.

What if breaking cycles and cleanup are the same thing?  In the case of
Crux, they are one and the same, since the low level tree must be torn down
one piece at a time.

Apparently I was mistaken in viewing tp_clear and tp_dealloc as two stages
of destruction.

> >4) There is no way to control the order in which objects are
> >   tp_dealloc()ed.
> 
> Correct. Unfortunately, there is no way to change that. For cyclic
> structures, there is no natural "starting point". Python picks an
> arbitrary starting point, assuming that the order does not matter.
> If class objects have an __del__, it assumes that the order does
> matter, and gives up - putting the objects into gc.garbage.
> 
> >   This could be addressed in the Python interpreter by paying heed to the
> >   return value of tp_dealloc().  If the return value is non-zero, move the
> >   object to the end of the list of objects to be destroyed, so that
> >   destruction is tried later.  This allows the module to impose its own
> >   destruction ordering.
> 
> I don't understand. tp_dealloc does not have a return value, so what
> does "pay heed" mean?

I should have said tp_clear here.  In Modules/gcmodule.c:delete_garbage(),
the clear() call does not pay attention to the return value.  It would be
possible to check the return value, and defer deallocation if the return
value were non-zero, and ob_refcnt were 1.  (However, it would probably
have to be a separate list of objects, contrary to what I claimed before.)

Thanks for your comments, Martin.  I'm starting to understand the Python
GC world view a bit better.

Jason


More information about the Python-Dev mailing list