[Python-Dev] Tackling circular dependencies in 2.0?

M.-A. Lemburg mal@lemburg.com
Tue, 21 Sep 1999 11:05:05 +0200

Tim Peters wrote:
> [Skip Montanaro]
> > I think in Python 2.0 it would be nice to have some way to
> > reclaim circular dependencies without the programmer explicitly
> > having to do something ...
> This was debated (again) at great length on c.l.py just a few months ago.
> Guido chimed in with a proposal to keep track of only the dicts that have
> been allocated, and now and again mark everything reachable from the root
> set and nuke whatever dicts don't end up marked.  Cycles involving dicts
> would get reclaimed this way, but not cycles not involving dicts.  The
> approach to destructors for objects in cycles was "tough -- they don't get
> called".  What to do about destructors for objects that are not themselves
> involved in cycles but are reachable only from dead cycles (so are in fact
> dead too) wasn't addressed.  Seemed possible that stuff reachable from
> ordinary dicts (not in a cycle, and neither reachable from a cycle) would
> behave differently than today, since the "list of all dicts" may keep the
> stuff artificially alive until the next mark+sweep, even if the refcount on
> the stuff fell to zero; there's probably an OK way around that, though.

You could probably tackle the problem by doing local mark&sweep whenever
the ref count on a dictionary falls down to 1 (meaning that it is only
referenced from the list of all dicts). This is what I do in mxProxy's
weak reference implementation and to my surprise it solved all those
strange situations where objects are kept alive longer than they
would have normally.
> Anyway, Guido was aiming for the minimal changes that could possibly do real
> good.  It didn't pretend to reclaim all cycles, and was (IMO) too eager to
> punt on the hard issues (the combo of cycles, destructors and resurrection
> is a god-awful mess, even in theory; Scheme uses callbacks to dump the
> problems back on the users Java has incredibily elaborate rules that are
> both bulletproof and unusable; the Boehm collector lets objects with
> destructors that are in cycles simply leak, rather than do a wrong thing;
> Stroustrup has flip-flopped and most recently argued for Guido's "reclaim
> the memory but don't call the destructors" approach, but a member of the C++
> committee told me he's overwhelmingly opposed on this one (I know I would
> oppose it)).

Not calling the destructor will cause leakage in all objects
allocating extra storage, such as lists, instances and
probably just about any dynamically sized object there is in
Python... solving the problem only half way. Plus you will
definitely run into trouble as soon as external resources
are involved, e.g. open files or connections to databases.

Perhaps we should give more power to the user instead of trying
to give him fuzzy feelings about what's happening underneath
the hood. Builtin weak references or other indirect ways
of accessing objects (e.g. by giving unique names to the
involved objects) can solve many of those circ. ref. problems.

BTW, I usually use an instrumented Python interpreter to track
down circular references: it uses a tracing hook in the 
allocation/deallocation code of Python instances which is used
when Python is run in debugging mode (python -d). The hook
calls a function sys.traceinstances (if present) which allows
me to keep a track record of all allocated instances:

def traceinstances(action,inst):

    """ Tracing hook. This is called whenever an instances is created
        and destroyed. action is either 'create' or 'delete'; inst
	points to the instance object.

If anyone is interested I can post the patch (against Python 1.5).

Marc-Andre Lemburg
Y2000:                                                   101 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/