Tim Peters wrote:
I think in Python 2.0 it would be nice to have some way to reclaim circular dependencies without the programmer explicitly having to do something ...
This was debated (again) at great length on c.l.py just a few months ago. Guido chimed in with a proposal to keep track of only the dicts that have been allocated, and now and again mark everything reachable from the root set and nuke whatever dicts don't end up marked. Cycles involving dicts would get reclaimed this way, but not cycles not involving dicts. The approach to destructors for objects in cycles was "tough -- they don't get called". What to do about destructors for objects that are not themselves involved in cycles but are reachable only from dead cycles (so are in fact dead too) wasn't addressed. It seemed possible that stuff reachable from ordinary dicts (neither in a cycle nor reachable from one) would behave differently than today, since the "list of all dicts" may keep the stuff artificially alive until the next mark+sweep, even if the refcount on the stuff fell to zero; there's probably an OK way around that, though.
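To make the shape of that proposal concrete, here is a toy model of it. This is only a sketch under stated assumptions, not Guido's actual design or any interpreter code: objects are plain names, `refs` is a made-up reference graph, and `all_dicts` stands in for the "list of all dicts". The names `mark` and `sweep` are illustrative.

```python
def mark(roots, refs):
    """Return the set of object names reachable from the root set."""
    marked = set()
    stack = list(roots)
    while stack:
        name = stack.pop()
        if name in marked:
            continue
        marked.add(name)
        stack.extend(refs.get(name, ()))
    return marked

def sweep(all_dicts, marked):
    """Return the tracked dicts that were never marked, i.e. the garbage."""
    return sorted(d for d in all_dicts if d not in marked)

# Toy heap (an assumption for illustration): name -> names it references.
refs = {
    "root": ["d1"],
    "d1": ["x"],
    "x": [],
    "d2": ["d3"],   # d2 <-> d3: a dict cycle unreachable from root
    "d3": ["d2"],
    "l1": ["l2"],   # l1 <-> l2: a cycle of non-dicts
    "l2": ["l1"],
}
all_dicts = {"d1", "d2", "d3"}   # only dicts are tracked

garbage = sweep(all_dicts, mark(["root"], refs))
```

Here `garbage` contains the dict cycle `d2`/`d3`, while the `l1`/`l2` cycle is never even considered because only dicts are on the list. That is the "cycles not involving dicts" limitation mentioned above.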
You could probably tackle the problem by doing a local mark&sweep whenever the ref count on a dictionary drops to 1 (meaning that it is only referenced from the list of all dicts). This is what I do in mxProxy's weak reference implementation, and to my surprise it solved all those strange situations where objects are kept alive longer than they normally would have been.
Anyway, Guido was aiming for the minimal changes that could possibly do real good. It didn't pretend to reclaim all cycles, and was (IMO) too eager to punt on the hard issues (the combo of cycles, destructors and resurrection is a god-awful mess, even in theory; Scheme uses callbacks to dump the problems back on the users; Java has incredibly elaborate rules that are both bulletproof and unusable; the Boehm collector lets objects with destructors that are in cycles simply leak, rather than do a wrong thing; Stroustrup has flip-flopped and most recently argued for Guido's "reclaim the memory but don't call the destructors" approach, but a member of the C++ committee told me he's overwhelmingly opposed on this one (I know I would oppose it)).
Not calling the destructor will cause leakage in all objects allocating extra storage, such as lists, instances and probably just about any dynamically sized object there is in Python... solving the problem only half way. Plus you will definitely run into trouble as soon as external resources are involved, e.g. open files or connections to databases.
Perhaps we should give more power to the user instead of trying to give him fuzzy feelings about what's happening underneath the hood. Builtin weak references or other indirect ways of accessing objects (e.g. by giving unique names to the involved objects) can solve many of those circular reference problems.
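As an illustration of the weak reference idea, here is a minimal sketch using the `weakref` module from the standard library of later Python versions (it is not part of Python 1.5 and is not the mxProxy implementation). The `Node` class and its attribute names are made up for the example.

```python
import weakref

class Node:
    """Tree node whose back-pointer to the parent is a weak reference."""

    def __init__(self, name):
        self.name = name
        self.children = []      # strong references: parent keeps children alive
        self._parent = None     # weak reference: child does not keep parent alive

    @property
    def parent(self):
        # Dereference the weak reference; yields None once the parent is gone.
        return self._parent() if self._parent is not None else None

    def add_child(self, child):
        self.children.append(child)
        child._parent = weakref.ref(self)

root = Node("root")
leaf = Node("leaf")
root.add_child(leaf)
```

Because the child only holds a weak reference back to its parent, the parent/child pair never forms a reference cycle. On a reference-counting implementation such as CPython, dropping the last strong reference to `root` frees it immediately, and `leaf.parent` then returns None.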
BTW, I usually use an instrumented Python interpreter to track down circular references: it adds a tracing hook to the allocation/deallocation code of Python instances, and the hook is active when Python is run in debugging mode (python -d). The hook calls a function sys.traceinstances (if present), which allows me to keep a record of all allocated instances:
""" Tracing hook. This is called whenever an instance is created or destroyed. action is either 'create' or 'delete'; inst points to the instance object. """ ...
If anyone is interested I can post the patch (against Python 1.5).
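For readers who want a feel for what such a hook function might do, here is a hypothetical sketch. It is not the code from the patch: the signature `(action, inst)` is inferred from the docstring above, the bookkeeping in `live` and the helper `report` are my own invention, and a stock interpreter never calls this function, so the calls have to be made by hand to try it out.

```python
# Hypothetical bookkeeping: id(inst) -> class name of instances still alive.
# Only the id and the class name are stored, so the record itself does not
# keep the instance alive.
live = {}

def traceinstances(action, inst):
    """Hypothetical hook body: record creations, forget deletions."""
    key = id(inst)
    if action == 'create':
        live[key] = inst.__class__.__name__
    elif action == 'delete':
        live.pop(key, None)

def report():
    """Return a sorted list of (class name, count) for surviving instances."""
    counts = {}
    for name in live.values():
        counts[name] = counts.get(name, 0) + 1
    return sorted(counts.items())

# With a patched interpreter one would presumably install the hook with
#     sys.traceinstances = traceinstances
# (an assumption based on the description above).
```

Calling `report()` at the end of a run would then list the classes whose instances were created but never destroyed, which are the usual suspects for circular references.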