[Python-Dev] cycle-GC question

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Tue, 19 Dec 2000 16:44:48 +0100


> It seems clear (?) that this is due to the call to "set_rexec" at
> rexec.py:140, which creates a circular reference between the `rexec'
> and `hooks' objects.  (There's even a nice comment to that effect).

It's not all that clear that *this* is the cycle. In fact, it is not.

> I'm curious however as to why the spiffy new cyclic-garbage
> collector doesn't pick this up?

It's an interesting problem, so I spent this afternoon investigating
it. I soon found that I needed a tool, so I introduced a new function,
gc.getreferents, which, when given an object, returns a list of the
objects referring to that object. The patch for that feature is at

http://sourceforge.net/patch/?func=detailpatch&patch_id=102925&group_id=5470
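
For readers without the patch, here is a minimal sketch of what such a
query looks like. It assumes a later Python whose standard gc module
provides gc.get_referrers, used here as a stand-in for the patched
gc.getreferents; the Owner and Hooks classes are hypothetical and exist
only for illustration.

```python
import gc


class Owner:
    """Hypothetical object that something else points back to."""


class Hooks:
    """Hypothetical helper that keeps a back-reference to its owner."""

    def __init__(self, owner):
        self.owner = owner


owner = Owner()
hooks = Hooks(owner)

# Ask the collector which tracked objects hold a reference to `owner`.
referrers = gc.get_referrers(owner)

# Depending on the interpreter version, the reference is reported either
# on the Hooks instance itself or on its attribute dictionary.
found = any(r is hooks or r is vars(hooks) for r in referrers)
print("hooks refers to owner:", found)
```

The result also contains incidental holders such as the module's
globals dictionary, which is why a recursive walk needs some filtering.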

Applying that function recursively, I can get output that looks
like this:

<rexec.RExec instance at 0x81f5dcc>
 <method RExec.r_import of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24
 <method RExec.r_reload of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_open of RExec instance at 0x81f5dcc>
  dictionary 0x81f4f24 (seen)
 <method RExec.r_exc_info of RExec instance at 0x81f5dcc>
  dictionary 0x8213bc4
 dictionary 0x820869c
  <rexec.RHooks instance at 0x8216cbc>
   dictionary 0x820866c
    <rexec.RExec instance at 0x81f5dcc> (seen)
   dictionary 0x8213bf4
    <ihooks.FancyModuleLoader instance at 0x81f7464>
     dictionary 0x820866c (seen)
     dictionary 0x8214144
      <ihooks.ModuleImporter instance at 0x8214124>
       dictionary 0x820866c (seen)

Each indentation level shows the objects which refer to the object one
level further out, e.g. the dictionary 0x820869c refers to the RExec
instance, and the RHooks instance refers to that dictionary. Clearly,
the dictionary 0x820869c is the RHooks' __dict__, and the reference
belongs to the 'rexec' key in that dictionary.

The recursion stops only when an object has been seen before (so it's a
cycle, or some other non-tree graph), or when no objects refer to it
(the lists created to do the iteration are ignored).
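
The recursive walk described above can be sketched roughly as follows.
This is not the script used for the output in this message; it assumes
a later Python with gc.get_referrers in place of the patched
gc.getreferents, and the names referrer_tree, Node and demo are made up
for the example. Stack frames and the temporary result lists are
skipped so that the walk itself does not show up as a referrer.

```python
import gc
import types


def referrer_tree(root, max_depth=8):
    """Return (depth, description) rows for the objects that refer to root."""
    rows = []          # holds only ints and strings, never a graph object
    seen = {id(root)}  # ids of objects already reported
    ignore = set()     # ids of the temporary lists from gc.get_referrers

    def walk(target, depth):
        if depth > max_depth:
            return
        referrers = gc.get_referrers(target)
        ignore.add(id(referrers))
        for ref in referrers:
            # Skip our own bookkeeping lists and frames holding locals.
            if id(ref) in ignore or isinstance(ref, types.FrameType):
                continue
            if isinstance(ref, dict):
                label = f"dictionary {id(ref):#x}"
            else:
                label = repr(ref)
            if id(ref) in seen:
                # Seen before: a cycle or other non-tree graph, so stop here.
                rows.append((depth, label + " (seen)"))
                continue
            seen.add(id(ref))
            rows.append((depth, label))
            walk(ref, depth + 1)

    walk(root, 1)
    return rows


class Node:
    """Hypothetical two-object cycle standing in for RExec and RHooks."""


def demo():
    a, b = Node(), Node()
    a.peer, b.peer = b, a
    return referrer_tree(a)


for depth, label in demo():
    print(" " * depth + label)
```

Whether the attribute dictionaries appear as separate levels depends on
the interpreter version, but the walk ends at an entry marked "(seen)"
once it gets back to the starting object.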

So it appears that the r_import method is referenced from some
dictionary, but that dictionary is not referenced anywhere???

Checking the actual structures shows that rexec creates a __builtin__
module, which has a dictionary that has an __import__ key. So the
reference to the method comes from the __builtin__ module, which in
turn is referenced through the RExec's .modules attribute, giving
another cycle.

However, module objects don't participate in garbage
collection. Therefore, gc.getreferents cannot traverse a module, and
the garbage collector won't find a cycle involving a garbage module.
I just submitted a bug report,

http://sourceforge.net/bugs/?func=detailbug&bug_id=126345&group_id=5470

which suggests that modules should also participate in garbage
collection.
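
To check how a given interpreter handles this kind of cycle, something
like the sketch below can be used. It is an illustration, not the rexec
code: Sandbox is a hypothetical stand-in for RExec, and gc.is_tracked,
types.ModuleType and weakref come from later Python versions. On a
CPython 3 interpreter, where module objects are tracked by the
collector, both results are expected to be true.

```python
import gc
import types
import weakref


class Sandbox:
    """Hypothetical stand-in for RExec: owns a module that points back at it."""

    def __init__(self):
        fake_builtin = types.ModuleType("fake_builtin")
        # The bound method refers to self, much like RExec.r_import.
        fake_builtin.__import__ = self.r_import
        self.modules = {"fake_builtin": fake_builtin}

    def r_import(self, name):
        raise ImportError(name)


def cycle_through_module_is_collected():
    """Return (module is tracked by the GC, cycle was freed by gc.collect)."""
    box = Sandbox()
    probe = weakref.ref(box)
    module = box.modules["fake_builtin"]
    module_tracked = gc.is_tracked(module)
    del box, module  # from here on only the cycle keeps the objects alive
    gc.collect()
    return module_tracked, probe() is None


tracked, collected = cycle_through_module_is_collected()
print("module tracked by GC:", tracked)
print("cycle collected:", collected)
```

The weak reference is used so that the check itself does not keep the
Sandbox instance alive.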

Regards,
Martin