[Python-Dev] Python GC/type()/weakref mystery

Kevin Jacobs jacobs at theopalgroup.com
Thu Mar 18 11:06:32 EST 2004


Hi all,

I received a (false) report of a potential memory leak in one of the modules
that I've released.  In the process of tracking down what was going on, I
stumbled upon some _very_ odd behavior in both Python 2.2.3 and 2.3.3.  What I
am seeing _may_ be a bug, but it is strange enough that I thought I'd share it
here.

This module has a heuristic to detect memory leaks relating to new-style
classes and metaclasses that existed in previous Python versions (which Tim
and I subsequently tracked down and fixed).  It works by counting how many new
objects are left behind after running N iterations of a test suite:

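  import gc

  def leak_test(N, test_func):
    # Collect first so that existing garbage is not counted as a leak
    gc.collect()
    orig_objects = len(gc.get_objects())
    for i in xrange(N):
      test_func()
    # Collect again, then count the tracked objects that survived
    gc.collect()
    new_objects = len(gc.get_objects()) - orig_objects
    if new_objects > 0:
      print 'Leak detected (N=%d, %d new objects)' % (N, new_objects)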

This is a crude heuristic, at best, but was sufficient to detect when my
module was running against older Python versions that did not include the
necessary fixes.  I left the check in, just in case anything new cropped up.
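
As a sanity check on the heuristic itself, here is a minimal sketch of the
same counting idea written for current Python 3 rather than the 2.x releases
discussed here.  The names count_new_objects, kept, leaky and clean are made
up for illustration, and the exact numbers printed will vary between Python
versions and environments.

```python
import gc

def count_new_objects(n, test_func):
    # Same idea as leak_test above, but returns the count instead of printing.
    gc.collect()
    orig_objects = len(gc.get_objects())
    for _ in range(n):
        test_func()
    gc.collect()
    return len(gc.get_objects()) - orig_objects

kept = []  # hypothetical: deliberately keeps references alive to simulate a leak

def leaky():
    # Each call leaves one more gc-tracked list reachable from `kept`.
    kept.append([])

def clean():
    # The temporary list is freed as soon as the call returns.
    [].append(1)

for func in (leaky, clean):
    print(func.__name__, count_new_objects(100, func))
```

A function that really holds on to objects should report a count that grows
with N, while a clean one should stay near zero.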

Here is what is confusing me.  Running this code:

  def new_type():
    type('foo', (object,), {})
    #gc.collect()

  leak_test(50, new_type)

produces: Leak detected (N=50, 50 new objects)

These new objects are all dead weak references, so I am left wondering why
they seem to have such odd interactions with garbage collection and manual
type construction.
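
One rough way to check what kinds of objects are left behind is to tally the
gc-tracked objects by type before and after the run.  The sketch below assumes
current Python 3 rather than 2.2/2.3; type_counts and leftover_by_type are
names made up for illustration, and what it reports will depend on the Python
version in use.

```python
import gc
from collections import Counter

def type_counts():
    # Count every gc-tracked object by the name of its type.
    return Counter(type(o).__name__ for o in gc.get_objects())

def leftover_by_type(n, test_func):
    # Hypothetical helper (not part of the original module): report which
    # types gained gc-tracked instances after n calls to test_func.
    gc.collect()
    before = type_counts()
    for _ in range(n):
        test_func()
    gc.collect()
    after = type_counts()
    # Counter subtraction keeps only the types whose count went up.
    return after - before

def new_type():
    type('foo', (object,), {})

print(leftover_by_type(50, new_type))
```

The snapshot held in `before` is itself a tracked object, so it will typically
show up as one extra Counter in the result; that entry is noise from the
measurement, not a leak.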

After uncommenting the gc.collect call in new_type, the output becomes:
Leak detected (N=50, 1 new objects), so there looks to be something funny
going on with the garbage collector.

In contrast,

  class Foo(object): pass
  leak_test(50, Foo)

produces no output.  I.e., the strange dead weak references only seem to
occur when manually constructing types via the type constructor.

Varying the value of N also gives strange results:

  for i in range(6):
    leak_test(10**i, new_type)

Produces:
  Leak detected (N=1, 1 new objects)
  Leak detected (N=10, 9 new objects)
  Leak detected (N=100, 90 new objects)
  Leak detected (N=1000, 78 new objects)
  Leak detected (N=10000, 9 new objects)
  Leak detected (N=100000, 8 new objects)

This is good news, in that it does not represent an unbounded memory leak.
However, the behavior is still very puzzling and may be indicative of a bug,
or at least an implementation detail that bears some discussion.

Thanks,
-Kevin



