[Python-Dev] refleak hunting season

Michael Hudson mwh at python.net
Mon Aug 2 16:39:36 CEST 2004


Last time round I found a bunch of reference counting bugs in Python
just *after* a major release; I'm trying to do it all before 2.4.0
this time.

I've hacked regrtest again (patch attached).  Here are my notes,
though, as is usual with such things, writing stuff down tends to
cause thoughts, which cause experiments, which cause the notes to be
out of date more or less immediately...

test_codeccallbacks leaked [2, 2, 2, 2] references
    registries

test_copy leaked [82, 82, 82, 82] references
    copy_reg.dispatch_table -- now taken care of in regrtest.py

test_descr leaked [2, 2, 2, 2] references
    resurrection, still

test_descrtut leaked [1, 1, 1, 1] references
    exec 'print x' in defaultdict()
    fixed, need to check in (bit subtle though)

test_distutils leaked [2, 2, 2, 2] references
    no idea

test_gc leaked [18, 18, 18, 18] references
    trashcan

test_hotshot leaked [111, 111, 111, 111] references
    there's been a patch on SF for this for about a year!

test_minidom leaked [1, 1, 1, 1] references
    testPickledDocument, don't know more

test_mutants leaked [74, -1123, 1092, -521] references
    randomized test

test_pkg leaked [3, 3, 3, 3] references
    something to do with execfile?  or tempfile.mkstemp?
    the leak is on the dummy key from dictobject.c!!
    very odd.  see bug #808596 for the explanation.

test_pkgimport leaked [6, 6, 6, 6] references
    no idea
    similar to the above?  (but certainly not the same)
    I think this might be caused by interning like the above.

test_sax leaked [1899, 1899, 1899, 1899] references
    no idea -- scary numbers, though!

test_threadedtempfile leaked [0, 1, -21, 9] references
    randomized test?

test_threading_local leaked [9, 0, 7, 0] references
    not sure what's going on here; not sure it's a leak, but it's
    definitely odd

test_unicode leaked [7, 7, 7, 7] references
    the codec registry

test_urllib2 leaked [120, -80, -70, 120] references
    CacheFTPHandler

test_xrange leaked [1, 1, 1, 1] references
    fixed in CVS
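
To sort through report lines like the ones above, a small helper can separate steady leaks (the same nonzero delta on every run) from the fluctuating numbers that randomized or threaded tests produce. This is only a sketch in modern Python, not part of the attached patch; the names `parse_leak_line` and `classify` and the three labels are made up for illustration.

```python
import re

# Matches lines such as "test_sax leaked [1899, 1899, 1899, 1899] references".
LEAK_LINE = re.compile(r"^(\w+) leaked \[([-\d, ]+)\] references$")


def parse_leak_line(line):
    """Return (test_name, deltas) for a report line, or None if it does not match."""
    match = LEAK_LINE.match(line.strip())
    if match is None:
        return None
    deltas = [int(part) for part in match.group(2).split(",")]
    return match.group(1), deltas


def classify(deltas):
    """Hypothetical heuristic, not from the patch: label a list of deltas."""
    if all(d == 0 for d in deltas):
        return "clean"
    if len(set(deltas)) == 1:
        return "steady"  # same nonzero delta each run: likely a real leak
    return "noisy"       # varying deltas: randomized test, threads, caches...
```

With this heuristic, test_sax and test_copy come out as "steady", while test_mutants and test_urllib2 come out as "noisy" and need a closer look before calling them leaks.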

I'd really appreciate someone taking a look at test_sax.  The numbers
for test_gc and test_descr are artifacts, and if someone sees how to
fix them I'd be grateful.  test_threading_local behaves very
strangely.  I haven't even begun to look at test_distutils.
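
For anyone who wants to poke at a single test by hand, the measurement in the attached patch boils down to roughly the following. This is a simplified sketch in modern Python rather than the patch itself: `hunt_refleaks` and `total_refcount` are names invented here, the counter is a parameter so the loop can be tried without a debug build, and the cache purging and the constant 2 that the patch subtracts from each delta are left out.

```python
import gc
import sys


def total_refcount():
    """Return the interpreter-wide reference count (debug builds only)."""
    getter = getattr(sys, "gettotalrefcount", None)
    if getter is None:
        raise RuntimeError("sys.gettotalrefcount() needs a debug build")
    return getter()


def hunt_refleaks(run_test, count=total_refcount, repeats=9, keep=4):
    """Run run_test repeatedly; return the last `keep` refcount deltas.

    The early runs are discarded because they tend to include one-off
    allocations (imports, caches warming up) rather than real leaks.
    """
    deltas = []
    for _ in range(repeats):
        before = count()
        run_test()
        gc.collect()  # collect cyclic garbage so it is not counted as a leak
        deltas.append(count() - before)
    return deltas[-keep:]
```

On a debug build, `hunt_refleaks(some_test_main)` returns a list like the ones in the notes; any consistently nonzero entry is worth investigating.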

Generally, we're in pretty good shape!  (Or our tests really suck,
your call ;-).

Cheers,
mwh

-------------- next part --------------
Index: regrtest.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/test/regrtest.py,v
retrieving revision 1.156
diff -c -r1.156 regrtest.py
*** regrtest.py	26 Jul 2004 12:09:13 -0000	1.156
--- regrtest.py	2 Aug 2004 14:31:41 -0000
***************
*** 8,26 ****
  
  Command line options:
  
! -v: verbose   -- run tests in verbose mode with output to stdout
! -q: quiet     -- don't print anything except if a test fails
! -g: generate  -- write the output file for a test instead of comparing it
! -x: exclude   -- arguments are tests to *exclude*
! -s: single    -- run only a single test (see below)
! -r: random    -- randomize test execution order
! -f: fromfile  -- read names of tests to run from a file (see below)
! -l: findleaks -- if GC is available detect tests that leak memory
! -u: use       -- specify which special resource intensive tests to run
! -h: help      -- print this text and exit
! -t: threshold -- call gc.set_threshold(N)
! -T: coverage  -- turn on code coverage using the trace module
! -L: runleaks  -- run the leaks(1) command just before exit
  
  If non-option arguments are present, they are names for tests to run,
  unless -x is given, in which case they are names for tests not to run.
--- 8,27 ----
  
  Command line options:
  
! -v: verbose    -- run tests in verbose mode with output to stdout
! -q: quiet      -- don't print anything except if a test fails
! -g: generate   -- write the output file for a test instead of comparing it
! -x: exclude    -- arguments are tests to *exclude*
! -s: single     -- run only a single test (see below)
! -r: random     -- randomize test execution order
! -f: fromfile   -- read names of tests to run from a file (see below)
! -l: findleaks  -- if GC is available detect tests that leak memory
! -u: use        -- specify which special resource intensive tests to run
! -h: help       -- print this text and exit
! -t: threshold  -- call gc.set_threshold(N)
! -T: coverage   -- turn on code coverage using the trace module
! -L: runleaks   -- run the leaks(1) command just before exit
! -R: huntrleaks -- search for reference leaks (needs debug build, v. slow)
  
  If non-option arguments are present, they are names for tests to run,
  unless -x is given, in which case they are names for tests not to run.
***************
*** 84,89 ****
--- 85,91 ----
  import getopt
  import random
  import warnings
+ import sre
  import cStringIO
  import traceback
  
***************
*** 127,133 ****
  
  def main(tests=None, testdir=None, verbose=0, quiet=False, generate=False,
           exclude=False, single=False, randomize=False, fromfile=None,
!          findleaks=False, use_resources=None, trace=False, runleaks=False):
      """Execute a test suite.
  
      This also parses command-line options and modifies its behavior
--- 129,136 ----
  
  def main(tests=None, testdir=None, verbose=0, quiet=False, generate=False,
           exclude=False, single=False, randomize=False, fromfile=None,
!          findleaks=False, use_resources=None, trace=False, runleaks=False,
!          huntrleaks=False):
      """Execute a test suite.
  
      This also parses command-line options and modifies its behavior
***************
*** 152,162 ****
  
      test_support.record_original_stdout(sys.stdout)
      try:
!         opts, args = getopt.getopt(sys.argv[1:], 'hvgqxsrf:lu:t:TL',
                                     ['help', 'verbose', 'quiet', 'generate',
                                      'exclude', 'single', 'random', 'fromfile',
                                      'findleaks', 'use=', 'threshold=', 'trace',
!                                     'runleaks'
                                      ])
      except getopt.error, msg:
          usage(2, msg)
--- 155,165 ----
  
      test_support.record_original_stdout(sys.stdout)
      try:
!         opts, args = getopt.getopt(sys.argv[1:], 'hvgqxsrf:lu:t:TLR',
                                     ['help', 'verbose', 'quiet', 'generate',
                                      'exclude', 'single', 'random', 'fromfile',
                                      'findleaks', 'use=', 'threshold=', 'trace',
!                                     'runleaks', 'huntrleaks'
                                      ])
      except getopt.error, msg:
          usage(2, msg)
***************
*** 191,196 ****
--- 194,201 ----
              gc.set_threshold(int(a))
          elif o in ('-T', '--coverage'):
              trace = True
+         elif o in ('-R', '--huntrleaks'):
+             huntrleaks = True
          elif o in ('-u', '--use'):
              u = [x.lower() for x in a.split(',')]
              for r in u:
***************
*** 288,294 ****
              tracer.runctx('runtest(test, generate, verbose, quiet, testdir)',
                            globals=globals(), locals=vars())
          else:
!             ok = runtest(test, generate, verbose, quiet, testdir)
              if ok > 0:
                  good.append(test)
              elif ok == 0:
--- 293,299 ----
              tracer.runctx('runtest(test, generate, verbose, quiet, testdir)',
                            globals=globals(), locals=vars())
          else:
!             ok = runtest(test, generate, verbose, quiet, testdir, huntrleaks)
              if ok > 0:
                  good.append(test)
              elif ok == 0:
***************
*** 397,403 ****
      tests.sort()
      return stdtests + tests
  
! def runtest(test, generate, verbose, quiet, testdir=None):
      """Run a single test.
      test -- the name of the test
      generate -- if true, generate output, instead of running the test
--- 402,408 ----
      tests.sort()
      return stdtests + tests
  
! def runtest(test, generate, verbose, quiet, testdir=None, huntrleaks=False):
      """Run a single test.
      test -- the name of the test
      generate -- if true, generate output, instead of running the test
***************
*** 415,420 ****
--- 420,427 ----
          cfp = None
      else:
          cfp = cStringIO.StringIO()
+     if huntrleaks:
+         refrep = open("reflog.txt", "a")
      try:
          save_stdout = sys.stdout
          try:
***************
*** 435,440 ****
--- 442,478 ----
              indirect_test = getattr(the_module, "test_main", None)
              if indirect_test is not None:
                  indirect_test()
+             if huntrleaks:
+                 import copy_reg
+                 fs = warnings.filters[:]
+                 ps = copy_reg.dispatch_table.copy()
+                 import gc
+                 def cleanup():
+                     import _strptime, urlparse, warnings
+                     warnings.filters[:] = fs
+                     gc.collect()
+                     sre.purge()
+                     _strptime._regex_cache.clear()
+                     urlparse.clear_cache()
+                     copy_reg.dispatch_table.clear()
+                     copy_reg.dispatch_table.update(ps)
+                 deltas = []
+                 if indirect_test:
+                     for i in range(9):
+                         rc = sys.gettotalrefcount()
+                         indirect_test()
+                         cleanup()
+                         deltas.append(sys.gettotalrefcount() - rc - 2)
+                 else:
+                     for i in range(9):
+                         rc = sys.gettotalrefcount()
+                         reload(the_module)
+                         cleanup()
+                         deltas.append(sys.gettotalrefcount() - rc - 2)
+                 if max(map(abs, deltas[-4:])) > 0:
+                     print >>refrep, test, 'leaked', \
+                           deltas[-4:], 'references'
+                     #refrep.flush()
          finally:
              sys.stdout = save_stdout
      except test_support.ResourceDenied, msg:
***************
*** 486,492 ****
              fp.close()
          else:
              expected = test + "\n"
!         if output == expected:
              return 1
          print "test", test, "produced unexpected output:"
          sys.stdout.flush()
--- 524,530 ----
              fp.close()
          else:
              expected = test + "\n"
!         if output == expected or huntrleaks:
              return 1
          print "test", test, "produced unexpected output:"
          sys.stdout.flush()


-- 
  I never disputed the Perl hacking skill of the Slashdot creators. 
  My objections are to the editors' taste, the site's ugly visual 
  design, and the Slashdot community's raging stupidity.
     -- http://www.cs.washington.edu/homes/klee/misc/slashdot.html#faq

