memory leak with dynamically defined functions?

zooko at zooko
Mon Jul 16 16:18:22 EDT 2001

[cross-posting to python-list[1] and mojonation-devel[2]]

Dear Pythonistas, One and All, Large and Small:

It was with some interest that I read on Dr. Dobbs weekly Python-URL [3] that
someone or other on Usenet was asserting that you couldn't write large
distributed applications in a dynamic language without succumbing to bugs.  
I hack on a large, distributed Python application named "Mojo Nation" [4].  

I briefly considered putting Mojo Nation forward as a counterexample, since it
definitely qualifies on the first two criteria (large, distributed), but
I hesitated to subject myself and my teammates to criticism on the last
criterion (free of bugs).

Of course, all large applications have bugs, and I'm very proud of Mojo Nation,
but recently we've been struggling with a particularly damaging bug: memory
leakage which causes our application to use up all available RAM after only a
few hours of operation.

In searching for the source of the leakage today, I encountered a surprising
behaviour of Python which I think may be a bug.  If I allocate a bunch of
memory with a function like this:

>>> blarg = {}
>>> for i in range(2**13):
...     blarg[i] = [0] * (2**10)

and then remove the references to this memory, like this:

>>> del blarg

then all the memory is freed up.

But if I allocate memory and store a reference to it in a default argument to
an inner function, like this:

>>> def silliest_func():
...     x = [0] * (2**10)
...     def inner_silliest_func(x=x):
...         pass
...     return inner_silliest_func
...
>>> blarg = {}
>>> for i in range(2**13):
...     blarg[i] = silliest_func()

and then remove the references to this memory, like this:

>>> del blarg

none of the memory is freed up!

Even explicitly invoking the garbage collector, with:

>>> import gc
>>> gc.collect()

just returns "0" and the memory is still in use.

I tested this with CPython 1.5.2 on Debian and CPython 2.0 on Debian and Red
Hat and they all behaved the same.  I also tested it with JPython 2.1-alpha1
with j2sdk 1.3.1-1 on Debian, and it *did* do the kind of garbage collection
that I expected.  (The JVM wouldn't collect until I reached some very
high limit on total RAM allocated or something.  But I could do the
"silliest_func and then delete the dict" indefinitely, with memory usage
as reported by Linux fluctuating between 30 MB and 60 MB.)

So it appears to me that inner functions and/or their default arguments are not
getting properly collected in CPython.  If this is the case, it poses quite a
problem for Mojo Nation, as we have relied heavily on the convenient closure-
like feature of passing around references to dynamically defined functions with
default arguments.
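(If the default-argument idiom itself turns out to be the culprit, one possible workaround is to hold the state in a small callable class instead, so the reference is explicit and can be broken by hand. This is only a sketch of an alternative, not something Mojo Nation does; the `InnerSilliest` name is made up for illustration:)

```python
class InnerSilliest:
    """Callable object holding the state previously smuggled in via x=x."""

    def __init__(self, x):
        self.x = x

    def __call__(self):
        pass

def silliest_func():
    x = [0] * (2**10)
    return InnerSilliest(x)

f = silliest_func()
f()           # behaves like the inner function
f.x = None    # the reference can be dropped explicitly when done
```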

Thanks, y'all, for the best language and the most friendly language community.




More information about the Python-list mailing list