memory leak with dynamically defined functions?

zooko at zooko.com
Mon Jul 16 16:18:22 EDT 2001


[cross-posting to python-list[1] and mojonation-devel[2]]


Dear Pythonistas, One and All, Large and Small:

It was with some interest that I read on Dr. Dobb's weekly Python-URL [3] that
someone or other on Usenet was asserting that you couldn't write large
distributed applications in a dynamic language without succumbing to bugs.  
I hack on a large, distributed Python application named "Mojo Nation" [4].  

I briefly considered putting Mojo Nation forward as a counterexample, since it
definitely qualifies on the first two criteria (large, distributed), but 
I hesitated to subject myself and my teammates to criticism on the last
(non-buggy).

Of course, all large applications have bugs, and I'm very proud of Mojo Nation,
but recently we've been struggling with a particularly damaging bug: memory
leakage which causes our application to use up all available RAM after only a
few hours of operation.

In searching for the source of the leakage today, I encountered a surprising
behaviour of Python which I think may be a bug.  If I allocate a bunch of
memory with code like this:

>>> blarg = {}
>>> for i in range(2**13):
...     blarg[i] = [0] * (2**10)

and then remove the references to this memory, like this:

>>> del blarg

then all the memory is freed up.
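
A minimal sketch of how this first case could be observed from inside the interpreter rather than from the operating system's view of the process. It assumes Python 3.4 or later, where the standard `tracemalloc` module exists (it is not available in the CPython versions discussed in this post). The function name `measure_alloc_and_free` and its parameters are hypothetical, and the numbers count bytes handed out by Python's allocator, which is not necessarily what Linux reports for the process.

```python
import tracemalloc


def measure_alloc_and_free(count=2**8, size=2**10):
    """Return traced bytes (before, after allocating, after del).

    Hypothetical helper: builds a dict of lists like the example
    above, then deletes it, sampling tracemalloc at each step.
    """
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()

    blarg = {}
    for i in range(count):
        blarg[i] = [0] * size
    allocated, _ = tracemalloc.get_traced_memory()

    # Drop the only reference to the dict and, through it, the lists.
    del blarg
    freed, _ = tracemalloc.get_traced_memory()

    tracemalloc.stop()
    return before, allocated, freed


if __name__ == "__main__":
    before, allocated, freed = measure_alloc_and_free()
    print("before: %d, allocated: %d, after del: %d"
          % (before, allocated, freed))
```

If the "after del" figure drops back close to the "before" figure, the objects themselves were released, whatever the process size reported by the operating system says.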

But if I allocate memory and store a reference to it in a default argument to
an inner function, like this:

>>> def silliest_func():
...     x = [0] * (2**10)
...     def inner_silliest_func(x=x):
...         pass
...     return inner_silliest_func
...
>>> blarg = {}
>>> for i in range(2**13):
...     blarg[i] = silliest_func()

and then remove the references to this memory, like this:

>>> del blarg

none of the memory is freed up!

Even stimulating the garbage collector, with:

>>> import gc
>>> gc.collect()
0

just returns "0" and the memory is still in use.
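
One way to separate two possibilities (the objects are still alive, or they were freed but the process did not shrink) is to watch the objects through weak references. The following is only a sketch under assumptions: it needs the `weakref` module, which is not present in CPython 1.5.2 or 2.0, and it uses a hypothetical `Payload` subclass because plain lists cannot be weakly referenced. `count_survivors` is likewise a made-up name.

```python
import gc
import weakref


class Payload(list):
    """Hypothetical list subclass; plain lists cannot be weakly referenced."""


def silliest_func():
    x = Payload([0] * (2**10))

    def inner_silliest_func(x=x):
        pass

    # Return the function plus a weak reference to its default argument.
    return inner_silliest_func, weakref.ref(x)


def count_survivors(count=2**8):
    """Return (alive before del, alive after del and gc.collect())."""
    blarg = {}
    refs = []
    for i in range(count):
        func, ref = silliest_func()
        blarg[i] = func
        refs.append(ref)
    alive_before = sum(1 for r in refs if r() is not None)

    # Remove every strong reference to the inner functions.
    del blarg, func
    gc.collect()
    alive_after = sum(1 for r in refs if r() is not None)
    return alive_before, alive_after


if __name__ == "__main__":
    print(count_survivors())
```

A second number greater than zero would mean the default arguments really are being kept alive; zero would point at the allocator or the operating system rather than at the functions.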

I tested this with CPython 1.5.2 on Debian and CPython 2.0 on Debian and Red
Hat and they all behaved the same.  I also tested it with JPython 2.1-alpha1
with j2sdk 1.3.1-1 on Debian, and it *did* do the kind of garbage collection
that I expected.  (The JVM wouldn't collect until I reached some very
high limit on total RAM allocated or something.  But I could do the
"silliest_func and then delete the dict" indefinitely, with memory usage
as reported by Linux fluctuating between 30 MB and 60 MB.)

So it appears to me that inner functions and/or their default arguments are not
getting properly collected in CPython.  If this is the case, it poses quite a
problem for Mojo Nation, as we have relied heavily on the convenient closure-
like feature of passing around references to dynamically defined functions with
default arguments.
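
For readers unfamiliar with the idiom, here is a small sketch of what "closure-like" means here. The names `make_callback` and `payload` are hypothetical and not taken from Mojo Nation; the `__defaults__` attribute assumes a modern Python (older versions spell it `func_defaults`).

```python
def make_callback(payload):
    """Return a function that remembers `payload` via a default argument."""

    def callback(payload=payload):
        # The default was bound once, when `def callback` was executed.
        return len(payload)

    return callback


data = [0] * 4
cb = make_callback(data)

# The function object itself holds the reference, in its defaults tuple.
print(cb())
print(cb.__defaults__[0] is data)
```

The point of the sketch is that the reference to `payload` lives on the function object, so the payload can only be freed once the function object itself is.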

Thanks, y'all, for the best language and the most friendly language community
around.

Regards,

Zooko

[1] http://mail.python.org/mailman/listinfo/python-list
[2] http://lists.sourceforge.net/lists/listinfo/mojonation-devel
[3] http://lwn.net/daily/pyurl-20010716.php3
[4] http://mojonation.net/
[5] http://zooko.com/memleak.py




