[issue2246] itertools.groupby() leaks memory with circular reference
Jeroen Ruigrok van der Werven
report at bugs.python.org
Thu Mar 6 20:39:29 CET 2008
New submission from Jeroen Ruigrok van der Werven:
Quoting from my email to Raymond:
In the Trac/Genshi community we've been tracking a somewhat obscure
memory leak that causes us a lot of problems.
Please see http://trac.edgewall.org/ticket/6614 and then
http://genshi.edgewall.org/ticket/190 for background.
We reduced the case to the following Python-only code and believe it is
a bug within itertools' groupby. As Armin Ronacher explains in Genshi
ticket 190:
"Looks like genshi is not to blame. itertools.groupby has a grouper
with a reference to the groupby type but no traverse func. As soon as a
circular reference ends up in the groupby (which happens thanks to the
func_globals in our lambda) genshi leaks."
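The cycle Armin describes can be sketched in pure Python. This is only an illustration, not part of the original report: the helper name make_cycle is hypothetical, and it assumes (as in CPython) that function objects accept attribute assignment and weak references. The weak reference lets us check whether the key function is reclaimed once the only remaining references to it are the ones inside the cycle.

```python
import gc
import weakref
from itertools import groupby

def make_cycle():
    # Hypothetical helper: builds the cycle
    # keyfunc -> grouper (via keyfunc.x) -> groupby -> keyfunc
    keyfunc = lambda x: x
    for _, group in groupby(range(3), key=keyfunc):
        keyfunc.x = group
    # Return only a weak reference so nothing outside the cycle
    # keeps keyfunc alive after this function returns.
    return weakref.ref(keyfunc)

ref = make_cycle()
gc.collect()
# False means the cycle was collected; True means it is still alive.
alive_after_collect = ref() is not None
print(alive_after_collect)
```

If the grouper takes part in garbage collection, the collector can see the whole cycle and free it. On an affected interpreter the key function would be expected to stay alive.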
This can be demonstrated with the following code (also attached to this
issue as testcase.py):
    import gc
    from itertools import groupby

    def run():
        keyfunc = lambda x: x
        for i, j in groupby(range(100), key=keyfunc):
            keyfunc.x = j

    for x in xrange(20):
        gc.collect()
        run()
        print len(gc.get_objects())
Executing this prints the number of objects tracked by the garbage
collector after each iteration. Every count is 4 higher than the
previous one, as Armin specifies:
"a frame, a grouper, a keyfunc and a groupby object"
We have been unable to come up with a decent patch and thus I am
logging this issue now.
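To make the per-iteration growth easier to read, the test case can be wrapped so that it reports differences between consecutive counts rather than raw totals. This is a rough sketch under stated assumptions: object_count_deltas is a hypothetical helper name, not part of the attached testcase.py, and the code is written to run on both Python 2 and Python 3.

```python
import gc
from itertools import groupby

def run():
    keyfunc = lambda x: x
    for i, j in groupby(range(100), key=keyfunc):
        keyfunc.x = j

def object_count_deltas(func, rounds=20):
    """Hypothetical helper: call func repeatedly and return the change
    in the number of gc-tracked objects from one round to the next."""
    counts = []
    for _ in range(rounds):
        gc.collect()
        func()
        counts.append(len(gc.get_objects()))
    # Difference between each count and the one before it.
    return [after - before for before, after in zip(counts, counts[1:])]

deltas = object_count_deltas(run)
print(deltas)
```

On an affected interpreter each entry would be expected to match the growth of 4 described above. A list that settles at zero would suggest the objects are being reclaimed.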
----------
files: testcase.py
messages: 63332
nosy: asmodai, rhettinger
severity: normal
status: open
title: itertools.groupby() leaks memory with circular reference
type: resource usage
versions: Python 2.4, Python 2.5, Python 2.6, Python 3.0
Added file: http://bugs.python.org/file9624/testcase.py
__________________________________
Tracker <report at bugs.python.org>
<http://bugs.python.org/issue2246>
__________________________________