[ python-Bugs-1294453 ] email.Parser.FeedParser leak
SourceForge.net
noreply at sourceforge.net
Mon Sep 19 04:57:54 CEST 2005
Bugs item #1294453, was opened at 2005-09-18 04:46
Message generated for change (Comment added) made by montanaro
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1294453&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: George Giannakopoulos (pckid)
Assigned to: Barry A. Warsaw (bwarsaw)
Summary: email.Parser.FeedParser leak
Initial Comment:
It seems there is a reference cycle within the
FeedParser class.
I discovered it while implementing a mail
categorization app. It seems that the problem lies in
the line:
    self._parse = self._parsegen().next
of the FeedParser __init__ method.
The object cannot be deleted and I was forced to add
the line:
    self._parse = None
in the close() method of the class just before the
return statement.
It seems it actually corrects the situation, BUT the
_parse method is no longer valid, and the object should
no longer be used.
If it makes any difference, the FeedParser was invoked
indirectly, through the Parser class:
    pParser = email.Parser.Parser()
    mMessage = pParser.parsestr(sMessageString)
    del pParser
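
A minimal sketch of the kind of cycle described above, written for Python 3 (where the generator method is spelled __next__ rather than .next). CyclicFeeder and freed_without_gc are hypothetical names used only for illustration; this is not the actual email package code, and it assumes CPython's reference counting.

```python
import gc
import weakref


class CyclicFeeder:
    """Hypothetical stand-in for FeedParser; not the real email code."""

    def __init__(self):
        self.lines = []
        # The generator's frame holds `self`, and `self` holds the
        # generator's bound __next__: together they form a reference cycle.
        self._parse = self._parsegen().__next__

    def _parsegen(self):
        while True:
            yield len(self.lines)

    def close(self):
        # The workaround from the report: drop the bound method,
        # which breaks the cycle (and makes the object unusable).
        self._parse = None


def freed_without_gc(call_close):
    """Return True if the object is freed by reference counting alone."""
    gc.disable()  # keep the cycle collector out of the experiment
    try:
        obj = CyclicFeeder()
        ref = weakref.ref(obj)
        if call_close:
            obj.close()
        del obj
        return ref() is None
    finally:
        gc.enable()
        gc.collect()  # clean up any cycle left behind
```

Calling freed_without_gc(False) shows the object surviving "del" until the cycle collector runs, while freed_without_gc(True) shows it being released immediately once close() has cleared the attribute.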
----------------------------------------------------------------------
>Comment By: Skip Montanaro (montanaro)
Date: 2005-09-18 21:57
Message:
Logged In: YES
user_id=44345
Here's what I see on my Mac laptop (10.3.9) with Python 2.4.1:
montanaro:skip% ps auxww | egrep python2.4
skip 10914 97.1 0.6 37980 6692 p8 R+ 9:54PM 0:15.50 python2.4
skip 10926 0.0 0.0 18644 268 std U+ 9:55PM 0:00.00 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 94.7 0.6 37980 6724 p8 R+ 9:54PM 0:20.75 python2.4
skip 10928 0.0 0.0 18644 92 std R+ 9:55PM 0:00.00 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 91.7 0.6 37980 6748 p8 R+ 9:54PM 0:24.36 python2.4
skip 10930 0.0 0.0 18644 92 std R+ 9:55PM 0:00.00 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 90.4 0.6 37980 6780 p8 R+ 9:54PM 0:29.36 python2.4
skip 10932 0.0 0.0 18644 92 std R+ 9:55PM 0:00.00 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 75.6 0.6 37980 6808 p8 R+ 9:54PM 0:33.21 python2.4
skip 10934 0.0 0.0 18644 92 std R+ 9:55PM 0:00.00 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 91.9 0.7 37980 6848 p8 R+ 9:54PM 0:36.86 python2.4
skip 10939 0.0 0.0 18644 92 std R+ 9:55PM 0:00.00 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 90.0 0.7 37980 6928 p8 R+ 9:54PM 1:34.41 python2.4
skip 10998 0.0 0.0 18644 92 std R+ 9:57PM 0:00.01 egrep python2.4
montanaro:skip% ps auxww | egrep python2.4
skip 10914 95.3 0.7 37980 6952 p8 R+ 9:54PM 1:46.65 python2.4
skip 11000 0.0 0.0 18644 92 std R+ 9:57PM 0:00.00 egrep python2.4
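
The ps samples above watch the process from outside. On a modern Python 3 interpreter, a rough in-process equivalent could use the standard tracemalloc module, as sketched below. The helper name traced_growth and its parameters are assumptions made for illustration; tracemalloc only sees allocations made through Python's allocators, so this is not a substitute for the RSS figures.

```python
import email.parser
import tracemalloc


def traced_growth(message_text, iterations=200):
    """Return the change in traced bytes across repeated parses."""
    # Warm-up pass so one-time caches do not count as growth.
    email.parser.Parser().parsestr(message_text)

    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        for _ in range(iterations):
            parser = email.parser.Parser()
            msg = parser.parsestr(message_text)
            del parser, msg
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before
```

A value that keeps climbing as iterations increases would point toward retained objects; a value that stays roughly flat would suggest the loop is not accumulating Python-level allocations.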
----------------------------------------------------------------------
Comment By: Barry A. Warsaw (bwarsaw)
Date: 2005-09-18 21:32
Message:
Logged In: YES
user_id=12800
Done. I never see memory usage get above 0.8% (py2.4) or
0.7% (py2.5) after running for several minutes. It
certainly doesn't appear to be leaking memory of any
detectable amount.
If it matters, I tested on Linux (Gentoo) 2.6.12 kernel.
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2005-09-18 16:20
Message:
Logged In: YES
user_id=44345
Try running top as the loop executes. Let it run for a couple minutes...
----------------------------------------------------------------------
Comment By: Barry A. Warsaw (bwarsaw)
Date: 2005-09-18 16:08
Message:
Logged In: YES
user_id=12800
Hmm, in Python 2.4 CVS, this always returns 0:
import gc
import email.Parser
s = open('/tmp/msg.txt').read()
try:
    while True:
        parser = email.Parser.Parser()
        msg = parser.parsestr(s)
        del parser
except KeyboardInterrupt:
    print len(gc.garbage)
Same thing in Python 2.5 CVS. So where's the leak?
Note that it's undefined what the FeedParser does after you
call its close. It doesn't seem like a problem to set
self._parse = None in the close, if that fixes a problem,
but it's a little odd that the above program doesn't
reproduce the reported bug.
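
One caveat about the check above: gc.garbage only lists objects the collector found unreachable but could not free, so an empty list does not by itself show that no cycles were created. The return value of gc.collect() reports how many unreachable objects were found. A minimal Python 3 sketch of the difference follows; Node and count_cyclic_garbage are hypothetical names, not part of the email package.

```python
import gc


class Node:
    """Hypothetical object that refers to itself, forming a cycle."""

    def __init__(self):
        self.me = self


def count_cyclic_garbage(n):
    """Create n self-referencing objects and report what the collector saw."""
    gc.collect()  # start from a clean slate
    gc.disable()  # prevent automatic collections during the loop
    try:
        for _ in range(n):
            Node()
        found = gc.collect()  # number of unreachable objects found
    finally:
        gc.enable()
    # The cycles were found and freed, so they never appear in gc.garbage.
    return found, len(gc.garbage)
```

Here the first element of the result is at least n even though gc.garbage stays empty, which is why the length of gc.garbage alone cannot rule out a cycle.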
----------------------------------------------------------------------
Comment By: Skip Montanaro (montanaro)
Date: 2005-09-18 14:13
Message:
Logged In: YES
user_id=44345
Using Python built from CVS and from the 2.4
maintenance branch I executed:
import email.Parser
s = open("... some file containing a message ...").read()
while True:
    parser = email.Parser.Parser()
    msg = parser.parsestr(s)
and let it accumulate a couple minutes of CPU time. It leaks
in the 2.4 version, but not the head (2.5) version. The two
versions of the email package appear identical (diff -ru). I
made a slightly different change. Instead of using
self._parse at all, I just replaced it with self._parsegen().next.
Memory consumption continues to grow for me.
Oddly enough, if I break out of the above loop, do a gc.collect()
and then check gc.garbage, the CVS HEAD version shows a
list of one element containing a generator. In the 2.4 release
branch version gc.garbage is empty.
Assigning to Barry as the email wiz...
----------------------------------------------------------------------