[XML-SIG] [ pyxml-Bugs-1231997 ] Memory leak in sgmlop.SGMLParser.register?
SourceForge.net
noreply at sourceforge.net
Mon Jul 4 05:31:49 CEST 2005
Bugs item #1231997, was opened at 2005-07-03 22:31
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=106473&aid=1231997&group_id=6473
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bryan Rink (holopoj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Memory leak in sgmlop.SGMLParser.register?
Initial Comment:
The following code runs fine:
from xml.dom.ext.reader import Sgmlop
from xml.parsers import sgmlop
while True:
    a = Sgmlop.HtmlParser()
    b = sgmlop.SGMLParser()
    #a.parser = b
    b.register(a)
But if the commented line is uncommented, this leaks
memory (very quickly). The garbage collector must be
having trouble with the fact that the two objects
reference each other.
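
The mutual reference described above can be reproduced in isolation. What follows is a minimal sketch, not the PyXML code: Handler and Parser are hypothetical plain-Python stand-ins for Sgmlop.HtmlParser and sgmlop.SGMLParser, and a weak reference is used only to observe whether the pair is still alive.

```python
import gc
import weakref


class Handler:
    """Hypothetical stand-in for Sgmlop.HtmlParser."""
    parser = None


class Parser:
    """Hypothetical stand-in for sgmlop.SGMLParser."""
    target = None

    def register(self, target):
        # Keep a reference to the handler, as a real parser must.
        self.target = target


def make_cycle():
    a = Handler()
    b = Parser()
    a.parser = b      # the line that is commented out in the report
    b.register(a)     # b now references a, closing the cycle
    return weakref.ref(a)


gc.disable()                      # keep automatic collection out of the demo
ref = make_cycle()
alive_before = ref() is not None  # reference counting alone cannot free the pair
gc.collect()                      # run the cyclic collector explicitly
alive_after = ref() is not None
gc.enable()

print(alive_before, alive_after)  # on CPython: True False
```

For ordinary Python classes the cyclic collector does reclaim such a pair. If the real objects are never reclaimed, one possible explanation is that the sgmlop parser object does not take part in cyclic collection, but that is an inference and not something this report confirms.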
This isn't a contrived example; the code above was
adapted from lines 48-51 of
xml/dom/ext/reader/Sgmlop.py:
def initParser(self, parser):
    self._parser = parser
    self._parser.register(self)
    return
And HtmlParser.initParser calls that function like this:
SgmlopParser.initParser(self, sgmlop.SGMLParser())
initParser is called from
xml.dom.ext.reader.HtmlLib.Reader.fromStream, which
is how I came across this error. I was parsing many
HTML documents and creating a new Reader for each
one. There is no problem if I use only one reader, so
that's the solution I will take, but it still seems that the
first snippet of code above should not leak memory.
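
One way to see why a single reader would keep memory bounded is sketched below. This is an illustration under assumptions, not the PyXML implementation: FakeReader and FakeParser are hypothetical stand-ins, fromStream here only mimics the "new parser, then register(self)" step, and gc.disable() is used to simulate cycles that are never collected.

```python
import gc
import weakref


class FakeParser:
    """Hypothetical stand-in for sgmlop.SGMLParser."""

    def register(self, target):
        self.target = target


class FakeReader:
    """Hypothetical stand-in for HtmlLib.Reader."""

    def fromStream(self, text):
        # Mirrors initParser: a fresh parser per call, registered back to self.
        self._parser = FakeParser()
        self._parser.register(self)
        return len(text)  # placeholder for real parsing


def count_live(refs):
    """Count weak references whose target still exists."""
    return sum(1 for r in refs if r() is not None)


docs = ["<p>one</p>", "<p>two</p>", "<p>three</p>"]

gc.disable()  # simulate reference cycles that are never collected

# Pattern from the report: a new reader for every document.
reader_refs = []
for doc in docs:
    reader = FakeReader()
    reader.fromStream(doc)
    reader_refs.append(weakref.ref(reader))
del reader
leaked_per_document = count_live(reader_refs)

# Workaround from the report: one reader reused for every document.
shared = FakeReader()
parser_refs = []
for doc in docs:
    shared.fromStream(doc)
    parser_refs.append(weakref.ref(shared._parser))
live_parsers_shared = count_live(parser_refs)

gc.collect()
gc.enable()

print(leaked_per_document, live_parsers_shared)  # on CPython: 3 1
```

In this model every discarded reader stays alive through its own cycle, while a reused reader drops its previous parser each time a new one is assigned, so at most one cycle exists at any moment.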
----------------------------------------------------------------------