[XML-SIG] [ pyxml-Bugs-1231997 ] Memory leak in sgmlop.SGMLParser.register?

SourceForge.net noreply at sourceforge.net
Mon Jul 4 05:31:49 CEST 2005

Bugs item #1231997, was opened at 2005-07-03 22:31
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: DOM
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Bryan Rink (holopoj)
Assigned to: Nobody/Anonymous (nobody)
Summary: Memory leak in sgmlop.SGMLParser.register?

Initial Comment:
The following code runs fine:

from xml.dom.ext.reader import Sgmlop
from xml.parsers import sgmlop
while True:
    a = Sgmlop.HtmlParser()
    b = sgmlop.SGMLParser()
    #a.parser = b

But if the commented line is uncommented, this leaks
memory (very quickly).  The garbage collector must be
having trouble with the fact that the two objects reference
each other.
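
To illustrate the kind of reference cycle suspected here, the following is a minimal sketch using two pure-Python classes. `Wrapper` and `Inner` are hypothetical stand-ins, not the real `Sgmlop.HtmlParser` and `sgmlop.SGMLParser`, and the `target` attribute is an assumption made only to close the cycle. It shows that reference counting alone does not free such a pair, while the cycle collector does reclaim it when both objects are ordinary Python instances.

```python
import gc
import weakref


class Wrapper(object):
    """Hypothetical stand-in for Sgmlop.HtmlParser."""
    parser = None


class Inner(object):
    """Hypothetical stand-in for sgmlop.SGMLParser."""
    target = None


def make_cycle():
    a = Wrapper()
    b = Inner()
    a.parser = b      # a -> b, as in the commented-out line
    b.target = a      # b -> a (assumed back-reference), closing the cycle
    return weakref.ref(a)


gc.disable()                        # keep automatic collection out of the way
ref = make_cycle()
alive_before = ref() is not None    # the cycle keeps both objects alive
gc.collect()                        # run the cycle collector explicitly
alive_after = ref() is not None     # pure-Python cycles are reclaimed
gc.enable()
print(alive_before, alive_after)
```

One possible explanation for the leak, not verified against the sgmlop source, is that the cycle collector can only follow references held by types that take part in garbage collection. If the C-implemented parser type does not, a cycle passing through it would never be found, unlike the pure-Python pair above.
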
This isn't a contrived example; the code above was
adapted from lines 48-51 of
    def initParser(self, parser):
        self._parser = parser
HtmlParser.initParser calls that function like this:
    SgmlopParser.initParser(self, sgmlop.SGMLParser())
initParser is called from
xml.dom.ext.reader.HtmlLib.Reader.fromStream, which
is how I came across this error.  I was parsing many
HTML documents and creating a new Reader for each
one.  There is no problem if I use only one Reader, so
that's the solution I will take, but it still seems that the
first snippet of code above should not leak memory.
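
The workaround described above, creating one reader and reusing it for every document, might look roughly like this. `StubReader` is a hypothetical stand-in that does no HTML parsing; only the `fromStream` method name comes from the report, and its single-argument form here is an assumption.

```python
import io


class StubReader(object):
    """Hypothetical stand-in for HtmlLib.Reader; it does not parse HTML."""
    instances = 0

    def __init__(self):
        StubReader.instances += 1

    def fromStream(self, stream):
        # A real reader would build a DOM; this stub returns the length read.
        return len(stream.read())


def parse_all(documents):
    reader = StubReader()          # created once, outside the loop
    return [reader.fromStream(io.StringIO(doc)) for doc in documents]


sizes = parse_all([u"<p>one</p>", u"<p>three</p>"])
```

Because only one reader is constructed, only one underlying parser pair would be created, so any per-instance leak is paid once rather than once per document.
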

