[lxml-dev] Huge memory leak in latest 2.0
Hi. I'm using the latest 2.0 version from trunk, rev. 49494 (because it supports the 'encoding' keyword in HTMLParser). I'm parsing many HTML documents in a loop, 100-200kB each. I have noticed that the memory used by my program increases by about 1MB after each document processed, so after a few hundred passes the system is about to hang. Running the same code with lxml 1.3.6 doesn't cause such memory growth.

I'm using the following library calls:

- `tree = etree.parse(<opened file>, HTMLParser(encoding=...))`
- `etree.tostring(tree)`
- `el.xpath(...)`
- getting children and attributes of elements

I'm using libxml2 version 2.6.28. If anyone knows of a solution or workaround, please write.

Regards,
Artur
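(Not part of the original mail.) One way to quantify this kind of per-document growth is a small stdlib-only harness; the lxml calls listed above would go inside `work()`, which here is just a stand-in workload. This is a sketch under two assumptions: a Unix platform (the `resource` module), and Linux's convention that `ru_maxrss` is reported in kilobytes.

```python
# Sketch of a leak check (assumptions: Unix; ru_maxrss is in KB on Linux).
# Replace the stand-in `work` with the real per-document processing,
# e.g. the etree.parse(...)/tostring(...)/xpath(...) calls from the report.
import resource

def peak_rss_kb():
    # Peak resident set size of this process so far (only ever grows).
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

def measure(work, iterations=10):
    """Run `work` repeatedly and record peak RSS after each pass."""
    samples = []
    for _ in range(iterations):
        work()
        samples.append(peak_rss_kb())
    return samples

if __name__ == "__main__":
    samples = measure(lambda: [bytearray(1024) for _ in range(100)])
    # A steady climb of roughly the same amount per pass, as described
    # in the report (~1MB per document), points at a leak.
    print(samples)
```

Since `ru_maxrss` is a high-water mark, a leak-free workload shows a flat curve after the first pass, while a leak keeps pushing the peak up on every iteration.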
Hi,

Artur Siekielski wrote:
> I'm using latest 2.0 version from trunk, rev. 49494 (because it supports 'encoding' keyword in HTMLParser). I'm parsing many HTML documents in a loop, 100-200kB each. I have noticed that memory used by my program increases about 1MB after each document processed, so after a few hundred passes the system is about to hang. Running the same code with lxml 1.3.6 doesn't cause such memory usage increase.
>
> I'm using the following library calls: `tree = etree.parse(<opened file>, HTMLParser(encoding=...))`, `etree.tostring(tree)`, `el.xpath(...)`, getting children and attributes of elements
Thanks for the report, I can reproduce this with a simple call to the parser. I'll look into it.

Stefan
Artur Siekielski wrote:
> I'm using latest 2.0 version from trunk, rev. 49494 (because it supports 'encoding' keyword in HTMLParser). I'm parsing many HTML documents in a loop, 100-200kB each. I have noticed that memory used by my program increases about 1MB after each document processed, so after a few hundred passes the system is about to hang. Running the same code with lxml 1.3.6 doesn't cause such memory usage increase.
>
> I'm using the following library calls: `tree = etree.parse(<opened file>, HTMLParser(encoding=...))`, `etree.tostring(tree)`, `el.xpath(...)`, getting children and attributes of elements
>
> I'm using libxml2 version 2.6.28.
>
> If anyone knows about solution/workaround, please write.
Hmmm, weird. The problem doesn't result from any change in lxml, just from the switch to Cython 0.9.6.8+. And I don't even see any obvious problem in the generated code. Anyway, here's a patch that seems to make the leak go away on my side. Could you give it a try?

Stefan
Stefan Behnel wrote:
> Artur Siekielski wrote:
>> I'm using latest 2.0 version from trunk, rev. 49494 (because it supports 'encoding' keyword in HTMLParser). I'm parsing many HTML documents in a loop, 100-200kB each. I have noticed that memory used by my program increases about 1MB after each document processed, so after a few hundred passes the system is about to hang. Running the same code with lxml 1.3.6 doesn't cause such memory usage increase.
>>
>> I'm using the following library calls: `tree = etree.parse(<opened file>, HTMLParser(encoding=...))`, `etree.tostring(tree)`, `el.xpath(...)`, getting children and attributes of elements
>>
>> I'm using libxml2 version 2.6.28.
>>
>> If anyone knows about solution/workaround, please write.
>
> Hmmm, weird. The problem doesn't result from any change in lxml, just from the switch to Cython 0.9.6.8+. And I don't even see any obvious problem in the generated code.
I fixed the problem in Cython (and Pyrex). It should work with the next release. I attached the patch that I used, in case you want to build lxml yourself using Cython.

Stefan
participants (2)

- Artur Siekielski
- Stefan Behnel