New subject: Memory leak when parsing XML files in sequence?

Nov. 23, 2016

      In-Reply-To=<4EA045E4.9020508@anaproy.nl>

Has there been any progress on this memory leak?

I built the newest version of lxml (as well as its dependencies) and the
problem is still there. I was able to track it down using umdh on Windows:

etree!xmlDictLookup+0000025E (c:\tmp\libxml2-win-binaries\libxml2\dict.c, 933)
etree!xmlHashAddEntry3+00000053 (c:\tmp\libxml2-win-binaries\libxml2\hash.c, 532)
etree!xmlHashAddEntry+00000014 (c:\tmp\libxml2-win-binaries\libxml2\hash.c, 377)
etree!xmlAddID+0000011D (c:\tmp\libxml2-win-binaries\libxml2\valid.c, 2632)
etree!xmlSAX2AttributeInternal+0000078A (c:\tmp\libxml2-win-binaries\libxml2\sax2.c, 1411)
etree!xmlSAX2StartElement+000002AE (c:\tmp\libxml2-win-binaries\libxml2\sax2.c, 1743)
etree!htmlParseStartTag+00000579 (c:\tmp\libxml2-win-binaries\libxml2\htmlparser.c, 3926)
etree!htmlParseElementInternal+00000069 (c:\tmp\libxml2-win-binaries\libxml2\htmlparser.c, 4467)
etree!htmlParseContentInternal+000003D3 (c:\tmp\libxml2-win-binaries\libxml2\htmlparser.c, 4652)
etree!htmlParseDocument+000002A2 (c:\tmp\libxml2-win-binaries\libxml2\htmlparser.c, 4818)
etree!htmlDoRead+00000094 (c:\tmp\libxml2-win-binaries\libxml2\htmlparser.c, 6786)
etree!htmlCtxtReadMemory+00000093 (c:\tmp\libxml2-win-binaries\libxml2\htmlparser.c, 7072)
etree!__pyx_f_4lxml_5etree_11_BaseParser__parseUnicodeDoc+0000028A (c:\tmp\lxml-3.6.4\src\lxml\lxml.etree.c, 109222)
etree!__pyx_f_4lxml_5etree__parseDoc+0000041F (c:\tmp\lxml-3.6.4\src\lxml\lxml.etree.c, 115220)
etree!__pyx_f_4lxml_5etree__parseMemoryDocument+000000E6 (c:\tmp\lxml-3.6.4\src\lxml\lxml.etree.c, 116674)
etree!__pyx_pf_4lxml_5etree_22fromstring+00000086 (c:\tmp\lxml-3.6.4\src\lxml\lxml.etree.c, 77737)
etree!__pyx_pw_4lxml_5etree_23fromstring+00000294 (c:\tmp\lxml-3.6.4\src\lxml\lxml.etree.c, 77687)
python33!PyCFunction_Call+000000F3 (c:\users\martin\33.amd64\python\objects\methodobject.c, 84)
python33!PyObject_Call+00000061 (c:\users\martin\33.amd64\python\objects\abstract.c, 2036)
python33!ext_do_call+00000295 (c:\users\martin\33.amd64\python\python\ceval.c, 4381)
python33!PyEval_EvalFrameEx+00002041 (c:\users\martin\33.amd64\python\python\ceval.c, 2723)
python33!PyEval_EvalCodeEx+0000065C (c:\users\martin\33.amd64\python\python\ceval.c, 3436)
python33!function_call+0000015D (c:\users\martin\33.amd64\python\objects\funcobject.c, 639)
python33!PyObject_Call+00000061 (c:\users\martin\33.amd64\python\objects\abstract.c, 2036)
python33!ext_do_call+00000295 (c:\users\martin\33.amd64\python\python\ceval.c, 4381)
python33!PyEval_EvalFrameEx+00002041 (c:\users\martin\33.amd64\python\python\ceval.c, 2723)
python33!PyEval_EvalCodeEx+0000065C (c:\users\martin\33.amd64\python\python\ceval.c, 3436)
python33!fast_function+0000014D (c:\users\martin\33.amd64\python\python\ceval.c, 4168)
python33!call_function+00000339 (c:\users\martin\33.amd64\python\python\ceval.c, 4088)
python33!PyEval_EvalFrameEx+00001F98 (c:\users\martin\33.amd64\python\python\ceval.c, 2681)
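
The trace above goes from etree.fromstring through htmlCtxtReadMemory and ends in xmlAddID and xmlDictLookup, which suggests (but does not prove) that the growth is tied to id attributes seen while parsing HTML from a string. A minimal sketch of a reproduction under that assumption follows. The helper names (make_html, parse_in_sequence, make_lxml_parser) and the document shape are hypothetical and not taken from the original report; memory is meant to be observed externally, for example with umdh snapshots taken before and after the loop.

```python
def make_html(i, ids_per_doc=100):
    """Build a small HTML document whose id attributes are unique to document i.

    Assumption: unique id values are what drives the xmlAddID path in the trace.
    """
    items = "".join(
        '<p id="doc{}-item{}">text</p>'.format(i, j) for j in range(ids_per_doc)
    )
    return "<html><body>{}</body></html>".format(items)


def parse_in_sequence(parse, count):
    """Parse `count` generated documents one after another, keeping no references."""
    parsed = 0
    for i in range(count):
        root = parse(make_html(i))
        del root  # drop the tree right away so only leaked memory can remain
        parsed += 1
    return parsed


def make_lxml_parser():
    """Return a callable that parses a str with lxml's HTML parser, or None if lxml is missing."""
    try:
        from lxml import etree
    except ImportError:
        return None
    parser = etree.HTMLParser()
    return lambda text: etree.fromstring(text, parser)


if __name__ == "__main__":
    parse = make_lxml_parser()
    if parse is None:
        print("lxml is not installed; nothing to reproduce")
    else:
        # Take a memory snapshot here, run the loop, then take a second one.
        print(parse_in_sequence(parse, 2000), "documents parsed")
```

If process memory keeps growing with the number of documents even though every tree is dropped immediately, that would be consistent with the allocation site shown in the trace.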

For reference, here are my version numbers:

Python              : sys.version_info(major=3, minor=3, micro=5, releaselevel='final', serial=0)
lxml.etree          : (3, 6, 4, 0)
libxml used         : (2, 9, 4)
libxml compiled     : (2, 9, 4)
libxslt used        : (1, 1, 29)
libxslt compiled    : (1, 1, 29)
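
For anyone who wants to compare against their own setup, a listing in the same layout can be produced with a short script. This is a sketch, not necessarily how the listing above was generated; collect_versions is a hypothetical helper name, while the version constants are attributes exposed by lxml.etree.

```python
import sys


def collect_versions():
    """Return (label, version) pairs for Python and, if available, lxml and its libraries."""
    versions = [("Python", sys.version_info)]
    try:
        from lxml import etree
    except ImportError:
        # Without lxml only the Python version can be reported.
        return versions
    versions += [
        ("lxml.etree", etree.LXML_VERSION),
        ("libxml used", etree.LIBXML_VERSION),
        ("libxml compiled", etree.LIBXML_COMPILED_VERSION),
        ("libxslt used", etree.LIBXSLT_VERSION),
        ("libxslt compiled", etree.LIBXSLT_COMPILED_VERSION),
    ]
    return versions


for label, version in collect_versions():
    # Pad labels to 20 columns so the colons line up.
    print("{:<20}: {}".format(label, version))
```

The "used" and "compiled" values can differ when lxml is linked dynamically, so reporting both helps rule out a mismatch between the headers lxml was built against and the library loaded at runtime.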

Should I open a new bug?

Thanks!

Benoit Bernard
https://benbernardblog.com
