
Hi!

Benoit Bernard wrote on 23.11.2016 at 19:44:
Have there been any advancements regarding this memory leak?
I built the newest version of lxml (as well as its dependencies) and the problem is still there. I was able to track it down using umdh on Windows:
etree!xmlDictLookup+0000025E (c:\tmp\libxml2-win-binaries\libxml2\dict.c, 933)
etree!xmlHashAddEntry3+00000053 (c:\tmp\libxml2-win-binaries\libxml2\hash.c, 532)
etree!xmlHashAddEntry+00000014 (c:\tmp\libxml2-win-binaries\libxml2\hash.c, 377)
etree!xmlAddID+0000011D (c:\tmp\libxml2-win-binaries\libxml2\valid.c, 2632)
etree!xmlSAX2AttributeInternal+0000078A (c:\tmp\libxml2-win-binaries\libxml2\sax2.c, 1411)
etree!xmlSAX2StartElement+000002AE (c:\tmp\libxml2-win-binaries\libxml2\sax2.c, 1743)
By default, lxml configures the parser to collect and remember IDs used in the documents. The dict that stores the names is shared globally in order to reduce overall memory consumption across documents. You can disable this for ID names by creating a parser with the option collect_ids=False.

Stefan
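
For reference, the collect_ids=False option mentioned above could be used roughly as follows. This is only a minimal sketch: the sample document and the helper name parse_without_id_collection are made up for illustration, and it assumes an lxml version recent enough to accept collect_ids on XMLParser.

```python
from lxml import etree

# Hypothetical sample input; xml:id attributes are the kind of ID
# that the parser would normally collect while parsing.
XML = b'<root><item xml:id="a1"/><item xml:id="a2"/></root>'


def parse_without_id_collection(data):
    """Parse XML bytes with a parser that does not collect IDs."""
    # collect_ids=False asks lxml not to build the per-document ID table.
    parser = etree.XMLParser(collect_ids=False)
    return etree.fromstring(data, parser)


root = parse_without_id_collection(XML)
# The attributes themselves are still present on the elements.
print(len(root), root[0].get("{http://www.w3.org/XML/1998/namespace}id"))
```

The trade-off, as far as I can tell, is that lookups relying on the collected ID table may no longer find elements in documents parsed this way, so this is probably best suited to code that does not need ID-based lookups.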