[New-bugs-announce] [issue35502] Memory leak in xml.etree.ElementTree.iterparse

Jess Johnson report at bugs.python.org
Fri Dec 14 14:59:00 EST 2018


New submission from Jess Johnson <jess at grokcode.com>:

When given XML that would raise a ParseError, xml.etree.ElementTree.iterparse leaks memory if iteration is stopped before the ParseError is raised.

Example:


import gc
from io import StringIO
import xml.etree.ElementTree as etree

import objgraph


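def parse_xml():
    # The closing </ROOT> has no matching opening tag, so reading all
    # events from this input would raise ParseError.
    xml = """
      <LEVEL1>
      </LEVEL1>
    </ROOT>
    """
    parser = etree.iterparse(StringIO(initial_value=xml))
    for _, elem in parser:
        if elem.tag == 'LEVEL1':
            # Stop before the ParseError for the stray </ROOT> is raised.
            break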


def run():
    parse_xml()

    gc.collect()
    uncollected_elems = objgraph.by_type('Element')
    print(uncollected_elems)
    objgraph.show_backrefs(uncollected_elems, max_depth=15)


if __name__ == "__main__":
    run()


Output:
[<Element 'LEVEL1' at 0x10df712c8>]

Also see this gist which has an image showing the objects that are retained in memory: https://gist.github.com/grokcode/f89d5c5f1831c6bc373be6494f843de3
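
For a check that does not depend on objgraph, the following is a minimal sketch that uses only the standard library. The helper names count_live_elements and parse_and_break are hypothetical and chosen for illustration, and it assumes Element instances are visible through gc.get_objects(). The number it prints is expected to vary between interpreter versions, so it is a diagnostic aid and not a definitive test for the leak.

```python
import gc
import xml.etree.ElementTree as etree
from io import StringIO


def count_live_elements():
    """Hypothetical helper: count Element objects still tracked by the GC."""
    # Collect first so that ordinary cyclic garbage is not counted.
    gc.collect()
    return sum(1 for obj in gc.get_objects() if type(obj) is etree.Element)


def parse_and_break(xml):
    # Same pattern as the report: stop iterating before the ParseError
    # for the stray closing tag is raised.
    for _, elem in etree.iterparse(StringIO(xml)):
        if elem.tag == 'LEVEL1':
            break


if __name__ == "__main__":
    before = count_live_elements()
    parse_and_break("<LEVEL1>\n</LEVEL1>\n</ROOT>\n")
    # A positive difference suggests Element objects survived collection.
    print(count_live_elements() - before)
```

Comparing the count before and after the call, instead of looking at the absolute number, keeps Elements created elsewhere in the process out of the result.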

----------
components: XML
messages: 331861
nosy: jess.j
priority: normal
severity: normal
status: open
title: Memory leak in xml.etree.ElementTree.iterparse
type: resource usage
versions: Python 3.7

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue35502>
_______________________________________

