[issue41926] Unpredictable behavior when parsing xml. (xml.etree.ElementTree.iterparse)

Джон Смит report at bugs.python.org
Sun Oct 4 07:03:56 EDT 2020


New submission from Джон Смит <kibertitan at gmail.com>:

Data is lost when parsing large files with xml.etree.ElementTree.iterparse.
I have prepared five test files covering different cases.
They show that the losses are not random.
In example.xml, the data is lost on reaching iteration 717 (i = 717).

In the other files, the data loss occurs when the number of characters changes. It looks like some kind of buffer overflow.
The data in example5.xml was randomly generated with generator.py.

I prepared several XML files to show that this is not an error in the input data but a problem in the library itself.

I tried to trace the cause and concluded that the bug lies in the compiled file.

In the ElementTree.py library file, the line
"events = self._events_queue"
yields an empty list. This can be seen at iteration 717 in example.xml.
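
A minimal sketch for checking this kind of loss without the attached files. It does not reproduce the report's own script, which is not shown here. It assumes elements are inspected on "start" or "end" events; build_sample and count_missing_text are hypothetical helpers, and the generated data only stands in for example.xml. The iterparse documentation states that text and tail are undefined at a "start" event, so the count for "start" may or may not be zero, while every element should be fully populated at "end".

```python
import io
import xml.etree.ElementTree as ET


def build_sample(n_items):
    # Hypothetical stand-in for example.xml:
    # <root><item>value-0</item><item>value-1</item>...</root>
    parts = ["<root>"]
    parts.extend(f"<item>value-{i}</item>" for i in range(n_items))
    parts.append("</root>")
    return "".join(parts).encode("utf-8")


def count_missing_text(data, event):
    # Count <item> elements whose text is empty or None at the moment
    # iterparse yields the given event for them.
    seen = 0
    missing = 0
    for _, elem in ET.iterparse(io.BytesIO(data), events=(event,)):
        if elem.tag != "item":
            continue
        seen += 1
        if not elem.text:
            missing += 1
    return seen, missing


# Large enough that the input spans several internal reads.
data = build_sample(5000)

seen_end, missing_end = count_missing_text(data, "end")
seen_start, missing_start = count_missing_text(data, "start")

print("end events:  ", seen_end, "items,", missing_end, "without text")
print("start events:", seen_start, "items,", missing_start, "without text")
```

If the count for "end" events is zero while the count for "start" events is not, the missing data is consistent with the documented behavior of "start" events rather than with corrupted input.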

----------
components: XML
messages: 377928
nosy: kibertitan
priority: normal
severity: normal
status: open
title: Unpredictable behavior when parsing xml. (xml.etree.ElementTree.iterparse)
type: behavior
versions: Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41926>
_______________________________________

