[issue41926] Unpredictable behavior when parsing xml. (xml.etree.ElementTree.iterparse)
Джон Смит
report at bugs.python.org
Sun Oct 4 07:03:56 EDT 2020
New submission from Джон Смит <kibertitan at gmail.com>:
Data is lost when parsing large files with xml.etree.ElementTree.iterparse.
I have prepared 5 test files covering different cases.
They show that the losses are not random.
In example.xml, the data is lost at iteration 717 (i = 717).
In the other files, the data loss occurs at the points where the number of characters changes. It looks like some kind of buffer overflow.
In example5.xml, the data is randomly generated by generator.py.
I prepared several XML files to show that the problem is not in the input data but in the library itself.
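
For anyone trying to reproduce this without the attached files, a self-check might look like the sketch below. It is not the reporter's code: build_xml, read_texts and the items/item layout are hypothetical names chosen for illustration. It builds a document larger than one read chunk, parses it with iterparse on "end" events, and compares the result with what was written. The documentation states that an element's text is only guaranteed to be populated at the "end" event, so that is the event used here.

```python
import io
import xml.etree.ElementTree as ET


def build_xml(n_items):
    """Build an in-memory XML document with n_items <item> children (hypothetical layout)."""
    root = ET.Element("items")
    for i in range(n_items):
        item = ET.SubElement(root, "item", id=str(i))
        item.text = f"value-{i}"
    return ET.tostring(root)  # bytes


def read_texts(xml_bytes):
    """Collect the text of every <item>, reading it only on "end" events."""
    texts = []
    for event, elem in ET.iterparse(io.BytesIO(xml_bytes), events=("end",)):
        if elem.tag == "item":
            texts.append(elem.text)
    return texts


n_items = 5000
xml_bytes = build_xml(n_items)
# Assumption: iterparse reads the source in chunks of about 16 KiB in CPython 3.8,
# so the document is made much larger than that to cross many chunk boundaries.
texts = read_texts(xml_bytes)
expected = [f"value-{i}" for i in range(n_items)]
missing = [i for i, (got, want) in enumerate(zip(texts, expected)) if got != want]
print(len(texts), "items read,", len(missing), "mismatches")
```

If the same comparison is run with events=("start",), any mismatches it reports would be consistent with the documented behavior rather than with lost data, since the text is not guaranteed to be available at that point.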
I tried to trace the cause and concluded that the bug lies in the compiled code.
In the ElementTree.py library file, the line
"events = self._events_queue"
yields an empty list. This can be seen at iteration 717 in example.xml.
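
To put the empty queue in context, here is a small sketch using the public XMLPullParser class, which, as far as I can tell, is what iterparse drives internally in Python 3.8 (treat that as an assumption). It feeds a document in two pieces, with the split point chosen arbitrarily for illustration, and shows that the event queue can be empty after a feed that ends in the middle of an element.

```python
import xml.etree.ElementTree as ET

parser = ET.XMLPullParser(events=("end",))

# First piece stops in the middle of the <item> text: no element has closed yet.
parser.feed("<root><item>hel")
first = list(parser.read_events())

# Second piece completes the text and closes both elements.
parser.feed("lo</item></root>")
second = [(event, elem.tag, elem.text) for event, elem in parser.read_events()]

print(first)   # no events were available after the first piece
print(second)  # events for <item> and <root>, with the item text joined
```

An empty queue between feeds is therefore not by itself a sign of lost data; whether that explains iteration 717 in example.xml would need to be checked against the attached file.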
----------
components: XML
messages: 377928
nosy: kibertitan
priority: normal
severity: normal
status: open
title: Unpredictable behavior when parsing xml. (xml.etree.ElementTree.iterparse)
type: behavior
versions: Python 3.8
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue41926>
_______________________________________