[New-bugs-announce] [issue43703] xml.etree parser does not accept valid control characters

Romuald Brunet report at bugs.python.org
Fri Apr 2 07:03:35 EDT 2021


New submission from Romuald Brunet <romuald.brunet at gmail.com>:

The Python XML parser (xml.etree) does not seem to accept control characters that are invalid in XML 1.0 but valid in XML 1.1 [1] [2].


Consider the following sample:


import xml.etree.ElementTree as ET

bad = '<?xml version="1.1"?><foo>bar &#x19; baz</foo>'
print(ET.fromstring(bad))


The parser raises the following error:
ParseError: reference to invalid character number: line 1, column 30
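To narrow the report down, the sketch below contrasts the failing reference with a tab reference (&#x9;), which is legal in both XML 1.0 and XML 1.1. The helper name try_parse and the variable names are illustrative only and are not part of xml.etree. It assumes the underlying parser accepts the version="1.1" declaration itself, which the column reported in the error above suggests.

```python
import xml.etree.ElementTree as ET


def try_parse(doc):
    """Parse doc and return ("ok", root text) or ("error", message)."""
    try:
        return ("ok", ET.fromstring(doc).text)
    except ET.ParseError as exc:
        return ("error", str(exc))


# &#x19; is a control character that XML 1.1 permits as a character reference.
restricted = '<?xml version="1.1"?><foo>bar &#x19; baz</foo>'
# &#x9; (tab) is permitted by both XML 1.0 and XML 1.1.
tab = '<?xml version="1.1"?><foo>bar &#x9; baz</foo>'

result_restricted = try_parse(restricted)
result_tab = try_parse(tab)

print(result_restricted)
print(result_tab)
```

If the tab variant parses while the &#x19; variant fails, the rejection is tied to the character number rather than to the 1.1 declaration.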



[1] https://www.w3.org/TR/xml11/Overview.html#charsets
[2] https://www.w3.org/TR/xml11/Overview.html#sec-xml11

----------
components: XML
messages: 390050
nosy: Romuald
priority: normal
severity: normal
status: open
title: xml.etree parser does not accept valid control characters
versions: Python 3.9

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue43703>
_______________________________________

