[New-bugs-announce] [issue27899] Apostrophe is not replace with &apos; ElementTree.tostring (also in Element.write)
Israel Fruchter
report at bugs.python.org
Tue Aug 30 13:18:25 EDT 2016
New submission from Israel Fruchter:
On both Python 2.7 and Python 3.4:
>>> from xml.etree import cElementTree as ET
>>> text = '<end>its &gt; &lt; &amp; &apos;</end>'
>>> root = ET.fromstring(text.encode('utf-8'))
>>> ET.tostring(root, method="xml")
<end>its &gt; &lt; &amp; '</end>
I would have expected this to return the same text as the input, so that it is compliant XML 1.0.
I would understand if it returned something different for HTML; see:
http://stackoverflow.com/questions/2083754/why-shouldnt-apos-be-used-to-escape-single-quotes
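The session above can also be written as a standalone script. This is a minimal sketch, assuming Python 3 and the `xml.etree.ElementTree` module name (the `cElementTree` alias used above was removed in Python 3.9); `roundtrip` is a hypothetical helper name, not part of ElementTree.

```python
from xml.etree import ElementTree as ET


def roundtrip(xml_text):
    """Parse xml_text and serialize it back to a str (hypothetical helper)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # encoding="unicode" returns str instead of bytes and omits the XML declaration
    return ET.tostring(root, encoding="unicode", method="xml")


source = "<end>its &gt; &lt; &amp; &apos;</end>"
result = roundtrip(source)

# &gt;, &lt; and &amp; come back escaped; &apos; comes back as a literal apostrophe
print(result)

# Both forms parse to the same element text, so only the serialized form differs
print(ET.fromstring(source).text == ET.fromstring(result).text)
```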
As a workaround I had to patch ElementTree:
from xml.etree.ElementTree import _escape_cdata, _raise_serialization_error
from mock import patch

def _escape_cdata(text):
    # escape character data
    try:
        # it's worth avoiding do-nothing calls for strings that are
        # shorter than 500 character, or so. assume that's, by far,
        # the most common case in most applications.
        if "&" in text:
            text = text.replace("&", "&amp;")
        if "<" in text:
            text = text.replace("<", "&lt;")
        if ">" in text:
            text = text.replace(">", "&gt;")
        # added: also escape the apostrophe
        if "'" in text:
            text = text.replace("'", "&apos;")
        return text
    except (TypeError, AttributeError):
        _raise_serialization_error(text)

from xml.etree import cElementTree as ET
text = '<end>its &gt; &lt; &amp; &apos;</end>'
root = ET.fromstring(text.encode('utf-8'))
# temporarily swap in the replacement while serializing
with patch('xml.etree.ElementTree._escape_cdata', new=_escape_cdata):
    s = ET.tostring(root, encoding='unicode', method="xml")
    print(s)
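The same idea can be sketched without the `mock` dependency. This variant is an assumption-laden illustration, not a recommended fix: it relies on `_escape_cdata` being a private, module-level function that the pure-Python serializer looks up at call time, which is an implementation detail and may change. The names `escape_cdata_with_apos` and `tostring_with_apos` are hypothetical, and the stdlib `xml.sax.saxutils.escape` is swapped in for the hand-written replacement above, so the original's `_raise_serialization_error` handling for non-string text is not reproduced here.

```python
import xml.etree.ElementTree as ET_module
from xml.sax.saxutils import escape


def escape_cdata_with_apos(text):
    # saxutils.escape handles &, < and >; the extra mapping adds the apostrophe
    return escape(text, {"'": "&apos;"})


def tostring_with_apos(root):
    """Serialize root with apostrophes in text written as &apos; (hypothetical helper)."""
    original = ET_module._escape_cdata  # private name: implementation detail
    ET_module._escape_cdata = escape_cdata_with_apos
    try:
        return ET_module.tostring(root, encoding="unicode", method="xml")
    finally:
        # always restore the stdlib function, even if serialization fails
        ET_module._escape_cdata = original


root = ET_module.fromstring("<end>its &gt; &lt; &amp; &apos;</end>")
print(tostring_with_apos(root))
```

The `try`/`finally` plays the role of the `with patch(...)` block: the replacement is active only for the duration of one `tostring` call.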
----------
components: XML
messages: 273937
nosy: fruch
priority: normal
severity: normal
status: open
title: Apostrophe is not replace with &apos; ElementTree.tostring (also in Element.write)
type: behavior
versions: Python 2.7, Python 3.4
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue27899>
_______________________________________