[New-bugs-announce] [issue7139] Incorrect serialization of end-of-line characters in attribute values
Moriyoshi Koizumi
report at bugs.python.org
Thu Oct 15 08:21:30 CEST 2009
New submission from Moriyoshi Koizumi <mozo+python at mozo.jp>:
ElementTree doesn't correctly serialize end-of-line characters (#xA,
#xD) in attribute values. The parser converts bare end-of-line
characters in attribute values to #x20, as the specification requires
[1], so an end-of-line character that survives parsing must have been
written as a character reference in the original document. Such
characters must be serialized as character references again; otherwise
they are lost when the output is re-parsed.
[1] http://www.w3.org/TR/xml11/#AVNormalize
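The normalization rule described above can be checked on its own, without relying on how any particular ElementTree version serializes attributes. The following is a minimal sketch in Python 3 syntax (the report itself targets Python 2.6). The helper escape_attr_value is hypothetical and written only for this illustration; it is not ElementTree's serializer.

```python
import xml.etree.ElementTree as ET


def escape_attr_value(value):
    """Hypothetical helper (not part of ElementTree): escape markup
    characters and write tab, LF and CR as character references."""
    replacements = [
        ("&", "&amp;"),   # must be first so later entities are not re-escaped
        ("<", "&lt;"),
        ('"', "&quot;"),
        ("\t", "&#9;"),
        ("\n", "&#10;"),
        ("\r", "&#13;"),
    ]
    for raw, escaped in replacements:
        value = value.replace(raw, escaped)
    return value


original = " test\r\ntest\ta "

# Bare whitespace characters in the attribute: the parser normalizes
# them, so the end-of-line characters do not survive.
bare_value = ET.fromstring('<foo attr="%s"/>' % original).get("attr")

# The same value written with character references: the parser returns
# the characters unchanged.
escaped_value = ET.fromstring(
    '<foo attr="%s"/>' % escape_attr_value(original)
).get("attr")

print(repr(bare_value))
print(repr(escaped_value))
```

A serializer that emits the bare form produces output that no longer round-trips, which is the behavior this report describes; one that emits character references, as the sketch does, preserves the value.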
### sample code
from xml.etree.ElementTree import ElementTree
from cStringIO import StringIO
# builder = ElementTree(file=StringIO("<foo>\x0d</foo>"))
# out = StringIO()
# builder.write(out)
# print out.getvalue()
out = StringIO()
ElementTree(file=StringIO(
'''<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
<!ELEMENT foo (#PCDATA)>
<!ATTLIST foo attr CDATA "">
]>
<foo attr=" test
test  test a "> </foo>
''')).write(out)
# should be "<foo attr=" test test test a ">\x0a</foo>
print out.getvalue()
out = StringIO()
ElementTree(file=StringIO(
'''<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
<!ELEMENT foo (#PCDATA)>
<!ATTLIST foo attr NMTOKENS "">
]>
<foo attr=" test
test  test a "> </foo>
''')).write(out)
# should be "<foo attr="test test test a">\x0a</foo>
print out.getvalue()
----------
components: XML
messages: 94074
nosy: moriyoshi
severity: normal
status: open
title: Incorrect serialization of end-of-line characters in attribute values
type: behavior
versions: Python 2.6
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue7139>
_______________________________________