[issue6233] ElementTree (py3k) doesn't properly encode characters that can't be represented in the specified encoding
Fredrik Lundh
report at bugs.python.org
Sun Jun 21 23:42:04 CEST 2009
Fredrik Lundh <fredrik at effbot.org> added the comment:
Did you look at the 1.3 alpha code base when you came up with this idea?
Unfortunately, 1.3's _encode is used for a different purpose...
I don't have time to test it tonight, but I suspect that 1.3's
escape_data/escape_attrib functions might work better under 3.X; they do
the text.replace dance first, and then an explicit text.encode(encoding,
"xmlcharrefreplace") at the end. E.g.
    def _escape_cdata(text, encoding):
        # escape character data
        try:
            # it's worth avoiding do-nothing calls for strings that are
            # shorter than 500 characters, or so.  assume that's, by far,
            # the most common case in most applications.
            if "&" in text:
                text = text.replace("&", "&amp;")
            if "<" in text:
                text = text.replace("<", "&lt;")
            if ">" in text:
                text = text.replace(">", "&gt;")
            # encode last, so unencodable characters become &#NNN; references
            return text.encode(encoding, "xmlcharrefreplace")
        except (TypeError, AttributeError):
            _raise_serialization_error(text)
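
As a rough, untested-against-ElementTree illustration of why the replace-then-encode order matters, here is a minimal standalone sketch. The name escape_cdata is a hypothetical stand-in, not the 1.3 function itself, and it skips the "in" checks and the error handling:

```python
def escape_cdata(text, encoding):
    """Hypothetical stand-in: escape markup, then encode with char refs."""
    # Replace markup characters first, while text is still a str.
    # "&" must go first so the entities added below are not re-escaped.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    # Encode last: anything the target encoding cannot represent
    # is emitted as a numeric character reference such as &#8364;.
    return text.encode(encoding, "xmlcharrefreplace")


sample = "a < b & caf\u00e9 \u20ac"
print(escape_cdata(sample, "ascii"))
print(escape_cdata(sample, "iso-8859-1"))
print(escape_cdata(sample, "utf-8"))
```

Under ASCII both non-ASCII characters should come out as character references; under ISO-8859-1 only the euro sign should; under UTF-8 neither. Doing the encode step before the replace calls would presumably turn the "&" of each generated reference into "&amp;", which is why the encode goes at the end.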
----------
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue6233>
_______________________________________