[XML-SIG] [ pyxml-Patches-615114 ] saxutils.py: CharRef escaping
noreply@sourceforge.net
Thu, 26 Sep 2002 11:31:47 -0700
Patches item #615114, was opened at 2002-09-26 20:31
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=306473&aid=615114&group_id=6473
Category: SAX
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Carsten Oberscheid (oberscheid)
Assigned to: Nobody/Anonymous (nobody)
Summary: saxutils.py: CharRef escaping
Initial Comment:
saxutils.XMLGenerator selects a codec for output according to
the encoding argument given to its constructor. All output is written
through this codec, and any character in the data that doesn't fit the
selected encoding raises a UnicodeError.

The patch adds a cr_escape() function that replaces all characters with
codes > 127 by XML character references, so the output encoding can be
selected independently of the actual characters in the document.

This is done for character data and for attribute values, where CharRefs
are allowed. It is not done for element names, attribute names etc.,
where CharRefs are not allowed (although these can contain non-ASCII
characters as well -- those still have to fit the output encoding).
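
To make the idea concrete, here is a minimal sketch of what such a function might look like. It is not the code from the patch: only the name cr_escape() comes from the description above, the body is an assumption, and it is written for current Python rather than the Python 2.x of the time. escape() and quoteattr() are the existing helpers in xml.sax.saxutils.

```python
from xml.sax.saxutils import escape, quoteattr


def cr_escape(text):
    """Replace every character with a code above 127 by a decimal CharRef.

    Hypothetical reimplementation for illustration; the patch may differ.
    """
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


# Character data: escape markup characters first, then non-ASCII characters.
# Doing it the other way round would turn the "&" of each CharRef into "&amp;".
content = cr_escape(escape("Dörwald & Co."))
print(content)  # D&#246;rwald &amp; Co.

# Attribute values: quoteattr() escapes the value and adds the quotes.
attr = cr_escape(quoteattr("Dörwald"))
print(attr)  # "D&#246;rwald"
```

The result is pure ASCII, so it can be written through any ASCII-compatible codec without raising a UnicodeError. Element and attribute names are deliberately left alone, since CharRefs are not allowed there.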

It's a brute-force approach and it can be slow, but it should do what it's
supposed to do. Walter Dörwald pointed out that PEP 293 should make this
unnecessary in Python 2.3, but for Python < 2.3 it may be useful.
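
For comparison, the codec error handling that arrived with Python 2.3 covers the same ground through the "xmlcharrefreplace" error handler of the encode step. The short sketch below uses current Python syntax; the sample strings are made up for illustration.

```python
text = "Walter Dörwald"

# Characters the target encoding cannot represent become CharRefs
# instead of raising a UnicodeError.
encoded = text.encode("ascii", "xmlcharrefreplace")
print(encoded)  # b'Walter D&#246;rwald'

# With a wider encoding only the characters that do not fit are replaced:
# "ö" exists in ISO-8859-1, the euro sign does not.
latin = "ö €".encode("iso-8859-1", "xmlcharrefreplace")
print(latin)  # b'\xf6 &#8364;'
```

Unlike a blanket replacement of everything above 127, this keeps characters that the chosen encoding can represent and only falls back to CharRefs where needed.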

It's my first patch, so if there's anything wrong with it, give me a chance
to learn and tell me. If there's a better way to do it (I'm sure there is),
ditto.

Nearly forgot: the patch is against saxutils.py from 0.8.1, but I checked
the CVS version and it seemed to be unchanged.