[ python-Bugs-938076 ] XMLGenerator ignores encoding in output

SourceForge.net noreply at sourceforge.net
Tue Apr 20 15:47:16 EDT 2004


Bugs item #938076, was opened at 2004-04-19 19:18
Message generated for change (Settings changed) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=938076&group_id=5470

Category: XML
Group: Python 2.3
Status: Open
Resolution: None
>Priority: 7
Submitted By: Magnus Lie Hetland (mlh)
>Assigned to: Martin v. Löwis (loewis)
Summary: XMLGenerator ignores encoding in output

Initial Comment:
When XMLGenerator is given an encoding such as 'utf-8'
and is then passed non-ASCII Unicode characters, it
fails in its characters() method. The current version
is:

def characters(self, content):
    self._out.write(escape(content))

This ignores the encoding entirely. When writing to
something such as a StringIO, the output layer simply
tries to convert the content into an ASCII string. The
encoding is used only in the XML declaration, not as
the actual output encoding!

It may be that I've gotten things wrong, but I would
suggest the following fix:

def characters(self, content):
    self._out.write(escape(content).encode(self._encoding))

This seems to work well for me, at least.
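
As a minimal sketch of the suggested change, written for
Python 3 rather than the 2.3 code discussed here: the escaped
text is encoded with the declared encoding before it reaches a
byte stream. The helper name write_characters and its
parameters are hypothetical and not part of xml.sax.saxutils.

```python
import io
from xml.sax.saxutils import escape

def write_characters(out, content, encoding="utf-8"):
    # Hypothetical helper: escape markup characters first, then encode
    # the result with the declared output encoding before writing bytes.
    out.write(escape(content).encode(encoding))

buf = io.BytesIO()
write_characters(buf, "caf\u00e9 & cr\u00e8me")
result = buf.getvalue()
```

Note that a strict encode like this still raises
UnicodeEncodeError when a character cannot be represented in
the chosen encoding (for example, 'ascii').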

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2004-04-20 21:46

Message:

In general, it would be even better to generate character
references for characters not representable in the output
encoding.
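
One way this could look, as a sketch in Python 3 and not
necessarily what the eventual fix does: the standard
"xmlcharrefreplace" error handler of str.encode() emits numeric
character references for anything the codec cannot represent.
The helper name encode_with_charrefs is hypothetical.

```python
from xml.sax.saxutils import escape

def encode_with_charrefs(content, encoding):
    # Hypothetical helper: escape markup first, then encode. Characters
    # the codec cannot represent become numeric character references
    # (&#NNN;) instead of raising UnicodeEncodeError.
    return escape(content).encode(encoding, "xmlcharrefreplace")

ascii_bytes = encode_with_charrefs("L\u00f6wis <x>", "ascii")
latin1_bytes = encode_with_charrefs("L\u00f6wis \u20ac", "iso-8859-1")
```

Escaping has to happen before encoding here; otherwise the
"&" of each generated reference would itself be escaped.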

----------------------------------------------------------------------



