[XML-SIG] Re: UnicodeError: ASCII encoding error: ordinal not in range(128)

Walter Dörwald walter at livinglogic.de
Wed Nov 26 07:06:48 EST 2003


Fredrik Lundh wrote:

> David Gaya wrote:
> 
>>UnicodeError: ASCII encoding error: ordinal not in range(128)
> 
> note that this is generated by the print statement, not the XML library.
> 
>>Instead of 'print xmldoc.toxml()' I've tried
>>   s = xmldoc.toxml()
>>   print s.encode('UTF-8')
>>But then  is encoded in a single character.
>>Does anybody know how can I print (or save) back the modified document
>>keeping the  format ?
> 
> if you're using Python 2.3, you can use the "xmlcharrefreplace"
> error handler:
> 
>     print xmldoc.toxml().encode("ascii", "xmlcharrefreplace")
> 
>     <?xml version="1.0" ?>
>     <StyleBox>
>        <String>&#63482;</String>
>     </StyleBox>

But this will escape characters even inside comments or processing
instructions, where character references are not recognized.
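
A small sketch of that problem, written for Python 3 rather than the
Python 2.3 used in this thread (the sample document with a comment is
made up for illustration):

```python
from xml.dom import minidom

# U+F7FA is the character from the traceback in this thread (decimal 63482).
# The comment in this sample document is an assumption added for the demo.
source = "<StyleBox><!-- \uf7fa --><String>\uf7fa</String></StyleBox>"
doc = minidom.parseString(source.encode("utf-8"))

# Encode the whole serialized document with the xmlcharrefreplace handler.
out = doc.toxml().encode("ascii", "xmlcharrefreplace")

# Parse the result back and compare the two places the character appeared.
reparsed = minidom.parseString(out)
comment = reparsed.documentElement.firstChild
text = reparsed.getElementsByTagName("String")[0].firstChild

print(repr(text.data))     # the reference is resolved back to the character
print(repr(comment.data))  # the comment keeps the literal reference text
```

The text node round-trips, but the comment no longer contains the
original character, because a parser does not expand references there.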

> the toxml method takes an encoding attribute, but that doesn't do
> anything even remotely useful in this case:
> 
> print xmldoc.toxml(encoding='us-ascii')
> 
> Traceback (most recent call last):
>     | snip |
>   File "C:\python23\lib\xml\dom\minidom.py", line 303, in _write_data
>     writer.write(data)
>   File "C:\python23\lib\codecs.py", line 178, in write
>     data, consumed = self.encode(object, self.errors)
> UnicodeEncodeError: 'ascii' codec can't encode character u'\uf7fa' in position
> 0: ordinal not in range(128)

PyXML's xml.sax.saxutils.XMLGenerator will do the correct thing (i.e.
escape characters in text nodes, but raise an error for unencodable
characters elsewhere).
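
Applied to a minidom tree, that behaviour could look roughly like the
sketch below. It is neither PyXML's code nor a minidom API: serialize()
is a hypothetical helper written for Python 3, it emits no XML
declaration, and allowing references in attribute values is an
assumption of the sketch.

```python
from xml.dom import minidom, Node
from xml.sax.saxutils import escape, quoteattr


def serialize(node, encoding="ascii"):
    """Hypothetical helper (not part of minidom): serialize a subtree to bytes."""
    def strict(s):
        # Raises UnicodeEncodeError when a character cannot be encoded.
        return s.encode(encoding)

    def with_refs(s):
        # Falls back to numeric character references such as &#63482;.
        return s.encode(encoding, "xmlcharrefreplace")

    kind = node.nodeType
    if kind == Node.DOCUMENT_NODE:
        return b"".join(serialize(child, encoding) for child in node.childNodes)
    if kind == Node.TEXT_NODE:
        return with_refs(escape(node.data))
    if kind == Node.COMMENT_NODE:
        return strict("<!--%s-->" % node.data)
    if kind == Node.PROCESSING_INSTRUCTION_NODE:
        return strict("<?%s %s?>" % (node.target, node.data))
    if kind == Node.ELEMENT_NODE:
        name = strict(node.tagName)
        # Assumption of this sketch: attribute values may use references too.
        attrs = b"".join(
            b" " + strict(key) + b"=" + with_refs(quoteattr(value))
            for key, value in node.attributes.items()
        )
        body = b"".join(serialize(child, encoding) for child in node.childNodes)
        return b"<" + name + attrs + b">" + body + b"</" + name + b">"
    raise ValueError("node type %r is not handled by this sketch" % kind)


good = minidom.parseString(
    "<StyleBox><String>\uf7fa</String></StyleBox>".encode("utf-8"))
print(serialize(good))

bad = minidom.parseString(
    "<StyleBox><!-- \uf7fa --></StyleBox>".encode("utf-8"))
try:
    serialize(bad)
except UnicodeEncodeError as exc:
    print("refused:", exc.reason)
```

Text content comes out as a character reference, while the comment
makes the call fail instead of silently producing different content.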

Are there any plans to add the same functionality to minidom?

Bye,
    Walter Dörwald




