[XML-SIG] Errors when using PrettyPrint Class (xml.dom.ext) and latin-1 characters (iso-8859-1) ...
mike.williams at globalgraphics.com
Mon Dec 12 18:27:23 CET 2005
Michel Charest did utter on 12/12/2005 15:51:
> COMMENT: As can be seen, when using Method1 (default encoding with
> iso8859-1), I get a UnicodeDecodeError. And when using Method2,
> explicitly encoding using unicode("élève", 'latin-1'), the PrettyPrint
> class does not raise an exception, but it garbles (does not correctly
> interpret) my latin-1 string (i.e. élève).
The xml.dom.ext PrettyPrint can only handle text node strings that are
either 7-bit ASCII byte strings or Unicode strings. Method1 fails because
your latin-1 encoded byte string contains 8-bit values that do not form
valid UTF-8 sequences, as the error message reports. Method2 is in fact
working: the output you see is the UTF-8 encoding of your text node
string. If you open the generated output in an editor that reads it as
UTF-8, you should see your original string.
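To make the difference between the two methods concrete, here is a minimal sketch that uses only Python's built-in codecs, not PrettyPrint itself. It is written in current Python syntax rather than the Python 2.4 of the original report, and the variable names are purely illustrative. It shows why the latin-1 bytes are rejected as UTF-8, and why correct UTF-8 output looks garbled when a viewer interprets it as latin-1.

```python
# "élève" as ISO-8859-1 (latin-1) bytes: one byte per character.
latin1_bytes = b"\xe9l\xe8ve"

# Method1 analogue: treating latin-1 bytes as UTF-8 fails, because 0xE9
# opens a multi-byte UTF-8 sequence and "l" is not a continuation byte.
try:
    latin1_bytes.decode("utf-8")
    method1_error = None
except UnicodeDecodeError as exc:
    method1_error = exc

# Method2 analogue: decode explicitly, then serialise as UTF-8.
text = latin1_bytes.decode("latin-1")
utf8_bytes = text.encode("utf-8")  # the two accented letters take 2 bytes each

# What a latin-1 viewer makes of those UTF-8 bytes (looks garbled).
seen_as_latin1 = utf8_bytes.decode("latin-1")

# What a UTF-8 viewer makes of the same bytes (the original string).
seen_as_utf8 = utf8_bytes.decode("utf-8")

print(method1_error)
print(repr(utf8_bytes))
print(seen_as_utf8 == text)
```

The bytes are identical in both views; only the encoding assumed by the viewer changes, which is why the output can look wrong in a latin-1 console while being valid UTF-8.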
> EXTRA DETAILS:
> * Running on Windows XP (sp2)
> * Python 2.4.2
> * PyXML 0.8.4
> * 4Suite 1.0b1
> * I have tried many other encoding formats such as utf8, utf-16, utf16-le,
> etc. with no luck !
The xml.dom.ext PrettyPrint can only produce UTF-8 output. There is a
bug report and patch on SourceForge to let it produce UTF-16 output.
I was just getting used to yesterday when today came.