[XML-SIG] unicode, latin-1 and DOM...

Thomas B. Passin tpassin@home.com
Thu, 28 Jun 2001 08:08:05 -0400


I tried out your example (by cut-and-paste) in XML Cooktop, which uses the
Microsoft parser.  Sure enough, it didn't validate, complaining about the
illegal characters.  Once I added an XML declaration with
encoding='iso-8859-1', it was happy.  Just what you'd expect.  This was on
Windows 98.
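
For anyone who wants to reproduce the same behaviour from Python, here is a
rough sketch.  It assumes a current Python 3 and the standard library's
xml.dom.minidom (which sits on expat), not the PyXML Sax2 reader from the
original example, and the names raw, parses, declared and text are only
there for illustration.  With no declaration the parser has to assume UTF-8,
so the Latin-1 bytes are rejected; with the declaration they decode fine.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# The example document as Latin-1 bytes, i.e. what a Latin-1 editor would save.
# "\u00e9" is e-acute, so this is <d>ete</d> with accents on both e's.
raw = "<d>\u00e9t\u00e9</d>".encode("latin-1")


def parses(data):
    """Return True if minidom accepts the byte string, False on a parse error."""
    try:
        minidom.parseString(data)
        return True
    except ExpatError:
        return False


# No declaration: the parser assumes UTF-8 and the 0xE9 bytes are invalid.
undeclared_ok = parses(raw)

# Same bytes with an explicit encoding declaration in front.
declared = b"<?xml version='1.0' encoding='iso-8859-1'?>" + raw
doc = minidom.parseString(declared)
text = doc.documentElement.firstChild.data

print(undeclared_ok)
print(text == "\u00e9t\u00e9")
```

The point is only that the declaration, not the bytes, is what changes the
outcome; a parser that silently drops the text instead of raising an error
is a separate problem.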

I'd been hoping that the encoding issues would quietly go away with
Python 2.0+, but I knew it wasn't really going to happen so easily!

Cheers,

Tom P

[Alexandre Fayolle]>
> I'm struggling with unicode and stuff (so expect some mails in the coming
> days). Here's the first one. I'm aware that the XML document being parsed
> is not correct (no encoding header), but I'm surprised by the result I get:
>
> >>> from xml.dom.ext.reader import Sax2
> >>> d = Sax2.FromXml('<d>été</d>')
> >>> from xml.dom.ext import PrettyPrint
> >>> PrettyPrint(d)
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE d>
> <d/>
> >>> d.documentElement
> <Element Node at 81b14c4: Name='d' with 0 attributes and 0 children>
>