[XML-SIG] unicode, latin-1 and DOM...
Thomas B. Passin
tpassin@home.com
Thu, 28 Jun 2001 08:08:05 -0400
I tried out your example (by cut-and-paste) in XML Cooktop, which uses the
Microsoft parser. Sure enough, it didn't validate, complaining about the
illegal characters. Once I added an XML declaration with
encoding='iso-8859-1', it was happy. Just what you'd expect. This was on
Windows 98.
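
For anyone who wants to reproduce the same behaviour without the Microsoft
parser, here is a minimal sketch. It assumes a Python 3 interpreter and uses
the standard library's xml.dom.minidom (expat underneath) rather than the
PyXML Sax2 reader from the original example, so it only illustrates the
principle. The helper name try_parse is made up for this sketch.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# "ete" with acute accents, encoded as ISO-8859-1 bytes, the way a
# Latin-1 editor or terminal would hand it to the parser.
RAW = "<d>\u00e9t\u00e9</d>".encode("iso-8859-1")
DECL = b"<?xml version='1.0' encoding='iso-8859-1'?>"


def try_parse(data):
    """Hypothetical helper: return the root element's text, or None on a parse error."""
    try:
        doc = minidom.parseString(data)
    except ExpatError:
        return None
    root = doc.documentElement
    return "".join(n.data for n in root.childNodes if n.nodeType == n.TEXT_NODE)


# No declaration: the parser assumes UTF-8, and the bytes are not valid UTF-8.
without_decl = try_parse(RAW)
# With a declaration naming the real encoding, the same bytes parse.
with_decl = try_parse(DECL + RAW)

print(repr(without_decl))
print(repr(with_decl))
```

The point is only that the bytes are identical in both calls; the declaration
is what tells the parser how to read them.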
I'd been hoping that the encoding issues would quietly go away with
Python 2.0+, but I knew it wasn't really going to happen so easily!
Cheers,
Tom P
[Alexandre Fayolle]>
> I'm struggling with unicode and stuff (so expect some mails in the coming
> days). Here's the first one. I'm aware that the XML document being parsed
> is not correct (no encoding header), but I'm surprised by the result I get:
>
> >>> from xml.dom.ext.reader import Sax2
> >>> d = Sax2.FromXml('<d>été</d>')
> >>> from xml.dom.ext import PrettyPrint
> >>> PrettyPrint(d)
> <?xml version='1.0' encoding='UTF-8'?>
> <!DOCTYPE d>
> <d/>
> >>> d.documentElement
> <Element Node at 81b14c4: Name='d' with 0 attributes and 0 children>
>