[XML-SIG] [URGENT] Problem with accent char
Olivier Deckmyn <odeckmyn@teaser.fr>
Wed, 10 Jan 2001 12:09:26 +0100
Hi all,
It looks like the parser is modifying my content :(
I have the following XML string:
"""
<?xml version="1.0" encoding="iso-8859-1"?>
<Xafp type="multimedia" uno="afp_wbs_doc_010110105314.g5kw25ak">
<Head>
<Name>GB-OTAN-santé</Name>
<DateReleased>20010110T105314Z</DateReleased>
<Source>AFP</Source>
</Head>
<NewsLines>
<HeadLine>La polémique loin d'être apaisée par l'annonce de tests à
Londres</HeadLine>
<DateLine>LONDRES</DateLine>
</NewsLines>
</Xafp>
"""
Note that there are accented characters (iso-8859-1) inside the <Name> and
<HeadLine> tags, and the encoding is properly declared in the header.
If I parse this string (using Sax2.FromXml(...), getElementsByTagName() and
nodes[0].firstChild.nodeValue), the <HeadLine> tag content becomes:
"""
La pol\303\251mique loin d'\303\252tre apais\303\251e par l'annonce de tests
\303\240 Londres
"""
It looks like there has been a Unicode (UTF-8?) conversion...
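To check the UTF-8 guess, here is a minimal sketch. It is written for a current Python 3, not the Python 1.5.2 I am running, and the helper name octal_escape is hypothetical, chosen only for illustration. It re-encodes the accented characters from the headline from ISO-8859-1 to UTF-8 and renders the non-ASCII bytes as octal escapes:

```python
# Accented characters from the headline, as ISO-8859-1 (Latin-1) bytes:
# \xe9 = e-acute, \xea = e-circumflex, \xe0 = a-grave.
latin1_bytes = b"pol\xe9mique d'\xeatre \xe0"

# Decode to Unicode text, then re-encode as UTF-8.
text = latin1_bytes.decode("iso-8859-1")
utf8_bytes = text.encode("utf-8")


def octal_escape(data):
    """Render ASCII bytes as-is and every other byte as a \\ooo octal escape."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in data)


escaped = octal_escape(utf8_bytes)
print(escaped)

# Going the other way recovers the original Latin-1 bytes unchanged.
round_trip = utf8_bytes.decode("utf-8").encode("iso-8859-1")
```

The escapes it prints (\303\251, \303\252, \303\240) are the same ones that appear in the parsed value above, which is consistent with the text having been re-encoded as UTF-8 rather than corrupted.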
What can I do to prevent this conversion? I don't want the parser to
modify my content!
Thanks for your support...
I've tried with PyXML 0.5.1 and 0.6.2
I use Python 1.5.2 under FreeBSD 4.2
My imports (in case they help):
from xml import dom
from xml.dom.ext.reader import Sax2
from xml.dom import ext
from xml.dom.Node import Node
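For comparison, a rough sketch of the same parse steps on a current Python 3. It uses the standard-library xml.dom.minidom as a stand-in for the PyXML Sax2 reader imported above, so it is an assumption that the two behave alike here, and the document is shortened to the one element of interest. It suggests the parsed value comes back as Unicode text, and that encoding it explicitly gives the iso-8859-1 bytes back:

```python
from xml.dom import minidom

# Shortened version of the document, with accents written as escapes:
# \xe9 = e-acute, \xea = e-circumflex.
xml_text = (
    '<?xml version="1.0" encoding="iso-8859-1"?>\n'
    "<Xafp><NewsLines>"
    "<HeadLine>La pol\xe9mique loin d'\xeatre apais\xe9e</HeadLine>"
    "</NewsLines></Xafp>"
)
# Bytes as they would be stored in an iso-8859-1 file.
xml_bytes = xml_text.encode("iso-8859-1")

# minidom stands in for Sax2.FromXml(...) here (assumption, see above).
doc = minidom.parseString(xml_bytes)
nodes = doc.getElementsByTagName("HeadLine")
value = nodes[0].firstChild.nodeValue  # Unicode text, not raw bytes

# Encoding explicitly recovers the original iso-8859-1 bytes.
latin1_again = value.encode("iso-8859-1")
print(repr(value))
print(repr(latin1_again))
```

If this carries over, the content itself is intact and only its byte representation differs, so an explicit conversion back to iso-8859-1 after parsing might be enough.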
Thanks again,
Olivier.
---
We are Micro$oft. You will be assimilated. Resistance is futile.