Parsing XML with ElementTree (unicode problem?)

oren.tsur at gmail.com oren.tsur at gmail.com
Thu Jul 26 16:23:35 CEST 2007


On Jul 26, 3:13 pm, John Machin <sjmac... at lexicon.net> wrote:
> On Jul 26, 9:24 pm, oren.t... at gmail.com wrote:
>
> > OK, I solved the problem but I still don't get what went wrong.
> > Solution - use tree builder in order to create the new xml file
> > (previously I was  "manually" creating it).
>
> > I'm still curious so I'm adding a link to a short and very simple
> > script that gets an xml (containing non ascii chars) from the web and
> > saves some of the elements to 2 different local xml files - one is
> > created by XMLWriter and the other is created manually. you could see
> > that parsing of the first local file is OK while parsing of the
> > "manually" created xml file fails. obviously I'm doing something wrong
> > and I'd love to learn what.
>
> > the toy script: http://staff.science.uva.nl/~otsur/code/xmlConversions.py
>
> Simple file comparison:
>
> File 1: ... Modern Church.  &lt;p&gt;The book ...
> File 2: ... Modern Church.  <p>The book ...
>
> Firefox:
>
> XML Parsing Error: mismatched tag. Expected: </p>.
> Location: file:///C:/junk/myDeVinciCode166_2.xml
> Line Number 3, Column 1153:
>
> <CONTENT>The...Church.  <p>The...thrill.</CONTENT>
> ------------------------------------------^

Yup, but why does this happen? On the script side I write the exact
same content strings, supposedly with the same encoding, so why is the
encoding different?
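
One plausible reading of the Firefox error quoted above, not confirmed
anywhere in this thread: the difference may be escaping rather than
encoding. If the content contains a literal <p>, a writer that escapes
text emits &lt;p&gt;, while plain string concatenation leaves a raw,
unclosed tag in the file. A minimal sketch under those assumptions (the
content string is hypothetical, and it uses the Python 3 standard library
rather than the XMLWriter from the toy script):

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical content, modelled on the snippet in the Firefox error.
content = "Modern Church.  <p>The book is a thrill."

# "Manual" creation: the raw <p> ends up as an unclosed tag inside CONTENT.
manual_xml = "<CONTENT>" + content + "</CONTENT>"
try:
    ET.fromstring(manual_xml)
    manual_ok = True
except ET.ParseError as err:
    manual_ok = False
    manual_error = str(err)

# Letting the library serialize the text escapes the markup characters.
elem = ET.Element("CONTENT")
elem.text = content
built_xml = ET.tostring(elem, encoding="unicode")

# Escaping by hand before concatenating gives the same document.
escaped_xml = "<CONTENT>" + escape(content) + "</CONTENT>"

# Parsing the escaped document returns the original string unchanged.
roundtrip = ET.fromstring(built_xml).text

print("manual parse ok:", manual_ok)
print("built:", built_xml)
```

If this reading is right, both files carry the same characters in the
same encoding; only the one written by hand contains markup that the
parser tries to interpret.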



