Parsing XML with ElementTree (unicode problem?)

oren.tsur at oren.tsur at
Tue Jul 24 07:57:26 CEST 2007

On Jul 23, 4:46 pm, "Richard Brodie" <R.Bro... at> wrote:
> <oren.t... at> wrote in message
> news:1185200976.082516.105420 at
> > so what's the difference? how comes parsing is fine
> > in the first case but erroneous in the second case?
> You may have guessed the encoding wrong. It probably
> wasn't utf-8 to start with but iso8859-1 or similar.
> What actual byte value is in the file?

I tried different encodings and none of them worked. Anyway, I
would expect it to be UTF-8, since the XML response to the Amazon query
declares UTF-8 (check it with

 in your browser, the first line in the source is <?xml version="1.0"

but the thing is that the parser handles the response from the web (the
Amazon response) just fine, yet fails to parse the locally saved file.
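
One way this symptom can arise is sketched below, as an illustration and not a diagnosis of the actual file. It assumes Python 3, and the XML snippet and the helper parses_from_file are made up for the example. The same document is saved twice: once byte for byte, and once after being re-encoded as Latin-1 while the declaration still claims UTF-8.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the web response: UTF-8 bytes with a UTF-8 declaration.
response_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?><Title>Les Misérables</Title>'
).encode("utf-8")


def parses_from_file(data):
    """Write data to a temporary file in binary mode and report whether it parses."""
    fd, path = tempfile.mkstemp(suffix=".xml")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        try:
            ET.parse(path)
            return True
        except ET.ParseError:
            return False
    finally:
        os.remove(path)


# Saved byte for byte: the file matches its declaration and parses.
saved_ok = parses_from_file(response_bytes)

# Re-encoded as Latin-1 on the way to disk (an assumption about what a
# "save as" step might do): the declaration still says UTF-8, so the
# single byte 0xE9 for the accented character is not valid UTF-8.
mangled = response_bytes.decode("utf-8").encode("latin-1")
saved_mangled = parses_from_file(mangled)
```

If the local copy behaves like the second case, writing the response to disk in binary mode, without decoding or re-encoding it, should keep it parseable.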

> > 2. there is another problem that might be similar I get a similar
> > error if the content of the (locally saved) xml have special
> > characters such as '&'
> Either the originator of the XML has messed up, or whatever
> you have done to save a local copy has mangled it.

I think I made a mess. I changed the '&' in the original response to
'and' because the parser failed on the '&' (in the locally saved
file), just as it failed on the French characters. Again, parsing
the original response worked fine.
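
For reference, a bare '&' is not well-formed XML, so a parser is expected to reject it; in a valid document it appears as the entity &amp;. A minimal sketch, assuming Python 3 and a made-up title element:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# A bare ampersand in character data is not well-formed XML.
raw = "<Title>Pride & Prejudice</Title>"
try:
    ET.fromstring(raw)
    raw_parses = True
except ET.ParseError:
    raw_parses = False

# Escaping the text turns '&' into '&amp;', which the parser accepts
# and converts back to '&' in the element text.
escaped = "<Title>%s</Title>" % escape("Pride & Prejudice")
title = ET.fromstring(escaped).text
```

If the original response parsed, it presumably contained &amp;, and the saved copy may have had the entity expanded, so replacing '&' with 'and' hides the problem without fixing the file.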

Thanks again,

