ElementTree XML parsing problem
Stefan Behnel
stefan_ml at behnel.de
Thu Apr 28 01:57:28 EDT 2011
Hegedüs Ervin, 27.04.2011 21:33:
> hello,
>
>> I'm using ElementTree to parse an XML file, but it stops at the
>> second record (id = 002), which contains a non-standard ascii
>> character, ä. Here's the XML:
>>
>> <?xml version="1.0"?>
>> <snapshot time="Mon Apr 25 08:47:23 PDT 2011">
>> <records>
>> <record id="001" education="High School" employment="7 yrs" />
>> <record id="002" education="Universität Bremen" employment="3 years" />
>> <record id="003" education="River College" employment="5 yrs" />
>> </records>
>> </snapshot>
>>
>> The complaint offered up by the parser is
>
> I've checked this XML with your script, and I think your locale
> settings are wrong.
>
> $ ./parse.py
>
> XML file: test.xml
> 001 High School
> 002 Universität Bremen
> 003 River College
>
> (name of xml file is "test.xml")
>
> So I started changing the encoding declaration of the XML:
>
> <?xml version="1.0" encoding="UTF-8" ?> - same result
> <?xml version="1.0" encoding="ISO-8859-2" ?> - same result
> <?xml version="1.0" encoding="ISO-8859-1" ?> - same result
You probably made this change in an XML-aware editor, which then saves the
file in the declared encoding. If you had only edited the first line (the
XML declaration) without re-encoding the document itself, the three
declarations would not all have yielded the same result for the document
given above.
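To illustrate, here is a minimal sketch, assuming Python 3 and only the
standard library. The helper name parse_education and the one-record
document are made up for this example and are not taken from your script.
It writes one encoding into the XML declaration, encodes the bytes in a
possibly different one, and reports what ElementTree makes of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-record document, modelled on record 002 from the post.
BODY = '<records><record id="002" education="Universität Bremen" /></records>'


def parse_education(declared, actual):
    """Declare one encoding in the XML declaration, encode the bytes in another.

    Returns the parsed 'education' attribute, or a string starting with
    'ParseError' if the parser rejects the byte stream.
    """
    doc = '<?xml version="1.0" encoding="%s"?>\n%s' % (declared, BODY)
    data = doc.encode(actual)  # what would actually be stored on disk
    try:
        root = ET.fromstring(data)
    except ET.ParseError as exc:
        return "ParseError: %s" % exc
    return root.find("record").get("education")


if __name__ == "__main__":
    for declared, actual in [
        ("UTF-8", "utf-8"),            # declaration matches the bytes
        ("ISO-8859-1", "iso-8859-1"),  # declaration matches the bytes
        ("UTF-8", "iso-8859-1"),       # declaration edited, bytes left alone
        ("ISO-8859-1", "utf-8"),       # declaration edited, bytes left alone
    ]:
        print("%-10s / %-10s -> %r"
              % (declared, actual, parse_education(declared, actual)))
```

The two matching pairs should give back the original string. Declaring
UTF-8 over Latin-1 bytes should be rejected as not well-formed, and
declaring ISO-8859-1 over UTF-8 bytes should parse but return a garbled
attribute value, so the results differ as soon as the declaration and the
bytes disagree.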
Stefan