Pyexpat and iso-8859-1

Marc Jeurissen mjeuris at lib.ua.ac.be
Tue Jun 13 08:32:07 EDT 2000


When I parse an XML file containing the declaration '<?xml version="1.0"
encoding="iso-8859-1"?>' with Pyexpat, every ISO Latin-1 character is
replaced by two characters. The first is nearly always #195 (Ã); the
second has a decimal value 64 less than that of the original
character.

Some examples:

#233 (é) becomes #195 + #169 (©)
#231 (ç) becomes #195 + #167 (§)
#239 (ï) becomes #195 + #175 (¯)
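
The pairs above match what one gets when each Latin-1 character is
encoded as UTF-8, so the output may simply be UTF-8 bytes viewed as
Latin-1. That reading is an assumption, not something confirmed for
Pyexpat here. A minimal Python 3 sketch of the arithmetic, with
hypothetical helper names utf8_pair and back_to_latin1:

```python
def utf8_pair(code_point):
    """Return the UTF-8 encoding of a code point as decimal byte values."""
    return list(chr(code_point).encode("utf-8"))


def back_to_latin1(byte_values):
    """Assuming the bytes are UTF-8, re-encode them as ISO-8859-1."""
    return list(bytes(byte_values).decode("utf-8").encode("iso-8859-1"))


for cp in (233, 231, 239):
    first, second = utf8_pair(cp)
    # For code points 192..255 the first byte is 195 and the second is cp - 64.
    print(f"#{cp} becomes #{first} + #{second}")
```

If that assumption holds, back_to_latin1([195, 169]) gives [233], so
the original single-byte characters could be recovered by decoding the
parser output as UTF-8 and re-encoding it as ISO-8859-1.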

Does anyone know what to do about this?

Thank you.
  
-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Marc Jeurissen
University of Antwerp - Library Automation
Universiteitsplein 1, 2610 Wilrijk, Belgium
Tel   : +32 3 820 21 53 
Fax   : +32 3 820 21 59
E-mail: mjeuris at lib.ua.ac.be 
WWW   : http://www.ua.ac.be/ualib.html
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


