Sax2 encoding

Alexandre Fayolle alf at
Fri Aug 30 13:44:20 CEST 2002

In article <mailman.1030702904.8805.python-list at>, 
Juan M. Casillas wrote:
> Hello folks!
> I have an xml document that only begins with
><?xml version="1.0"?>
> [...]
> That is, without any info about the encoding. This document has special
> characters encoded in ISO-8859-1 format (Spanish characters such as
> á or ñ). 

Then your document is not well-formed XML, and you will have serious
trouble parsing it. It should begin with 
<?xml version="1.0" encoding="iso-8859-1"?>

If you can't change this yourself, you should ask the author to do it.
If the author is unwilling, you should convert the document to UTF-8
using Python's codecs module before parsing it.

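The conversion step can be sketched roughly as follows. This is only a
minimal illustration, assuming a current Python 3 interpreter and a
document small enough to hold in memory; the names TextCollector and
parse_latin1_xml and the sample document are made up for the example.

```python
import codecs
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_latin1_xml(raw_bytes):
    """Re-encode undeclared ISO-8859-1 bytes as UTF-8, then parse."""
    # Assumption: the bytes really are ISO-8859-1, as in the question.
    text = codecs.decode(raw_bytes, "iso-8859-1")
    utf8_bytes = codecs.encode(text, "utf-8")
    handler = TextCollector()
    xml.sax.parseString(utf8_bytes, handler)
    return "".join(handler.chunks)


# Made-up sample: no encoding declaration, one Latin-1 byte (0xF1 = ñ).
raw = b'<?xml version="1.0"?><doc>ma\xf1ana</doc>'
result = parse_latin1_xml(raw)
```

Because the re-encoded bytes are valid UTF-8, the missing encoding
declaration no longer matters and the parser accepts the document.
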
> and poking around the file, I found a 'á' character at this position.
> So my question is... how can I set the default encoding for the sax2
> reader so the XML parser works for me?

When a document carries no encoding declaration (and no byte order mark
indicating UTF-16), the parser assumes UTF-8, because this is what the
XML specification mandates. You cannot change that default. 
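
A small check of this behaviour, assuming Python 3 and the expat-based
parser that the standard xml.sax package normally uses. The helper name
parses_cleanly and both sample documents are made up for illustration.

```python
import xml.sax


def parses_cleanly(xml_bytes):
    """Return True if the SAX parser accepts the bytes, False otherwise."""
    try:
        xml.sax.parseString(xml_bytes, xml.sax.ContentHandler())
        return True
    except xml.sax.SAXParseException:
        return False


# Same Latin-1 content, without and with an encoding declaration.
undeclared = b'<?xml version="1.0"?><doc>ma\xf1ana</doc>'
declared = (b'<?xml version="1.0" encoding="iso-8859-1"?>'
            b'<doc>ma\xf1ana</doc>')

undeclared_ok = parses_cleanly(undeclared)
declared_ok = parses_cleanly(declared)
```

The undeclared document is read as UTF-8, where the byte 0xF1 followed
by an ASCII letter is not a valid sequence, so the parse fails. The
declared document is accepted.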

Alexandre Fayolle
LOGILAB, Paris (France).
Narval, the first software agent available as free software (GPL).