[XML-SIG] Character encodings and expat

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Thu, 26 Oct 2000 01:12:28 +0200


> It looks like expat refuses the alias "latin1" for the encoding
> "ISO-8859-1" as it returns a fatalError that raises a SaxException when
> using
> 
> Sax2.FromXml('<?xml version="1.0" encoding="latin1"?><try>àéèù</try>')
> 
> The XML spec says that parsers *may* recognize aliases defined by IANA, so
> I wouldn't call it a bug. Did I miss a parameter to set up somewhere to
> get expat to recognize "latin1" ?

Once xmlproc is capable of producing Unicode, it will certainly
understand all encodings that the Python 2.0 encoding machinery knows
of; that includes "latin1".
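
As a rough illustration of what "knows of" means here, the codec
registry can be asked whether two labels resolve to the same codec.
This is only a sketch written against a current Python 3 rather than
2.0, and the helper name same_codec is made up for the example:

```python
import codecs


def same_codec(name_a, name_b):
    """Return True if both labels resolve to the same registered codec."""
    # codecs.lookup() normalizes aliases and raises LookupError for
    # labels the registry does not know.
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name


# "latin1" is registered as an alias of ISO-8859-1.
latin1_is_alias = same_codec("latin1", "ISO-8859-1")
```

A parser that resolves declared encodings through this registry would
accept every alias the registry accepts, with no alias table of its own.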

We should also try to teach expat to use the Python encoding
machinery, but that may be more difficult. Any volunteers?
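
Until then, one possible workaround is to let Python do the decoding
and hand expat text instead of bytes. The sketch below is not a change
to expat itself. It assumes a current Python 3, where the pyexpat
binding treats str input as UTF-8 regardless of the encoding named in
the XML declaration, and the function name parse_with_python_codec is
invented for the example:

```python
import codecs
import xml.parsers.expat


def parse_with_python_codec(raw, declared_encoding):
    """Decode raw bytes with a Python codec, then parse the text with expat.

    Returns the concatenated character data of the document.
    """
    # Python's codec registry resolves aliases such as "latin1".
    text = codecs.decode(raw, declared_encoding)

    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # expat may deliver character data in several pieces, so collect them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(text, True)
    return "".join(chunks)


# The document from the original report, as Latin-1 encoded bytes.
raw = b'<?xml version="1.0" encoding="latin1"?><try>\xe0\xe9\xe8\xf9</try>'
result = parse_with_python_codec(raw, "latin1")
```

The caller has to know the encoding label before parsing, so this
only helps when the label comes from outside the document or from a
separate look at the XML declaration.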

If you *just* want it to recognize "latin1", you should extend
xmltok/xmltok.c:getEncodingIndex.

Regards,
Martin