[XML-SIG] Character encodings and expat
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Thu, 26 Oct 2000 01:12:28 +0200
> It looks like expat refuses the alias "latin1" for the encoding
> "ISO-8859-1" as it returns a fatalError that raises a SaxException when
> using
>
> Sax2.FromXml('<?xml version="1.0" encoding="latin1"?><try>àéèù</try>')
>
> The XML spec says that parsers *may* recognize aliases defined by IANA, so
> I wouldn't call it a bug. Did I miss a parameter to set up somewhere to
> get expat to recognize "latin1"?
Once xmlproc is capable of producing Unicode, it will certainly
understand all encodings that the Python 2.0 encoding machinery knows
of; that includes "latin1".
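To illustrate what "knows of" means here, a minimal sketch using the standard codecs module (written for present-day Python 3, not the Python 2.0 of this thread). The helper name same_codec is hypothetical; only codecs.lookup is a real library call.

```python
import codecs

def same_codec(name_a, name_b):
    """Return True if both names resolve to the same Python codec.

    same_codec is an illustrative helper, not part of any library.
    codecs.lookup() normalizes aliases and raises LookupError for
    names the codec registry does not know.
    """
    return codecs.lookup(name_a).name == codecs.lookup(name_b).name

# "latin1" is registered as an alias of ISO-8859-1 in Python's codec registry.
print(same_codec("latin1", "ISO-8859-1"))
```

A parser that resolves encoding names through this registry gets the aliases for free instead of maintaining its own table.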
We should also strive to teach expat to use the Python encoding
machinery, but that may be more difficult. Any volunteers?
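Until then, one possible workaround that leaves expat untouched is to decode the bytes with Python's codecs first and hand expat the resulting text. The following is only a sketch, written for present-day Python 3 and its xml.parsers.expat module rather than the Python 2.0 tools discussed here. parse_text and its arguments are hypothetical names, and it assumes the caller already knows the encoding; it does not read it from the XML declaration.

```python
import xml.parsers.expat

def parse_text(raw, encoding):
    """Decode raw bytes with a Python codec, then parse the text with expat.

    Hypothetical helper: returns the concatenated character data of the
    document, which is enough to show that the content survived decoding.
    """
    text = raw.decode(encoding)  # Python resolves aliases such as "latin1"
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = chunks.append
    # The input is a str, i.e. already decoded, so the encoding name in the
    # XML declaration is not what determines how the characters are read.
    parser.Parse(text, True)
    return "".join(chunks)

# The document from the question, built with escapes so that this example
# does not depend on the encoding of the file it is pasted into.
doc = ('<?xml version="1.0" encoding="latin1"?>'
       '<try>\u00e0\u00e9\u00e8\u00f9</try>').encode("latin1")
print(parse_text(doc, "latin1"))
```

The trade-off is that the encoding has to be supplied from outside the parser, so this does not replace proper alias support inside expat.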
If you *just* want it to recognize "latin1", you should extend
xmltok/xmltok.c:getEncodingIndex.
Regards,
Martin