[Expat-discuss] Character encoding ISO 8859-1
Henrik Eriksson
henrik.eriksson@axis.com
Tue, 3 Jul 2001 13:09:16 +0200
Hi
> -----Original Message-----
> From: Christian Wischhusen [mailto:wischhusen@web.de]
> Sent: Tuesday, July 03, 2001 12:50 PM
>
> Hi,
> I'm using expat with ISO 8859-1 encoded XML files and I have the
> following problem: expat converts German characters, e.g.
>
> ß (Small sharp s, German (sz ligature) ("ß"))
> or
> ü (Small u, dieresis or umlaut mark ("ü"))
>
> to a sequence of two bytes, e.g.
> ß (sz) -> 0xC39F
> ü (small u, dieresis) -> 0xC3BC
This is correct: expat passes UTF-8 to the callbacks regardless of
the encoding of the input document, and the sequences above are the
UTF-8 encodings of the ISO 8859-1 characters ß (0xDF -> 0xC3 0x9F)
and ü (0xFC -> 0xC3 0xBC).
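
To see where the two bytes come from, here is a minimal sketch of the
standard UTF-8 encoding rule applied to a single ISO 8859-1 byte. The
function name latin1_to_utf8 is made up for this illustration and is
not part of expat; it only reproduces the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an expat function.
   Encode one ISO 8859-1 byte (code point U+0000..U+00FF) as UTF-8.
   Writes 1 or 2 bytes to out and returns the number written. */
size_t latin1_to_utf8(unsigned char c, unsigned char *out)
{
    assert(out != NULL);

    if (c < 0x80) {
        /* ASCII range: one byte, unchanged */
        out[0] = c;
        return 1;
    }
    /* U+0080..U+00FF: lead byte 110xxxxx, continuation byte 10xxxxxx */
    out[0] = (unsigned char)(0xC0 | (c >> 6));
    out[1] = (unsigned char)(0x80 | (c & 0x3F));
    return 2;
}
```

For 0xDF (ß) the top two bits give the lead byte 0xC3 and the low six
bits give 0x9F; for 0xFC (ü) the result is 0xC3 0xBC.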
> As I use expat for German text, I expect expat not to modify
> the character data between XML elements.
> Does anybody have a suggestion to solve my problem?
As said above, expat uses UTF-8 in the callbacks, and I don't think
there is any way to change this. If you need ISO 8859-1, your handler
would have to convert the text back itself.
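
One possible workaround is to convert the text back inside your own
character data handler. The following is only a sketch under some
assumptions: the name utf8_to_latin1 is hypothetical, the (s, len)
arguments mirror what a character data handler receives, each call is
assumed to contain whole UTF-8 sequences, and anything outside
ISO 8859-1 is replaced by '?'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an expat function.
   Convert len bytes of UTF-8 at s into ISO 8859-1 at out.
   out must have room for len bytes (the output is never longer than
   the input). Characters above U+00FF and malformed sequences are
   written as '?'. Returns the number of bytes written. */
size_t utf8_to_latin1(const char *s, int len, char *out)
{
    const unsigned char *p = (const unsigned char *)s;
    const unsigned char *end;
    size_t n = 0;

    assert(s != NULL && out != NULL && len >= 0);
    end = p + len;

    while (p < end) {
        unsigned char c = *p++;

        if (c < 0x80) {
            /* ASCII: copy unchanged */
            out[n++] = (char)c;
        } else if ((c == 0xC2 || c == 0xC3) &&
                   p < end && (*p & 0xC0) == 0x80) {
            /* two-byte sequence covering U+0080..U+00FF */
            out[n++] = (char)(((c & 0x03) << 6) | (*p++ & 0x3F));
        } else {
            /* not representable in ISO 8859-1, or malformed */
            out[n++] = '?';
            /* skip any continuation bytes belonging to it */
            while (p < end && (*p & 0xC0) == 0x80)
                p++;
        }
    }
    return n;
}
```

It would be called from the function you register with
XML_SetCharacterDataHandler, passing that handler's s and len
arguments straight through and using a buffer of at least len bytes.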
>
> Chris
Best regards,
Henrik Eriksson