[Expat-discuss] utf-8 encoding
Karl Waclawek
karl at waclawek.net
Tue Feb 24 21:06:04 EST 2004
----- Original Message -----
From: <Aruna.Bhaskara at wellsfargo.com>
> I am trying to use utf-8 encoding. My input file has some multibyte
> characters like below . If I parse it through expat and print the output
> to a file, I see two bytes.
>
> Shouldn't it be a single byte? Or, since it's utf-8 encoding, is it
> represented as two bytes, and the programmer has to take care of
> interpreting the 2 bytes?
If the character is two bytes in utf-8, then Expat must return two bytes.
> If I use the xerces parser I see one byte being returned. Let me know
> what I am doing wrong.
I guess you are doing nothing wrong. Maybe Xerces is wrong?
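
To illustrate the point, here is a minimal sketch using Python's
xml.parsers.expat binding rather than the C API. The sample document and
the helper name collect_text are made up for the illustration. Note that
the Python binding hands back decoded strings, whereas C Expat in its
default build passes UTF-8 bytes in a char buffer. Re-encoding the parsed
text shows why the same character takes two bytes in UTF-8 and one byte
in a single-byte encoding such as Latin-1. Whether such a conversion
explains the one-byte Xerces result is only an assumption.

```python
import xml.parsers.expat


def collect_text(xml_bytes):
    """Parse xml_bytes with Expat and return all character data as one string."""
    chunks = []
    parser = xml.parsers.expat.ParserCreate()
    # Expat may deliver text in several pieces, so collect and join them.
    parser.CharacterDataHandler = chunks.append
    parser.Parse(xml_bytes, True)
    return "".join(chunks)


# Hypothetical sample input: U+00E9 stored as the two UTF-8 bytes C3 A9.
doc = b'<?xml version="1.0" encoding="utf-8"?><a>\xc3\xa9</a>'

text = collect_text(doc)
utf8_bytes = text.encode("utf-8")      # the encoding C Expat reports by default
latin1_bytes = text.encode("latin-1")  # a single-byte encoding, for comparison

print(len(text), len(utf8_bytes), len(latin1_bytes))
```

So one character is not the same thing as one byte. The count depends on
the encoding the bytes are written in.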
Karl