[Expat-discuss] UTF-8 data
Kadakuntla, Pankaja
pkadakuntla@axsone.com
Fri Oct 19 11:42:21 2001
When I parse an XML file with Expat, I get the parse error
"XML_ERROR_INVALID_TOKEN". The line it fails on contains a special character
(byte 0xC2). I read in the documentation that the Expat parser uses UTF-8 as the
default encoding and has built-in support for the following encodings:
* utf-8
* utf-16
* iso-8859-1
* us-ascii
I have two questions.
1. Is there any way to tell the parser to ignore such characters?
2. In the program that generates the XML, I need to validate the characters
against a specific encoding before I write them out to the XML file. Has anyone
done this before? Can Expat's internal functions be used to validate
characters for the specified encoding (e.g. UTF-8, ISO-8859-1)?
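
To make question 2 concrete, here is a minimal sketch of the kind of check I have in mind for the UTF-8 case. It does not use Expat at all; `is_valid_utf8` is a hypothetical helper name, not an Expat function. It only checks that the bytes are well-formed UTF-8 (a lone 0xC2 followed by a non-continuation byte is rejected). It does not check the further restrictions XML places on control characters.

```c
#include <stddef.h>

/* Hypothetical helper (not part of Expat): returns 1 if buf[0..len) is
   well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        size_t need;        /* continuation bytes expected after the lead byte */
        size_t k;
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal for this length */

        if (c < 0x80) {                     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* 110xxxxx: 2-byte sequence */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {    /* 1110xxxx: 3-byte sequence */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {    /* 11110xxx: 4-byte sequence */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return 0;   /* sequence is cut off at the end of the buffer */

        for (k = 1; k <= need; k++) {
            unsigned char cc = buf[i + k];
            if ((cc & 0xC0) != 0x80)
                return 0;   /* expected a 10xxxxxx continuation byte */
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < min)
            return 0;   /* overlong encoding */
        if (cp > 0x10FFFF)
            return 0;   /* beyond the Unicode range */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;   /* UTF-16 surrogate, not a valid scalar value */

        i += need + 1;
    }
    return 1;
}
```

The generator would call this on each string before writing it out, for example `is_valid_utf8((const unsigned char *)s, strlen(s))`, and reject or re-encode the string when the result is 0.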