[Expat-discuss] UTF-8 data

Kadakuntla, Pankaja pkadakuntla@axsone.com
Fri Oct 19 11:42:21 2001


When I parse an XML file with Expat, I get the parse error
"XML_ERROR_INVALID_TOKEN". The offending line contains a special character
(byte 0xC2). According to the documentation, Expat uses UTF-8 as the
default encoding and has built-in support for the following encodings:
*	UTF-8
*	UTF-16
*	ISO-8859-1
*	US-ASCII
I have two questions.
1. Is there any way to tell the parser to ignore such characters?
2. In the program that generates the XML, I need to validate the characters
against a specific encoding before I write them out to the XML file. Has anyone
done this before? Can Expat's internal functions be used to validate
characters for a specified encoding (e.g. UTF-8, ISO-8859-1)?
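
To make question 2 concrete, here is a minimal sketch of the kind of check I
have in mind for the UTF-8 case. It is plain C and does not call any Expat
function; the name utf8_first_invalid is hypothetical, and the strictness
rules (rejecting overlong forms, surrogates, and code points above U+10FFFF)
are my assumption of what "valid" should mean. In UTF-8, 0xC2 is a lead byte
that must be followed by one continuation byte (0x80..0xBF), so a lone 0xC2
would be flagged by a check like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Expat.
 * Returns the byte offset of the first ill-formed UTF-8 sequence in buf,
 * or len if the whole buffer is well-formed. */
size_t utf8_first_invalid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    assert(buf != NULL || len == 0);

    while (i < len) {
        unsigned char b = buf[i];
        size_t need;        /* continuation bytes expected after the lead */
        unsigned long cp;   /* code point being decoded */
        unsigned long min;  /* smallest code point legal for this length */
        size_t k;

        if (b < 0x80) {                     /* single-byte (ASCII) */
            i++;
            continue;
        } else if ((b & 0xE0) == 0xC0) {    /* 2-byte sequence */
            need = 1; cp = b & 0x1F; min = 0x80;
        } else if ((b & 0xF0) == 0xE0) {    /* 3-byte sequence */
            need = 2; cp = b & 0x0F; min = 0x800;
        } else if ((b & 0xF8) == 0xF0) {    /* 4-byte sequence */
            need = 3; cp = b & 0x07; min = 0x10000;
        } else {
            return i;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i <= need)
            return i;   /* sequence is cut off by the end of the buffer */

        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return i;   /* expected a continuation byte 10xxxxxx */
            cp = (cp << 6) | (unsigned long)(buf[i + k] & 0x3F);
        }

        /* Reject overlong encodings, UTF-16 surrogates, and out-of-range values. */
        if (cp < min || cp > 0x10FFFFUL || (cp >= 0xD800UL && cp <= 0xDFFFUL))
            return i;

        i += need + 1;
    }
    return len;
}
```

The generator would call this on each buffer before writing it out and treat
any return value smaller than the buffer length as the position of a bad byte.
For ISO-8859-1 every byte value decodes to a character, so a check like this
is only meaningful for the UTF-8 case.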

 
