[ expat-Bugs-531936 ] not supported windows-1251 character set

noreply@sourceforge.net noreply@sourceforge.net
Fri Apr 19 14:56:14 2002


Bugs item #531936, was opened at 2002-03-19 11:56
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=110127&aid=531936&group_id=10127

Category: XML::Parser (Perl module)
Group: Feature Request
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Clark Cooper (coopercc)
Summary: not supported windows-1251 character set

Initial Comment:
Hello.
I have found a problem parsing XML documents that
contain Cyrillic symbols in an attribute value.
Something like
<test><msg v="фыва" /></test> 
raises the exception
"not well-formed ....."

I tested all Cyrillic characters and found that the exception
is raised when the character code is greater than 0xF0, but I may
be mistaken.

I tried using an .enc file, but it does not work in my case.

I think that it is a bug. 

----------------------------------------------------------------------

>Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2002-04-19 17:55

Message:
Logged In: YES 
user_id=3066

This is not a bug.  Your sample XML does not include an XML
declaration, so it will be treated as UTF-8.  If you need it
to be treated differently, include an XML declaration with
the proper encoding pseudo-attribute.
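
To illustrate the point, here is a minimal sketch using Python's
xml.parsers.expat binding to Expat as a stand-in for XML::Parser. It
assumes the reporter's attribute value was the Cyrillic string "фыва"
stored as windows-1251 bytes; the helper name parse_msg_attr is
hypothetical. Note that the Python binding installs its own
unknown-encoding handler, so windows-1251 is accepted there once it is
declared; other bindings may need extra setup for that encoding.

```python
from xml.parsers import expat


def parse_msg_attr(data):
    """Return the 'v' attribute of the first <msg> element in data (bytes)."""
    found = []

    def start(name, attrs):
        if name == "msg":
            found.append(attrs["v"])

    parser = expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(data, True)
    return found[0]


# The sample document, encoded as windows-1251 (assumed original encoding).
body = '<test><msg v="фыва" /></test>'.encode("windows-1251")
# The same document with an XML declaration naming the encoding.
declared = b'<?xml version="1.0" encoding="windows-1251"?>\n' + body

try:
    parse_msg_attr(body)
except expat.ExpatError as err:
    # Without a declaration the bytes are read as UTF-8 and rejected.
    print("without declaration:", err)

print("with declaration:", parse_msg_attr(declared))
```

Without the declaration, the bytes 0xF4 0xFB 0xE2 0xE0 are not a valid
UTF-8 sequence, which is why the parser reports the document as not
well-formed.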

Expat supports UTF-8, UTF-16, ISO-8859-1, and US-ASCII
encodings natively; you can extend the set of supported
encodings using the XML_SetUnknownEncodingHandler() function.
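
A handler registered this way is written in C and has to fill in the
map array of the XML_Encoding structure that Expat passes to it: for a
single-byte encoding, map[b] is the Unicode code point for byte b, or
-1 if the byte is invalid. As a rough sketch, the table for
windows-1251 could be generated as below. This is Python rather than C,
the helper name single_byte_map is hypothetical, and it relies on
Python's own codec tables as the source of the mapping.

```python
def single_byte_map(codec):
    """Build a 256-entry table: code point for each byte, or -1 if unmapped."""
    table = []
    for byte in range(256):
        try:
            table.append(ord(bytes([byte]).decode(codec)))
        except UnicodeDecodeError:
            # Byte has no assigned character in this encoding.
            table.append(-1)
    return table


cp1251_map = single_byte_map("windows-1251")
# Byte 0xF4 is the Cyrillic small letter ef, U+0444.
print(hex(cp1251_map[0xF4]))
```

The resulting 256 integers are what a C handler would copy into the map
field before returning a nonzero value to signal that it knows the
encoding.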

----------------------------------------------------------------------
