[ expat-Bugs-481609 ] Wrong umlauts after parsing

noreply@sourceforge.net noreply@sourceforge.net
Wed Nov 14 16:05:02 2001


Bugs item #481609, was opened at 2001-11-14 00:33
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=110127&aid=481609&group_id=10127

Category: XML::Parser (Perl module)
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Thomas Frings (frings)
Assigned to: Clark Cooper (coopercc)
Summary: Wrong umlauts after parsing

Initial Comment:
Parsing an XML file that contains German umlauts like 
ä, ö, ü or their encoded forms (references for ä or ü) 
results in 'C$' (instead of 'ä'), 'C<' (instead of 'ü') 
or 'C6' (instead of 'ö').

What's going wrong? 

System: Solaris 2.8
        expat 1.95.2
        XML-Parser 2.30

----------------------------------------------------------------------

Comment By: Simon Gordon (si_gordon)
Date: 2001-11-14 16:03

Message:
Logged In: YES 
user_id=227124

I believe this is UTF-8. Expat always outputs in UTF-8 
rather than either (a) what you want or (b) what the XML 
encoding is set to.

I have long held the belief that this is a bug, even though 
the release notes for 1.95 documented this fact. I had to 
patch my version to output ISO-8859-1 for exactly the same 
reason: I needed umlauted characters in ISO-8859-1, not UTF-8.
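
To illustrate the point, here is a minimal sketch in Python (chosen only 
for brevity; it is not part of Expat or XML::Parser). It assumes the 
reported 'C$', 'C<' and 'C6' are the two-byte UTF-8 sequences shown on a 
display that drops the high bit of each byte, which is a guess that 
happens to match the symptoms. The helper names strip_high_bit and 
utf8_to_latin1 are hypothetical.

```python
def strip_high_bit(data: bytes) -> str:
    """Hypothetical helper: mimic a 7-bit display by clearing bit 7 of each byte."""
    return bytes(b & 0x7F for b in data).decode("ascii")


def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes (the form Expat hands back) as ISO-8859-1.

    Raises UnicodeEncodeError for characters outside ISO-8859-1.
    """
    return data.decode("utf-8").encode("iso-8859-1")


if __name__ == "__main__":
    # a-umlaut, u-umlaut, o-umlaut, written as escapes so the source stays ASCII
    for ch in "\u00e4\u00fc\u00f6":
        utf8 = ch.encode("utf-8")
        print(
            f"U+{ord(ch):04X}",
            "utf-8:", utf8.hex(),
            "7-bit view:", strip_high_bit(utf8),
            "iso-8859-1:", utf8_to_latin1(utf8).hex(),
        )
```

Under that assumption the parser output itself is correct UTF-8, and 
re-encoding it on the application side yields the single-byte 
ISO-8859-1 characters without patching the library.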

----------------------------------------------------------------------
