[Expat-bugs] [ expat-Bugs-600479 ] error decoding UTF-8 triplet

noreply@sourceforge.net
Mon, 26 Aug 2002 14:34:42 -0700


Bugs item #600479, was opened at 2002-08-26 14:34
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110127&aid=600479&group_id=10127

Category: www.libexpat.org
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: error decoding UTF-8 triplet

Initial Comment:
On Windows, when reading the UTF-8 sequence "EF BA BF",
utf8_isInvalid3 returns TRUE when it should return FALSE. This
sequence is well-formed and decodes to U+FEBF ("FEBF" as UCS-2
(Unicode)), but because utf8_isInvalid3 returns TRUE, an error
results and the character is not decoded.
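
For reference, here is a minimal sketch of the expected behavior for a
three-byte sequence. It is not expat's code: is_invalid_utf8_3 and
decode_utf8_3 are hypothetical helper names, and the checks assume the
usual UTF-8 well-formedness rules plus the XML 1.0 exclusion of U+FFFE
and U+FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not expat's utf8_isInvalid3).
   Returns 1 if the three bytes at p are not an acceptable three-byte
   UTF-8 sequence for an XML character, 0 if they are. */
int is_invalid_utf8_3(const unsigned char *p)
{
    assert(p != NULL);
    if ((p[0] & 0xF0) != 0xE0) return 1;        /* lead byte must be 1110xxxx */
    if ((p[1] & 0xC0) != 0x80) return 1;        /* continuation must be 10xxxxxx */
    if ((p[2] & 0xC0) != 0x80) return 1;
    if (p[0] == 0xE0 && p[1] < 0xA0) return 1;  /* overlong form of < U+0800 */
    if (p[0] == 0xED && p[1] > 0x9F) return 1;  /* surrogates U+D800..U+DFFF */
    if (p[0] == 0xEF && p[1] == 0xBF && p[2] >= 0xBE)
        return 1;                               /* U+FFFE, U+FFFF: not XML chars */
    return 0;
}

/* Combines the 4 + 6 + 6 payload bits into a code point.
   Only meaningful when is_invalid_utf8_3(p) returned 0. */
uint32_t decode_utf8_3(const unsigned char *p)
{
    return ((uint32_t)(p[0] & 0x0F) << 12)
         | ((uint32_t)(p[1] & 0x3F) << 6)
         |  (uint32_t)(p[2] & 0x3F);
}
```

With these assumptions, the bytes EF BA BF pass every check and
decode_utf8_3 yields 0xFEBF, which matches the value given above.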

This is using expat 1.95.4.

Attached is a simple XML file which illustrates the 
problem.
