[Expat-discuss] utf-8 encoding

Fred L. Drake, Jr. fdrake at acm.org
Tue Feb 24 23:23:31 EST 2004


On Tuesday 24 February 2004 09:06 pm, Karl Waclawek wrote:
 > If the character is two bytes in utf-8, then Expat must return two bytes.

In UTF-8 mode, yes.  In UTF-16 mode, it'll return some form of wide character.
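
As a rough illustration of the difference (plain Python rather than Expat's
own API, and the sample character U+00E9 is just an assumed example), the
same character is two 8-bit units in UTF-8 but a single 16-bit unit in
UTF-16:

```python
# Illustration only: this uses Python's codecs, not Expat.
# U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is an assumed sample character.
ch = "\u00e9"

utf8 = ch.encode("utf-8")       # sequence of 8-bit code units
utf16 = ch.encode("utf-16-le")  # 16-bit code units, little-endian, no BOM

utf8_units = len(utf8)          # number of 8-bit units
utf16_units = len(utf16) // 2   # number of 16-bit units

print(utf8.hex(), utf8_units)
print(utf16.hex(), utf16_units)
```

So a count of "two" for this character makes sense when counting UTF-8
bytes, while a count of "one" makes sense when counting wide characters.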

 > I guess you are doing nothing wrong. Maybe Xerces is wrong?

Or Xerces is returning wide characters, and the word "byte" doesn't really 
reflect what's in the sources.  We don't have enough information here to be 
sure, and I don't know the Xerces interface, so I can't speak to what it's 
actually doing.


  -Fred

-- 
Fred L. Drake, Jr.  <fdrake at acm.org>
PythonLabs at Zope Corporation



