[I18n-sig] XML and UTF-16
Tom Emerson
tree@basistech.com
Thu, 31 May 2001 17:35:30 -0400
Paul Prescod writes:
> I think so. UTF-32 is a 32-bit encoding and 32 bits are 4 bytes. You
> only need one character (either a BOM or a "<" sign) to know what you
> are dealing with.
Well, you know that the first UTF-32 character is "<", but no
more. I'd at least look for "<?xml" to be absolutely sure, but I'm
also overly paranoid. You could be looking at "<!DOCTYPE" or some
such.
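
To make the difference concrete, here is a rough sketch of the two levels of checking. It is only an illustration, not anything from a real parser: the helper name sniff_xml_encoding and its strict flag are made up for this example, and it assumes the document starts either with a BOM or directly with markup (no leading whitespace). With strict=True it insists on the full "<?xml" declaration; with strict=False it settles for a lone "<".

```python
import codecs

# Byte order marks. The UTF-32 marks are checked before the UTF-16 ones
# because the UTF-32 LE mark begins with the same two bytes as UTF-16 LE.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Encodings to try when no BOM is present, widest code units first so that
# a UTF-32 "<" is not mistaken for a UTF-16 or UTF-8 one.
CANDIDATES = ["utf-32-be", "utf-32-le", "utf-16-be", "utf-16-le", "utf-8"]


def sniff_xml_encoding(data, strict=True):
    """Guess the encoding family from the first bytes (hypothetical helper).

    strict=True  -> require the complete "<?xml" declaration start.
    strict=False -> accept a single "<", which also matches "<!DOCTYPE".
    Returns an encoding name, or None if nothing matches.
    """
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    probe = "<?xml" if strict else "<"
    for name in CANDIDATES:
        # The explicit -be/-le codecs do not emit a BOM when encoding.
        if data.startswith(probe.encode(name)):
            return name
    return None
```

The paranoid mode returns None for a UTF-32 stream that opens with "<!DOCTYPE", while the relaxed mode still identifies it as UTF-32 from the first character alone, which is the trade-off being argued about here.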
--
Tom Emerson                                   Basis Technology Corp.
Sr. Sinostringologist                       http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"