[I18n-sig] XML and UTF-16
Paul Prescod
paulp@ActiveState.com
Thu, 31 May 2001 14:44:37 -0700
Tom Emerson wrote:
>
> Paul Prescod writes:
> > I think so. UTF-32 is a 32-bit encoding and 32 bits are 4 bytes. You
> > only need one character (either a BOM or a "<" sign) to know what you
> > are dealing with.
>
> Well, you know that the first UTF-32 character is "<", but no
> more. I'd at least look for "<?xml" to be absolutely sure, but I'm
> also overly paranoid. You could be looking at "<!DOCTYPE" or some
> such.
Would it matter if you were looking at <!DOCTYPE? Either way the first
character is "<", and that is enough to recognize UTF-32. Anyhow, a
UTF-32 document without an XML declaration would be in error. An
encoding declaration is required for everything other than UTF-8 and
UTF-16.
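
To make the point concrete, here is a rough sketch of the kind of
first-four-bytes check being discussed. The function name
sniff_xml_encoding and the returned labels are my own invention for
illustration, not any parser's API, and a real parser would still go on
to read the encoding declaration afterwards.

```python
import codecs


def sniff_xml_encoding(head: bytes) -> str:
    """Guess the encoding family of an XML entity from its first bytes.

    Hypothetical helper for illustration only. It returns a Python codec
    name; "utf-8" here also stands for "some ASCII-compatible encoding
    that the declaration still has to name".
    """
    # Byte order marks first. The UTF-32 BOMs are tested before the
    # UTF-16 ones because the UTF-32-LE BOM begins with the same two
    # bytes as the UTF-16-LE BOM.
    boms = [
        (codecs.BOM_UTF32_BE, "utf-32-be"),
        (codecs.BOM_UTF32_LE, "utf-32-le"),
        (codecs.BOM_UTF8, "utf-8"),
        (codecs.BOM_UTF16_BE, "utf-16-be"),
        (codecs.BOM_UTF16_LE, "utf-16-le"),
    ]
    for bom, name in boms:
        if head.startswith(bom):
            return name

    # No BOM: a single "<" occupies all four bytes in UTF-32, so it
    # settles both the encoding and the byte order.
    if head.startswith(b"\x00\x00\x00<"):
        return "utf-32-be"
    if head.startswith(b"<\x00\x00\x00"):
        return "utf-32-le"

    # Assumption: with UTF-32 ruled out, "<" padded with one zero byte
    # is taken as UTF-16 without looking at the second character.
    if head.startswith(b"\x00<"):
        return "utf-16-be"
    if head.startswith(b"<\x00"):
        return "utf-16-le"

    return "utf-8"
```

Note that a document starting with "<!DOCTYPE" and one starting with
"<?xml" take the same branch, since only the first character is
examined:

    sniff_xml_encoding("<!DOCTYPE".encode("utf-32-be"))
    sniff_xml_encoding("<?xml".encode("utf-32-be"))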
--
Take a recipe. Leave a recipe.
Python Cookbook! http://www.ActiveState.com/pythoncookbook