[I18n-sig] UTF-8 and BOM

Guido van Rossum guido@digicool.com
Wed, 16 May 2001 14:55:35 -0500


> >  3) I think that distinguishing UTF-8 from other encodings through the
> > BOM is actually a great idea and I wish that every UTF-8 creator would
> > do it!
> 
> Uhm, I can't follow you here... BOMs in UTF-8 look like this:
> 
> >>> u'\ufeff'.encode('utf-8')
> '\xef\xbb\xbf'
> 
> which is somewhat different from '\xff\xfe' or '\xfe\xff'.

I think he meant that this serves as a sort-of "magic number" for
UTF-8 files.  I find that kind of cute myself. :-)
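
A minimal sketch of the "magic number" idea, assuming a modern Python 3
and the standard codecs module. The function names and the latin-1
fallback are hypothetical, chosen only for illustration; nothing here is
from the original thread.

```python
import codecs


def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the data begins with the UTF-8 encoded BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Decode as UTF-8 when the BOM is present, else use the fallback.

    The fallback encoding is an assumption for this sketch; a real
    application would pick whatever legacy encoding it expects.
    """
    if starts_with_utf8_bom(data):
        # Strip the three BOM bytes before decoding the rest as UTF-8.
        return data[len(codecs.BOM_UTF8):].decode("utf-8")
    return data.decode(fallback)
```

The absence of a BOM proves nothing, since plenty of UTF-8 files are
written without one, so this can only ever be a hint in one direction.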

--Guido van Rossum (home page: http://www.python.org/~guido/)