UTF-8 question from Dive into Python 3

Alexander Kapps alex.kapps at web.de
Mon Jan 17 17:30:59 EST 2011


On 17.01.2011 23:19, carlo wrote:

> Is it true that UTF-8 does not have any "big-endian/little-endian"
> issue because of its encoding method? And if it is true, why does
> Mark (like everyone else) write about UTF-8 with and without BOM some
> chapters later? What would be the purpose of the BOM then?

I can't answer your other questions, but the UTF-8 BOM is simply a
marker saying "This is a UTF-8 text file, not an ASCII text file".
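
To make that concrete, here is a minimal Python 3 sketch. The helper
name has_utf8_bom is made up for this example; codecs.BOM_UTF8 and the
"utf-8-sig" codec are from the standard library. The marker is just
the three bytes EF BB BF at the very start of the data:

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 BOM (bytes EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


text = "héllo"
plain = text.encode("utf-8")        # no marker
marked = text.encode("utf-8-sig")   # same bytes, with the BOM prepended

print(has_utf8_bom(plain))
print(has_utf8_bom(marked))

# "utf-8-sig" strips the marker on decoding; plain "utf-8" keeps it
# as the character U+FEFF at the start of the string.
print(repr(marked.decode("utf-8-sig")))
print(repr(marked.decode("utf-8")))
```

Decoding with "utf-8-sig" also works on data that has no BOM, so it is
a reasonable choice when you don't know whether the marker is there.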

If I'm not wrong, this was a Microsoft invention and surely one of
their brightest ideas. I really wish this had been done for the
ANSI encodings some decades ago. Determining the encoding of a text
file ranges from hard to impossible because no such marker was ever
introduced for them.
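
A small Python 3 sketch of why guessing is hard without a marker. The
sample bytes are my own choice for illustration: the same two bytes
decode without error under several encodings, and nothing in the data
says which reading was intended.

```python
# The two bytes that UTF-8 uses for "é".
raw = "é".encode("utf-8")

as_utf8 = raw.decode("utf-8")      # the intended reading
as_latin1 = raw.decode("latin-1")  # also succeeds, but gives two characters
as_cp1252 = raw.decode("cp1252")   # Windows "ANSI" code page, same result

print(repr(raw))
print(repr(as_utf8), repr(as_latin1), repr(as_cp1252))
```

None of the three calls raises an exception, so a program can only
guess, for example by checking which result looks like plausible text.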


