writing \feff at the begining of a file
tjreedy at udel.edu
Sat Aug 14 00:25:46 CEST 2010
A short background to MRAB's answer which I will try to get right.
The byte-order-mark was invented for UTF-16 encodings so the reader
could determine whether the pairs of bytes are in little or big endiean
order, depending on whether the first two bute are fe and ff or ff and
fe (or maybe vice versa, does not matter here). The concept is
meaningless for utf-8 which consists only of bytes in a defined order.
This is part of the Unicode standard.
However, Microsoft (or whoever) re-purposed (hijacked) that pair of
bytes to serve as a non-standard indicator of utf-8 versus any
non-unicode encoding. The result is a corrupted utf-8 stream that python
accommodates with the utf-8-sig(nature) codec (versus the standard utf-8
Terry Jan Reedy
More information about the Python-list