Wrong default endianness in utf-16 and utf-32 !?

John Machin sjmachin at lexicon.net
Tue Oct 12 22:00:40 CEST 2010


jmfauth <wxjmfauth <at> gmail.com> writes:

> When endianness is not specified (the unmarked forms, as opposed
> to BE or LE), the Unicode Consortium specifies that the default
> byte serialization should be big-endian.
> 
> See http://www.unicode.org/faq//utf_bom.html
> Q: Which of the UTFs do I need to support?
> and
> Q: Why do some of the UTFs have a BE or LE in their label,
> such as UTF-16LE?

Sometimes it is necessary to read right to the end of an answer:

Q: Why do some of the UTFs have a BE or LE in their label, such as UTF-16LE?

A: [snip] the unmarked form uses big-endian byte serialization by default, but
may include a byte order mark at the beginning to indicate the actual byte
serialization used.
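
To see how that last clause plays out in practice, here is a minimal sketch
using CPython's built-in codecs (Python 3 syntax; the variable names are mine,
for illustration only). It shows only the parts that do not depend on the
platform: the labelled forms never write a BOM, the unmarked "utf-16" encoder
writes a BOM in front of native-order data, and the unmarked decoder follows
whatever BOM it finds.

```python
import codecs
import sys

text = "A"

# Labelled forms: the label fixes the byte order and no BOM is written.
be = text.encode("utf-16-be")   # two bytes, high byte first
le = text.encode("utf-16-le")   # two bytes, low byte first

# Unmarked form: CPython writes a BOM followed by native-order data,
# so the actual serialization is always indicated.
unmarked = text.encode("utf-16")
if sys.byteorder == "little":
    native_bom, native_payload = codecs.BOM_UTF16_LE, le
else:
    native_bom, native_payload = codecs.BOM_UTF16_BE, be
starts_with_bom = unmarked.startswith(native_bom)

# When a BOM is present, the unmarked decoder honours it regardless
# of the byte order of the machine doing the decoding.
from_be = (codecs.BOM_UTF16_BE + be).decode("utf-16")
from_le = (codecs.BOM_UTF16_LE + le).decode("utf-16")

print(be, le, unmarked, from_be, from_le)
```

What the unmarked decoder does when there is no BOM at all is left out of the
sketch on purpose, since the result depends on the platform's byte order.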



