Nicholas Bastin wrote:
It would be nice if you could optionally specify that the codec would assume UTF-16BE if no BOM was present, and not raise UnicodeError in that case, which would preserve the current behaviour as well as allow users to ask for behaviour which conforms to the standard.
Alternatively, the UTF-16BE codec could support the BOM, and decode as UTF-16LE if the "other" BOM is found.
This would also support your use case, and in a better way. The Unicode assertion that UTF-16 is BE by default is void these days - there is *always* a higher layer protocol, and it more often than not specifies (perhaps not in English words, but only in the source code of the generator) that the default should be LE.
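
To make the alternative concrete, here is a minimal sketch of the decoding rule it describes, written as a standalone function rather than as a change to the codec itself. The name decode_utf16_default_be is hypothetical and not part of any existing codec; it only illustrates "BE unless the other BOM is found" for a complete byte string.

```python
import codecs


def decode_utf16_default_be(data: bytes, errors: str = "strict") -> str:
    """Decode UTF-16 bytes, assuming big-endian unless an LE BOM says otherwise.

    Hypothetical helper for illustration only; not an existing codec.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        # The "other" BOM was found: strip it and decode as little-endian.
        return data[len(codecs.BOM_UTF16_LE):].decode("utf-16-le", errors)
    if data.startswith(codecs.BOM_UTF16_BE):
        # The expected BOM was found: strip it and decode as big-endian.
        return data[len(codecs.BOM_UTF16_BE):].decode("utf-16-be", errors)
    # No BOM at all: fall back to big-endian instead of raising an error.
    return data.decode("utf-16-be", errors)


if __name__ == "__main__":
    samples = [
        "hi".encode("utf-16-be"),                        # no BOM
        codecs.BOM_UTF16_BE + "hi".encode("utf-16-be"),  # BE BOM
        codecs.BOM_UTF16_LE + "hi".encode("utf-16-le"),  # LE BOM
    ]
    for raw in samples:
        print(repr(raw), "->", decode_utf16_default_be(raw))
```

A real codec would also have to handle incremental decoding, where the first two bytes may arrive in separate chunks; this sketch ignores that.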