On 23Jan2021 22:00, Stephen J. Turnbull wrote:
I see very little use in detecting the BOMs. I haven't seen a UTF-16 BOM in the wild in a decade (as usual for me, that's Japan-specific, and may be limited to the academic community as well), and the UTF-8 BOM is a no-op if the default is UTF-8 anyway.
I thought I'd seen them on Windows text files within the last year or so
(I don't use Windows often, so this is happenstance from receiving some
data, not an observation of the Windows ecosystem; my recollection is
that it was a UTF-16 CSV file.)
But BOMs may be commonplace. This isn't a text file example, but the
ISO14496 standard (the basis for all MOV and MP4 files) has a text field
type which may be UTF-16LE, UTF-16BE or UTF-8, detected by a BOM of the
right flavour for UTF-16 and no BOM implying UTF-8. I'm sure this is to
accommodate easy writing by various systems.
I do not consider the BOM dead, and it is so cheap to recognise that not
bothering to do so seems almost mean-spirited.
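
To illustrate how cheap, here is a minimal sketch in Python. The names
sniff_bom and decode_text are hypothetical helpers of my own, not
anything in the stdlib, and the fallback to UTF-8 when no BOM is present
is an assumption matching the rule described above. It deliberately
ignores UTF-32, whose little-endian BOM begins with the same two bytes
as the UTF-16LE one.

```python
import codecs

# Hypothetical table mapping a leading BOM to a codec name.
# Only UTF-8 and UTF-16 are handled; UTF-32 is deliberately left out.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(data, default="utf-8"):
    """Return (encoding, bom_length) for the bytes in data.

    If no known BOM is found, assume the default encoding
    and report a BOM length of 0.
    """
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return encoding, len(bom)
    return default, 0


def decode_text(data, default="utf-8"):
    """Decode data, honouring and stripping a leading BOM if present."""
    encoding, skip = sniff_bom(data, default)
    return data[skip:].decode(encoding)
```

The whole check is a few startswith() calls on the first three bytes,
so the cost is negligible next to the decode itself.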
Cheers,
Cameron Simpson