On Sun, Jan 24, 2021 at 10:17 AM Guido van Rossum firstname.lastname@example.org wrote:
I have definitely seen BOMs written by Notepad on Windows 10.
Why can’t the future be that open() in text mode guesses the encoding?
I don't like guessing. As a Japanese person, I have seen a lot of mojibake caused by wrong guesses. I don't think guessing the encoding is a good part of reliable software.
On the other hand, if we add `open_utf8()`, it's easy to ignore BOM:
* When reading, use "utf-8-sig". (It can also read UTF-8 without a BOM.)
* When writing, use "utf-8".
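
To make the two rules above concrete, here is a minimal sketch of what such a helper might look like. `open_utf8()` does not exist in the standard library; the signature and the mode handling below are my assumptions for illustration, not a proposed final design.

```python
def open_utf8(file, mode="r", **kwargs):
    """Hypothetical helper: open a text file as UTF-8 and ignore a BOM on read."""
    if "b" in mode:
        raise ValueError("open_utf8() only supports text mode")
    # Assumption: any mode that can write ("w", "a", "x", or "+") uses plain
    # "utf-8", so no BOM is ever emitted. Read-only modes use "utf-8-sig",
    # which skips a leading BOM if present and otherwise decodes as UTF-8.
    can_write = any(flag in mode for flag in "wax+")
    encoding = "utf-8" if can_write else "utf-8-sig"
    return open(file, mode, encoding=encoding, **kwargs)
```

With this sketch, reading a file that Notepad or Excel saved with a BOM gives the same text as reading the BOM-less version, and files written through it never start with a BOM. A mode like "r+" falls on the "utf-8" side here, so a BOM would not be stripped in that case; a real design would have to decide what to do there.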
Although UTF-8 with BOM is not recommended, and Notepad has used UTF-8 without BOM as its default encoding since Windows 10 version 1903, UTF-8 with BOM is still used in some cases. For example, Excel reads CSV files as either UTF-8 with BOM or a legacy encoding. So some CSV files are written with a BOM.