On Thu, Feb 11, 2021 at 4:44 PM Jim J. Jewett wrote:
The PEP helps when the locale is ASCII or C, but that isn't enforced in actual files. I am confident that this is a frequent problem for packages downloaded from mostly-English sites, including many software repositories.
The PEP helps developers working in a UTF-8 locale find missing `encoding="utf-8"` bugs. This type of bug is very common, and many Windows users suffer from it when reading JSON, YAML, TOML, Markdown, or any other UTF-8 files.
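To make the bug concrete, here is a minimal sketch. It assumes cp1252 as a stand-in for a typical Windows locale encoding, and the file name and helper `demo()` are only for illustration. The code passes the encoding explicitly so the result is the same on any platform; in real code the bug appears when `encoding` is simply omitted.

```python
import json
import tempfile
from pathlib import Path


def demo():
    """Write a UTF-8 JSON file, then read it back with two encodings."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "config.json"  # hypothetical file name
        data = json.dumps({"name": "caf\u00e9"}, ensure_ascii=False)
        path.write_text(data, encoding="utf-8")

        # What a Windows user effectively gets when `encoding` is omitted
        # (assumption: the locale encoding is cp1252).
        with open(path, encoding="cp1252") as f:
            wrong = json.load(f)

        # What the author of the file intended.
        with open(path, encoding="utf-8") as f:
            right = json.load(f)

    return wrong["name"], right["name"]


wrong_name, right_name = demo()
print(repr(wrong_name), repr(right_name))
```

Nothing fails in the wrong case here; the data is silently corrupted, which is why this kind of bug is hard to notice for developers on a UTF-8 locale.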
It does not seem to be a win when the locale is something incompatible with utf-8, such as Latin-1, or whatever is still common in Japan. The surrogate-escape mechanism allows a proper round-trip, but python itself will stop processing the characters correctly.
The surrogate-escape mechanism is not related to this PEP.
For interactive use, when talking to another program (such as a terminal) instead of an already existing file, the backwards compatibility problem seems worse.
This PEP is 100% backward compatible.
Changing the default to utf-8 (after a deprecation period showing how to make locale an explicit default) may be reasonable, but claiming that it is backwards compatible ... I didn't get that impression from the PEP.
This PEP doesn't propose to change the default encoding.
*If* we decide to change the default encoding in the future (maybe
2025 or later) and start emitting DeprecationWarning where the
`encoding` option is omitted, this PEP helps in two ways:
* The `encoding="locale"` option can be used from Python 3.10 onward, and
* Fewer DeprecationWarnings will be shown, because we can add
`encoding="utf-8"` in many places beforehand. At the very least, we
can fix all EncodingWarnings in the stdlib.
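As a rough sketch of what the two kinds of fix look like in user code: the helper names `write_utf8_text()` and `read_locale_text()` are hypothetical, and the fallback branch for Python older than 3.10 is my assumption about how one might keep the same behavior there.

```python
import locale
import sys


def write_utf8_text(path, text):
    """Write a file whose format is defined as UTF-8 (JSON, TOML, ...)."""
    # Explicit encoding: behaves the same in every locale.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


def read_locale_text(path):
    """Read a file that is really expected to use the locale encoding."""
    if sys.version_info >= (3, 10):
        # Explicit opt-in to the locale encoding (Python 3.10+).
        with open(path, encoding="locale") as f:
            return f.read()
    # Older versions: name the locale encoding ourselves (assumption).
    with open(path, encoding=locale.getpreferredencoding(False)) as f:
        return f.read()
```

To find the places that still omit `encoding`, the warning has to be enabled explicitly, for example with `python -X warn_default_encoding` or the `PYTHONWARNDEFAULTENCODING` environment variable. It is off by default.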
Maybe the section title "Prepare to change the default encoding to UTF-8" is misleading.
I will try to fix or remove that section.
--
Inada Naoki