On Mon, Jan 25, 2021 at 8:51 PM Inada Naoki <songofacandy@gmail.com> wrote:
> On Tue, Jan 26, 2021 at 10:22 AM Guido van Rossum <guido@python.org> wrote:
> > Older Pythons may be easy to drop, but I'm not so sure about older unofficial docs. The open() function is very popular and there must be millions of blog posts with examples using it, most of them reading text files (written by bloggers naive in Python but good at SEO).
> >
> > I would be very sad if the official recommendation had to become "[for the most common case] avoid open(filename), use open_text(filename)".
>
> I agree with that. But until we switch the default encoding of open(),
> we must recommend avoiding `open(filename)` anyway.
> The default encoding of VS Code, Atom, Notepad is already UTF-8.

Maybe we're overthinking this - do we really need to recommend avoiding `open(filename)` in all cases? Isn't it just fine to use if `locale.getpreferredencoding(False)` is UTF-8, since in that case there won't be any change in behavior when `open` switches from the old, locale-specific default to the new, always UTF-8 default?
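
To make that condition concrete, here is a rough sketch of the check I have in mind. The helper names `is_utf8` and `locale_is_utf8` are made up for illustration; only `locale.getpreferredencoding` and `codecs.lookup` are real stdlib calls. I normalize through `codecs.lookup` on the assumption that different platforms may spell the name differently ("UTF-8", "utf8", and so on).

```python
import codecs
import locale


def is_utf8(encoding_name):
    """Return True if encoding_name is some spelling of UTF-8 (hypothetical helper)."""
    try:
        # codecs.lookup() normalizes aliases such as "UTF-8" and "utf8".
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        # Unknown codec name: treat it as "not UTF-8".
        return False


def locale_is_utf8():
    """Return True if open() without encoding= would currently use UTF-8."""
    return is_utf8(locale.getpreferredencoding(False))
```

If `locale_is_utf8()` is true, a bare `open(filename)` should behave the same before and after the switch.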

If that's the case, then it would be less of a backwards incompatibility issue, since most production environments will already be using a UTF-8 locale (by virtue of it being the norm on Unix systems and servers).

And if that's the case, all we need is a warning that is raised only when open() is called in text mode without an explicit encoding and the system locale is not UTF-8. That warning could say something like:

Your system is currently configured to use shift_jis for text files.
Beginning in Python 3.13, open() will always use utf-8 for text files instead.
For compatibility with future Python versions, pass open() the extra argument:
    encoding="shift_jis"
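
As a very rough sketch of the logic (not a proposal for how it would actually be implemented inside open() itself): `default_encoding_message` and `open_checked` are hypothetical names, and the choice of `FutureWarning` as the category is my assumption, not something that has been decided.

```python
import codecs
import locale
import warnings


def _is_utf8(encoding_name):
    """Return True if encoding_name is some spelling of UTF-8."""
    try:
        return codecs.lookup(encoding_name).name == "utf-8"
    except LookupError:
        return False


def default_encoding_message(mode="r", encoding=None, locale_encoding=None):
    """Return the warning text for this call, or None if no warning applies."""
    if "b" in mode or encoding is not None:
        return None  # binary mode or explicit encoding: behavior won't change
    if locale_encoding is None:
        locale_encoding = locale.getpreferredencoding(False)
    if _is_utf8(locale_encoding):
        return None  # already UTF-8: the new default changes nothing
    return (
        f"Your system is currently configured to use {locale_encoding} "
        "for text files.\n"
        "Beginning in Python 3.13, open() will always use utf-8 "
        "for text files instead.\n"
        "For compatibility with future Python versions, "
        "pass open() the extra argument:\n"
        f'    encoding="{locale_encoding}"'
    )


def open_checked(file, mode="r", **kwargs):
    """Hypothetical wrapper: warn if needed, then defer to the real open()."""
    message = default_encoding_message(mode, kwargs.get("encoding"))
    if message is not None:
        # stacklevel=2 attributes the warning to the caller's line.
        warnings.warn(message, FutureWarning, stacklevel=2)
    return open(file, mode, **kwargs)
```

Anyone on a UTF-8 locale, or already passing encoding=, would never see the warning.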

~Matt