[Python-ideas] PEP 540: Add a new UTF-8 mode

Victor Stinner victor.stinner at gmail.com
Fri Jan 6 06:24:49 EST 2017


2017-01-06 8:21 GMT+01:00 INADA Naoki <songofacandy at gmail.com>:
> I want UTF-8 mode is enabled by default (opt-out option) even if
> locale is not POSIX,
> like `PYTHONLEGACYWINDOWSFSENCODING`.

You do, I don't :-)

It shouldn't be hard to find very concrete issues from the mojibake
issues described at:
https://www.python.org/dev/peps/pep-0540/#expected-mojibake-issues

IMHO there are 3 steps before being able to reach your dream:

1) add opt-in support for UTF-8
2) use UTF-8 if the locale is POSIX
3) UTF-8 is enabled by default

I would prefer to begin with a first Python release at stage (1) or
(2), wait for user complaints, and later decide if we can move to (3).
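
To make the difference between the three stages concrete, here is a
minimal sketch of the decision each stage would make. The function
name, its parameters and the stage numbering as an integer are all
hypothetical, chosen only for illustration; nothing here is an actual
CPython API.

```python
def utf8_mode_enabled(stage, locale_name, opt_in=False, opt_out=False):
    """Return True if the UTF-8 mode would be active (illustrative only).

    stage       -- 1, 2 or 3, matching the numbered steps above
    locale_name -- name of the current locale, e.g. "C" or "fr_FR.UTF-8"
    opt_in      -- user explicitly asked for UTF-8 (hypothetical switch)
    opt_out     -- user explicitly refused UTF-8 (hypothetical switch)
    """
    if opt_out:
        return False
    if opt_in:
        return True
    if stage == 1:
        # Stage 1: UTF-8 only when explicitly requested.
        return False
    if stage == 2:
        # Stage 2: UTF-8 also when the locale is the POSIX ("C") locale.
        return locale_name in ("C", "POSIX")
    # Stage 3: UTF-8 unless the user opts out.
    return True
```

With a locale such as "fr_FR.ISO8859-1" and no explicit switch, only
stage (3) turns the mode on, which is the case where mojibake becomes
possible.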

I haven't implemented PEP 540 yet, so I haven't been able to
experiment with anything in practice.

Well, at least it means that I have to elaborate on the "Always use
UTF-8" alternative of my PEP to explain why I consider that we are not
ready to switch directly to this "obvious" option.


> Users depends on locale know what locale is and how to configure it.

It's not a matter of users, but a matter of code in the wild which
directly calls C functions like mbstowcs() or wcstombs(). These
functions use the current locale encoding; they are not aware of the
new Python UTF-8 mode.
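
A small sketch of what goes wrong when two layers disagree on the
encoding. The helper decode_with_locale() is a hypothetical stand-in
for C code calling mbstowcs(); it simply decodes with whatever
encoding the "locale" claims, and Latin-1 is picked here only as an
example of a non-UTF-8 locale encoding.

```python
def decode_with_locale(data: bytes, locale_encoding: str) -> str:
    """Hypothetical stand-in for a C library calling mbstowcs()."""
    return data.decode(locale_encoding)


# Python (in UTF-8 mode) would hand these bytes to the library.
utf8_bytes = "\u00e9".encode("utf-8")  # "e" with acute accent: b"\xc3\xa9"

# A library that agrees on UTF-8 gets the original character back.
as_utf8 = decode_with_locale(utf8_bytes, "utf-8")

# A library still using a Latin-1 locale sees two characters: mojibake.
as_latin1 = decode_with_locale(utf8_bytes, "latin-1")

print(ascii(as_utf8), ascii(as_latin1))
```

The bytes are identical in both calls; only the encoding assumed by
the callee differs, and Python has no way to change that assumption
from the outside.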


> But many people lives in "UTF-8 everywhere" world, and don't know about locale.

PEP 540 was written to help users in very concrete cases. I have been
repeating since Python 3.0 that users must learn how to configure
their locale. Well, 8 years later, I keep getting exactly the same
user complaints: "Python doesn't work, it must just work!".

It's really hard to decode bytes and later encode the text while
preventing any kind of encoding error. That's why no solution was
proposed before.
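
As an illustration of the difficulty, here is a minimal sketch using
the surrogateescape error handler from PEP 383, which lets undecodable
bytes survive a decode/encode round trip. The file name and the helper
names are made up for the example.

```python
def decode_lossless(raw: bytes) -> str:
    """Decode as UTF-8, smuggling invalid bytes through as lone surrogates."""
    return raw.decode("utf-8", "surrogateescape")


def encode_lossless(text: str) -> bytes:
    """Inverse of decode_lossless(): restore the original bytes."""
    return text.encode("utf-8", "surrogateescape")


# A Latin-1 encoded file name: b"\xe9" alone is not valid UTF-8.
raw_name = b"caf\xe9.txt"

text = decode_lossless(raw_name)
round_trip = encode_lossless(text)

# The round trip is lossless, but a strict encoder still rejects the text.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The round trip only works if every encoder along the way uses the same
error handler; any strict encoder in between still raises an error.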


> `-X utf8` option should be parsed before converting commandline (...)

Yeah, that's a tough technical issue. I'm not sure right now how to
implement this with a clean design. Maybe I will just try with a hack?
:-)

Victor

