[Python-Dev] PEP 540: Add a new UTF-8 mode

Victor Stinner victor.stinner at gmail.com
Tue Dec 5 17:50:57 EST 2017


Chris:
> I just took another look at 538 -- and yes, the relationship between the two
> is really unclear. In particular, with 538, why do we need 540? I honestly
> don't know.

PEP 538 only affects platforms which provide the C.UTF-8 locale or
a variant: only a few recent Linux distributions. I know Fedora has it,
maybe a few others do too? FreeBSD and macOS are completely ignored by
PEP 538. PEP 540 uses the UTF-8 encoding for the POSIX locale on
*all* platforms.

Moreover, PEP 538 only concerns the POSIX locale (locale "C"),
whereas PEP 540 is usable with any locale. For example, with the
"fr_FR.iso88591" locale, the encoding is Latin1. But if you enable the
UTF-8 mode with this locale, Python will use UTF-8.
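
A rough sketch of that last point, assuming Python 3.7 or newer (where
the UTF-8 mode is exposed as the "-X utf8" option, the PYTHONUTF8
environment variable and sys.flags.utf8_mode). The probe() helper is
hypothetical and only written for this illustration; the Latin1 result
in the first call also assumes the fr_FR.iso88591 locale is installed.

```python
import os
import subprocess
import sys

# Code run in a child interpreter: report the UTF-8 mode flag and the
# normalized name of the encoding Python prefers for text I/O.
PROBE = (
    "import codecs, locale, sys; "
    "print(sys.flags.utf8_mode, "
    "codecs.lookup(locale.getpreferredencoding(False)).name)"
)


def probe(extra_args=(), **env_overrides):
    """Hypothetical helper: run PROBE in a fresh interpreter."""
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # start from a known state
    env.update(env_overrides)
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", PROBE],
        env=env, capture_output=True, text=True, check=True,
    )
    flag, encoding = result.stdout.split()
    return int(flag), encoding


if __name__ == "__main__":
    # Locale alone: Latin1 only if fr_FR.iso88591 exists on this system.
    print(probe(LC_ALL="fr_FR.iso88591"))
    # Same locale with the UTF-8 mode enabled explicitly: UTF-8 is used.
    print(probe(["-X", "utf8"], LC_ALL="fr_FR.iso88591"))
```

Setting PYTHONUTF8=1 in the environment should have the same effect as
passing "-X utf8" on the command line.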

The other difference is that PEP 538 is implemented with
setlocale(LC_CTYPE, "C.UTF-8"), whereas PEP 540 is implemented in
Python internals and ignores the locale. The scope of PEP 540 is limited
to Python: non-Python code running in the same process is not aware of
the "Python UTF-8 mode".
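
To illustrate, here is a small sketch, again assuming Python 3.7 or
newer. It starts a child interpreter in the "C" locale with the UTF-8
mode enabled, and sets PYTHONCOERCECLOCALE=0 to switch off the PEP 538
locale coercion so that only PEP 540 is at work. The
inspect_utf8_mode() helper is a made-up name for this example.

```python
import json
import os
import subprocess
import sys

# Code run in the child: query (not change) the current LC_CTYPE locale
# and report the encodings chosen by Python itself.
CHILD = (
    "import json, locale, sys; "
    "print(json.dumps({"
    "'utf8_mode': sys.flags.utf8_mode, "
    "'lc_ctype': locale.setlocale(locale.LC_CTYPE), "
    "'fs_encoding': sys.getfilesystemencoding()}))"
)


def inspect_utf8_mode():
    """Hypothetical helper: UTF-8 mode on, PEP 538 coercion off."""
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)
    env.update({"LC_ALL": "C", "PYTHONCOERCECLOCALE": "0"})
    result = subprocess.run(
        [sys.executable, "-X", "utf8", "-c", CHILD],
        env=env, capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    # Python reports UTF-8 for its own encodings, while 'lc_ctype'
    # shows the locale that C code in the same process still sees.
    print(inspect_utf8_mode())
```

On a typical Linux system I would expect 'lc_ctype' to stay at "C" here:
the UTF-8 mode changes what Python does, not what setlocale() reports
to C libraries loaded in the process.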

Victor

