[Python-ideas] PEP 540: Add a new UTF-8 mode

Nick Coghlan ncoghlan at gmail.com
Sat Jan 7 20:08:01 EST 2017


On 8 January 2017 at 02:47, Stephen J. Turnbull
<turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:
> I agree that people around me mostly know only two encodings: "works
> for me" and "mojibake", but they also use locales configured for them
> by technical staff.  On top of that, international students (the most
> likely victims of "UTF-8 by default" because students are the biggest
> Python users) typically have non-Japanese locales set on their
> imported computers.
>
> I'm not going to say my experience is typical enough to block "UTF-8
> by default", but let's do this very carefully with thought.

Unsurprisingly (given where I work [1]), one of my key concerns is to
enable large Python-using institutions to keep moving
forward, regardless of whether they've fully standardised their
internal environments on UTF-8 or not. As such, while I'm entirely in
favour of pushing people towards UTF-8 as the default choice
everywhere, I also want to make sure that system and application
integrators, including the folks responsible for defining the Standard
Operating Environments in large organisations, get warnings of
potential problems when they arise, and continue to get encoding
errors when we have definitive evidence of a compatibility problem.

For me, that boils down to:

- if a locale is properly configured, we'll continue to respect it
- if we're ignoring or changing the locale setting without an explicit
  config option, we'll emit a warning on stderr that we're doing so
  (*without* using the warnings system, so there's no way to turn it
  into an exception)
- if a UTF-8 based Linux container is run on a
  GB-18030/ISO-2022/Shift-JIS/etc host and tries to exchange locally
  encoded data with that host (rather than exchanging UTF-8 encoded data
  over a network connection), getting an exception is preferable to
  silently corrupting the data stream
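
As a rough illustration of those three rules (not a proposed
implementation), here is a minimal Python sketch. The helper names
choose_encoding and decode_host_data are hypothetical, and treating an
ASCII-only locale (such as the default "C" locale) as "not properly
configured" is an assumption made purely for the example:

```python
import codecs
import sys


def choose_encoding(locale_encoding, explicit_override=None, stream=None):
    """Hypothetical helper: pick a text encoding following the rules above."""
    if stream is None:
        stream = sys.stderr
    if explicit_override is not None:
        # An explicit config option was supplied, so no warning is needed.
        return codecs.lookup(explicit_override).name
    name = codecs.lookup(locale_encoding).name
    if name != "ascii":
        # The locale looks properly configured: respect it.
        return name
    # Assumption for this sketch: an ASCII-only locale counts as
    # "not properly configured", so we override it with UTF-8.
    # The notice is written straight to the stream rather than through
    # the warnings module, so it cannot be turned into an exception.
    stream.write("warning: ignoring ASCII-only locale, using UTF-8 instead\n")
    return "utf-8"


def decode_host_data(data, encoding):
    """Hypothetical helper: decode locally encoded bytes from the host."""
    # Strict error handling raises UnicodeDecodeError on a mismatch
    # instead of silently producing corrupted text.
    return data.decode(encoding, errors="strict")
```

With this sketch, handing Shift-JIS encoded bytes from the host to
decode_host_data(..., "utf-8") raises UnicodeDecodeError rather than
returning mojibake, which is the behaviour the third rule asks for.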

(I think I'll add something along those lines to PEP 538 as a new
"Core Design Principles" section)

Cheers,
Nick.

[1] https://docs.python.org/devguide/motivations.html#published-entries

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

