[Linux-SIG] PEP 538: Coercing the legacy C locale to C.UTF-8

Nick Coghlan ncoghlan at gmail.com
Wed Jan 4 21:09:31 EST 2017


On 5 January 2017 at 04:08, Robert Collins <robertc at robertcollins.net> wrote:
> One thing: is the warning on stderr really needed? It's pretty poor form for
> the VM to communicate with the user except in extraordinary circumstances
> like exception handler of last resort.

I think it's appropriate in this case, as the major problem here is
that the assumption of ASCII as the preferred text encoding in the "C"
locale is an antiquated Anglo-centric default in *C* that has been
superseded by UTF-8 or UTF-16-LE in newer languages and runtimes.
While I'm not planning to hold my breath, I kinda hope we'll see a
change over the next several years where C.ASCII and POSIX.ASCII are
introduced as aliases for the current C and POSIX locales, with those
two names subsequently transitioning to become aliases for C.UTF-8
(with the latter being reported as the canonical locale name so
fallbacks like the one proposed in PEP 538 don't trigger).

So what this PEP is doing is taking CPython and saying that we no
longer respect the default locale handling behaviour defined by the C
language standards, as we think they're wrong (based on our
experiences attempting to use them in a multilingual, locale-dependent
application), and so we're overruling them. If people insist on using
the default locale anyway, their Python 3 runtime isn't going to work
properly, and that isn't a bug in CPython; it's a bug in their
environment configuration.
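
As a rough illustration of the kind of override being described, here
is a minimal sketch in Python. It is not CPython's actual startup
code: the helper names, the warning text, and the choice to leave
LC_ALL alone are all assumptions made for the example. The only parts
taken from the discussion are the legacy locale names (C and POSIX)
and the C.UTF-8 target. The lookup order LC_ALL, LC_CTYPE, LANG is the
standard POSIX precedence.

```python
import sys

LEGACY_LOCALES = ("C", "POSIX")
COERCION_TARGET = "C.UTF-8"  # the target named in the PEP title


def effective_ctype_locale(environ):
    """Resolve the locale name for LC_CTYPE using POSIX precedence."""
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = environ.get(var)
        if value:  # empty values count as unset
            return value
    return "C"  # nothing configured, so the default locale applies


def coerce_legacy_locale(environ, stream=sys.stderr):
    """Return a copy of environ, overriding LC_CTYPE for a legacy locale.

    Hypothetical helper: the warning wording is invented for this sketch.
    """
    coerced = dict(environ)
    if effective_ctype_locale(environ) not in LEGACY_LOCALES:
        return coerced  # a real locale is configured; nothing to do
    if environ.get("LC_ALL"):
        # LC_ALL would override LC_CTYPE anyway, so this sketch
        # (by assumption) leaves such environments untouched.
        return coerced
    coerced["LC_CTYPE"] = COERCION_TARGET
    print(
        f"warning: legacy C locale detected, using {COERCION_TARGET} "
        "for LC_CTYPE instead",
        file=stream,
    )
    return coerced
```

Calling coerce_legacy_locale({"LANG": "C"}) returns a mapping with
LC_CTYPE set to C.UTF-8 and writes one line to stderr, while an
environment such as {"LANG": "en_AU.UTF-8"} passes through unchanged
and silently.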

That gives the rationale for the two different warnings:

- overruling the default C locale after decades of respecting it (or
at least trying to) is a big deal, and hence worth warning about
- Python 3's Unicode support is genuinely unreliable in the C locale
(everywhere other than Mac OS X), and hence worth warning about
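
To make the second point concrete: when ASCII is assumed as the text
encoding, any non-ASCII text fails at the encoding boundary. The
snippet below is only a sketch of that failure mode. The helper name
and the sample string are made up for illustration, and it encodes
explicitly rather than depending on the locale of whoever runs it.

```python
def encodes_cleanly(text, encoding):
    """Report whether text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "Zürich"  # arbitrary non-ASCII example string

print(encodes_cleanly(sample, "ascii"))  # False: "ü" has no ASCII encoding
print(encodes_cleanly(sample, "utf-8"))  # True
```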

> We have verbose mode, and warning on that would make sense to me.

Folks that encounter Python 3's deficiencies in the C locale (or hit
integration issues arising from the new implicit locale
reconfiguration behaviour) without a runtime warning of some kind
aren't likely to think "I should run Python in verbose mode to learn
more about what's going on". They're more likely to think "Python 3 is
broken, I'm going to use something else".

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

