[Python-Dev] PEP 538 (review round 2): Coercing the legacy C locale to a UTF-8 based locale

Nick Coghlan ncoghlan at gmail.com
Mon Jun 12 08:05:29 EDT 2017

On 12 June 2017 at 17:47, Martin (gzlist) <gzlist at googlemail.com> wrote:
> Thanks for replying to my points!
> On 12/06/2017, Nick Coghlan <ncoghlan at gmail.com> wrote:
>> `PYTHONIOENCODING=:strict` remains the preferred way of forcing strict
>> encoding checks on the standard streams, regardless of locale.
> Then the user of my script has to care that it's written in Python and
> set that specifically in their crontab or so on...

As Inada-san wrote, we think the right way to fix that is to make it
easier and safer for application developers to override the default
settings on the standard streams. At the moment, doing so requires
rebinding sys.stdin, sys.stdout, or sys.stderr, so you end up with
multiple Python-level streams sharing a single underlying C stream,
which can cause problems.

The basic API for that was recently merged
(`TextIOWrapper.reconfigure()`), so it's now a matter of extending it
to also allow updating `encoding` and `errors`.
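
As a rough sketch of what that could look like from the application
side: the snippet below assumes Python 3.7 or later, where
`reconfigure()` accepts `encoding` and `errors`. It uses an in-memory
`io.BytesIO` as a stand-in for the underlying C stream so it runs
without touching the real stdout; the variable names are illustrative
only.

```python
import io

# Stand-in for a standard stream: a text wrapper over an in-memory
# byte buffer, initially configured like a stream in the C locale.
raw = io.BytesIO()
stream = io.TextIOWrapper(raw, encoding="ascii", errors="strict")

# Update the encoding and error handler in place (assumes Python 3.7+).
# No second wrapper is created, so only one Python-level stream
# refers to the underlying buffer.
stream.reconfigure(encoding="utf-8", errors="backslashreplace")

stream.write("caf\u00e9")
stream.flush()

print(stream.encoding, stream.errors)
print(raw.getvalue())
```

For the real streams the equivalent call would presumably be
something like `sys.stdout.reconfigure(encoding="utf-8")`, with the
caveat that sys.stdout may have been replaced by an object that
doesn't provide `reconfigure()` at all.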

>> In addition to providing a reliable escape hatch with no other
>> potentially unwanted side effects (for when folks actually want the
>> current behaviour), the entry for the off switch in the CLI usage docs
>> also provides us with a convenient place to document the *default*
>> behaviour.
> The documentation aspect is an interesting consideration.
> Having thought about it a bit more, my preferred option is to skip
> the override only if LC_ALL or LC_CTYPE is exactly 'C'. Otherwise
> (including for LANG=C), force C.UTF-8. The CLI usage docs could have
> an LC_CTYPE entry that goes into detail about why.

LC_ALL=C doesn't actually disable the locale coercion (i.e. we still
set LC_CTYPE). The coercion just doesn't have any effect, since LC_ALL
takes precedence.
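
To illustrate the precedence point, here is a simplified model of how
the effective LC_CTYPE category is resolved from the environment
(LC_ALL first, then LC_CTYPE, then LANG). The helper names
`effective_ctype` and `coerce` are made up for this sketch, and it
only considers C.UTF-8 as the coercion target, so treat it as an
approximation rather than what the interpreter actually does.

```python
def effective_ctype(env):
    """Return the locale name that ends up governing LC_CTYPE."""
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty variables are both skipped
            return value
    return "C"  # nothing set: the default C locale applies


def coerce(env):
    """Hypothetical model of the coercion: set LC_CTYPE when the
    effective locale is the legacy C locale."""
    if effective_ctype(env) == "C":
        return dict(env, LC_CTYPE="C.UTF-8")
    return dict(env)


# LANG=C: the coercion sets LC_CTYPE and it takes effect.
print(effective_ctype(coerce({"LANG": "C"})))

# LC_ALL=C: LC_CTYPE is still set, but LC_ALL overrides it.
after = coerce({"LC_ALL": "C"})
print(after["LC_CTYPE"], effective_ctype(after))
```

In the LC_ALL=C case the model still records LC_CTYPE=C.UTF-8 in the
environment, but the effective locale stays "C".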


Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
