[Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

Nick Coghlan ncoghlan at gmail.com
Wed Dec 6 07:58:00 EST 2017


On 6 December 2017 at 20:38, Victor Stinner <victor.stinner at gmail.com> wrote:
> Nick:
>> So if PEP 540 is going to implicitly trigger switching encodings, it
>> needs to specify whether it's going to look for the C locale or the
>> POSIX locale (I'd suggest C locale, since that's the actual default
>> that causes problems).
>
> I'm thinking of the test already used by check_force_ascii() (the
> function checking if LC_CTYPE uses the ASCII encoding or something else):
>
>     loc = setlocale(LC_CTYPE, NULL);
>     if (loc == NULL)
>         goto error;
>     if (strcmp(loc, "C") != 0) {
>         /* the LC_CTYPE locale is different than C */
>         return 0;
>     }

Yeah, the locale coercion code changes the locale multiple times to
make sure we have a coercion target that will actually work (and then
checks nl_langinfo as well, since that sometimes breaks on BSD
systems, even if the original setlocale() call claimed to work). Once
we've found a locale that appears to work, we set the LC_CTYPE
environment variable and reload the locale from the environment.
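
As a rough illustration of that probe-then-reload sequence, here is a
minimal sketch. It is not the actual CPython code: the function names
lc_ctype_is_c() and coerce_c_locale() and the CANDIDATES list are
hypothetical, chosen only for this example, and the real implementation
handles more cases (warnings, LC_ALL, platform quirks).

```c
#define _POSIX_C_SOURCE 200809L  /* for setenv() and nl_langinfo() */

#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Assumed candidate targets, for illustration only. */
static const char *const CANDIDATES[] = {"C.UTF-8", "C.utf8", "UTF-8"};

/* Return 1 if LC_CTYPE is currently the plain "C" locale, 0 otherwise
 * (including when setlocale() cannot report the current locale). */
static int lc_ctype_is_c(void)
{
    const char *loc = setlocale(LC_CTYPE, NULL);
    return loc != NULL && strcmp(loc, "C") == 0;
}

/* Try each candidate in turn. Return the name of the first one that
 * appears to work, or NULL (with LC_CTYPE reset to "C") if none does. */
static const char *coerce_c_locale(void)
{
    size_t count = sizeof(CANDIDATES) / sizeof(CANDIDATES[0]);

    for (size_t i = 0; i < count; i++) {
        const char *target = CANDIDATES[i];

        /* Step 1: does the C library accept this locale at all? */
        if (setlocale(LC_CTYPE, target) == NULL) {
            continue;
        }
        /* Step 2: double-check that the codeset query gives an answer,
         * since setlocale() succeeding is not always enough. */
        const char *codeset = nl_langinfo(CODESET);
        if (codeset == NULL || *codeset == '\0') {
            continue;
        }
        /* Step 3: record the choice in the environment so that child
         * processes inherit it. */
        if (setenv("LC_CTYPE", target, 1) != 0) {
            continue;
        }
        /* Step 4: reload LC_CTYPE from the environment. If LC_ALL is set,
         * it takes precedence over LC_CTYPE here. This sketch does not
         * restore the previous LC_CTYPE value if the reload fails. */
        if (setlocale(LC_CTYPE, "") == NULL) {
            continue;
        }
        return target;
    }

    /* Nothing worked: go back to the plain C locale. */
    setlocale(LC_CTYPE, "C");
    return NULL;
}
```

A caller would typically check lc_ctype_is_c() first and only call
coerce_c_locale() when it returns 1, so that an explicitly configured
locale is left alone.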

It's all annoyingly convoluted and arcane, but it works well enough
for https://github.com/python/cpython/blob/master/Lib/test/test_c_locale_coercion.py
to pass across the full BuildBot fleet :)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

