[Python-Dev] Late Python 3.7.1 changes to fix the C locale coercion (PEP 538) implementation
ncoghlan at gmail.com
Wed Sep 19 03:49:40 EDT 2018
I think the changes to both master and the 3.7 branch should be reverted.
For 3.7, I already said that I think we should just accept that that
ship has sailed with 3.7.0 and leave the as-shipped implementation
alone for the rest of the 3.7 series:
It isn't the way I intended it to work, but the kinds of large scale
architectural changes the intended implementation is designed to cope
with aren't going to happen on a maintenance branch anyway.
For 3.8, after Victor's rushed changes have been reverted, my PR
should be conflict free again, and we'll be able to get PEP 538 back
to working the way it was always supposed to work (while keeping the
genuine stdio handling fixes that Victor's refactoring provided):
On Tue, 18 Sep 2018 at 11:42, Ned Deily <nad at python.org> wrote:
> On Sep 17, 2018, at 21:20, Victor Stinner <vstinner at redhat.com> wrote:
> > tl;dr Nick, Ned, INADA-san: I modified 3.7.1 to add a new "-X
> > coerce_c_locale=value" option and make sure that the C locale coercion
> > cannot be enabled when Python is embedded: are you ok with these changes?
> > Before 3.7.0 release, during the implementation of the UTF-8 Mode (PEP
> > 540), I changed two things in Nick Coghlan's implementation of the C
> > locale coercion (PEP 538):
> > (1) PYTHONCOERCECLOCALE environment variable is now ignored when -E or
> > -I command line option is used.
> > (2) When Python is embedded, the C locale coercion is now enabled if
> > the LC_CTYPE locale is "C".
> > Nick asked me to change the behavior:
> > https://bugs.python.org/issue34589
> > I just pushed this change in the 3.7 branch which adds a new "-X
> > coerce_c_locale=value" option:
> > https://github.com/python/cpython/commit/144f1e2c6f4a24bd288c045986842c65cc289684
> > Examples using Python 3.7 (future 3.7.1) with UTF-8 Mode disabled, to
> > only test the C locale coercion:
> > ---
> > $ cat test.py
> > import codecs, locale
> > enc = locale.getpreferredencoding()
> > enc = codecs.lookup(enc).name
> > print(enc)
> > $ export LC_ALL= LC_CTYPE=C LANG=
> > # Disable C locale coercion: get ASCII as expected
> > $ PYTHONCOERCECLOCALE=0 ./python -X utf8=0 test.py
> > ascii
> > # -E ignores PYTHONCOERCECLOCALE=0:
> > # C locale is coerced, we get UTF-8
> > $ PYTHONCOERCECLOCALE=0 ./python -E -X utf8=0 test.py
> > utf-8
> > # -X coerce_c_locale=0 is not affected by -E:
> > # C locale coercion disabled as expected, get ASCII as expected
> > $ ./python -E -X utf8=0 -X coerce_c_locale=0 test.py
> > ascii
> > ---
> > For (1), Nick's use case is to get Python 3.6 behavior (C locale not
> > coerced) on Python 3.7 using PYTHONCOERCECLOCALE. Nick proposed to use
> > PYTHONCOERCECLOCALE even with -E or -I, but I dislike introducing a
> > special case for -E option.
> > I chose to add a new "-X coerce_c_locale=0" to Python 3.7.1 to provide
> > a solution for this use case. (Python 3.7.0 and older ignore this
> > option.)
> > Note: Python 3.7.0 is fine with PYTHONCOERCECLOCALE=0, we are only
> > talking about the special case of -E and -I options.
> > For (2), I modified Python 3.7.1 to make sure the C locale is never
> > coerced when the C API is used to embed Python inside an application:
> > Py_Initialize() and Py_Main(). The C locale can only be coerced by the
> > official Python program ("python3.7").
> > I don't know if it should be possible to enable C locale coercion when
> > Python is embedded. So I just made the change requested by Nick :-)
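[Editorial sketch, not part of the original message.] The rules described above for the proposed 3.7.1 behaviour can be restated as a small decision function. This is only a toy model for reading the thread, not CPython's implementation: the function name `c_locale_coerced` and all of its parameters are hypothetical, the "POSIX" locale alias is ignored, and the rule that an explicit -X option takes priority over the environment variable is an assumption the message does not spell out.

```python
from typing import Mapping, Optional


def c_locale_coerced(
    lc_ctype: str,
    environ: Mapping[str, str],
    ignore_environment: bool = False,          # models -E / -I
    x_coerce_c_locale: Optional[str] = None,   # models -X coerce_c_locale=value
    embedded: bool = False,                    # models Py_Initialize() / Py_Main()
) -> bool:
    """Toy model: would the C locale be coerced under the proposed 3.7.1 rules?"""
    # (2) In the proposed change, embedding applications never get coercion.
    if embedded:
        return False
    # Coercion is only relevant when the LC_CTYPE locale is "C".
    if lc_ctype != "C":
        return False
    # Assumption: an explicit -X option wins over the environment variable.
    if x_coerce_c_locale is not None:
        return x_coerce_c_locale != "0"
    # (1) PYTHONCOERCECLOCALE is only consulted when -E / -I is not used.
    if not ignore_environment and environ.get("PYTHONCOERCECLOCALE") == "0":
        return False
    return True
```

Under these assumptions the three shell examples above map to `False` (env var set to 0, no -E), `True` (env var set to 0 but -E given), and `False` (-E with -X coerce_c_locale=0).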
> > I dislike doing such late changes in 3.7.1, especially since PEP 538
> > has been designed by Nick Coghlan, and we disagree on the fix. But Ned
> > Deily, our Python 3.7 release manager, wants to see last 3.7 fixes
> > merged before Tuesday, so here we are.
> Just because the 3.7.1rc is scheduled doesn't mean we should throw something in, particularly if it's not fully reviewed and fully agreed upon. If it's important enough, we could delay the rc a few days ... or decide to wait for 3.7.2.
> > Nick, Ned, INADA-san: are you ok with these changes?
> > The other choices for 3.7.1 are:
> > * Revert my change: C locale coercion can still be enabled when Python
> > is embedded, -E option ignores PYTHONCOERCECLOCALE env var.
> > * Revert my change and apply Nick's PR 9257: C locale coercion cannot
> > be enabled when Python is embedded and -E option doesn't ignore
> > PYTHONCOERCECLOCALE env var.
> > I spent months fixing the master branch to support all possible
> > locales and encodings, and get a consistent CLI:
> > https://vstinner.github.io/python3-locales-encodings.html
> > So I'm not excited by Nick's PR, which IMHO moves Python backward,
> > especially since it breaks the -E option contract: it doesn't ignore
> > the PYTHONCOERCECLOCALE env var.
> I would like to see Nick review the merged 3.7 PR and have both him and you agree that this is the thing to do for 3.7.1. I also want to make sure we understand what effect this will have on 3.7.0 users. Let's not potentially make things worse.
> I'm not planning to tag 3.7.1rc for at least another 18 hours. I'm marking bpo-34589 as "release blocker" and I will not proceed until this is resolved.
> Ned Deily
> nad at python.org
--
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia