[Python-ideas] PEP 540: Add a new UTF-8 mode
M.-A. Lemburg
mal at egenix.com
Fri Jan 6 04:50:11 EST 2017
On 06.01.2017 04:32, Nick Coghlan wrote:
> On 6 January 2017 at 12:37, Victor Stinner <victor.stinner at gmail.com> wrote:
>> 2017-01-06 3:10 GMT+01:00 Stephen J. Turnbull
>> <turnbull.stephen.fw at u.tsukuba.ac.jp>:
>>> I've quoted Victor out of context, and his other posts make me very
>>> doubtful that he considers this a serious alternative. That said, I'm
>>> +1 on "don't!"
>>
>> The "always ignore locale and force UTF-8" option has supporters. For
>> example, Nick Coghlan wrote a whole PEP, PEP 538, to support this.
>
> Err, no, that's not what PEP 538 does. PEP 538 doesn't do *anything*
> if a locale is already properly configured - it only changes the
> locale if the current locale is "C".
Victor: I think you are taking the UTF-8 idea a bit too far.
Nick was trying to address the situation where the locale is
set to "C", or rather not set at all (in which case the C lib
defaults to the "C" locale). The latter is a fairly standard
situation when piping data on Unix or when spawning processes
which don't inherit the current OS environment.
The problem with the "C" locale is that the encoding defaults to
"ASCII" and thus does not allow Python to show its built-in
Unicode support.
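To make the ASCII limitation concrete, here is a minimal sketch. The
helper name can_encode and the sample string are made up for
illustration. The two printed values depend on the interpreter version
and the environment it is started in, so no particular output is
claimed for them.

```python
import locale
import sys


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "caf\u00e9"  # one non-ASCII character is enough

# ASCII cannot represent the accented character, UTF-8 can.
ascii_ok = can_encode(sample, "ascii")
utf8_ok = can_encode(sample, "utf-8")

# What the running interpreter picked up from its environment.
print("locale encoding:    ", locale.getpreferredencoding(False))
print("filesystem encoding:", sys.getfilesystemencoding())
```

Running this once normally and once with LC_ALL=C set in the
environment shows what the interpreter selects in each case.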
Nick's PEP and the discussion on the ticket
http://bugs.python.org/issue28180 are trying to address this
particular situation, not enforce any particular encoding
overriding the user's configured environment.
So I think it would be better if you'd focus your PEP on the
same situation: locale set to "C" or not set at all.
BTW: You mention a locale "POSIX" in a few places. I have
never seen this used in practice and wonder why we should
even consider this in Python as a possible work-around for
a particular set of features. The locale setting in your
environment does have a lot of influence on your user
experience, so forcing people to set a "POSIX" locale doesn't
sound like a good idea - if they have to go through the
trouble of correctly setting up their environment for Python
to correctly run, they would much more likely use the correct
setting rather than a generic one like "POSIX", which is
defined as an alias for the "C" locale and not as a separate
locale:
http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap07.html
> It's actually very similar to your PEP, except that instead of adding
> the ability to make CPython ignore the C level locale settings (which
> I think is a bad idea based on your own previous work in that area and
> on the way that CPython interacts with other C/C++ components in the
> same process and in subprocesses), it just *changes* those settings
> when we're pretty sure they're wrong.
... and this is taking the original intent of the ticket
a little too far as well :-)
The original request was to have the FS encoding default to
UTF-8, in case the locale is not set or set to "C", with the
reasoning being that this makes it easier to use Python in
environments where exactly this situation arises (see above).
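The rule in that request can be sketched roughly as follows. This is
only an illustration, not the code from the ticket: the function names
are hypothetical, the LC_ALL / LC_CTYPE / LANG lookup order follows the
usual POSIX precedence, and treating "POSIX" like "C" is an assumption
based on it being an alias.

```python
def effective_ctype_locale(env):
    """Return the locale name that governs LC_CTYPE, or None if unset.

    POSIX precedence: LC_ALL, then LC_CTYPE, then LANG. Empty values
    count as unset.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var)
        if value:
            return value
    return None


def choose_fs_encoding(env, locale_encoding):
    """Hypothetical rule: default to UTF-8 only for an unset or "C" locale."""
    name = effective_ctype_locale(env)
    if name is None or name in ("C", "POSIX"):
        return "utf-8"
    # A properly configured locale is left alone.
    return locale_encoding


unset_case = choose_fs_encoding({}, "ascii")
c_case = choose_fs_encoding({"LANG": "C"}, "ascii")
configured_case = choose_fs_encoding({"LANG": "de_DE.ISO-8859-1"}, "iso8859-1")
```

The point of the last case is that a user's configured environment is
never overridden.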
Your PEP takes this approach further by fixing the locale
setting to "C.UTF-8" in those two cases - intentionally, with all
the implications this has on other parts of the C lib.
The latter only has an effect on the C lib if the "C.UTF-8" locale
is available, which it isn't on many systems, since C locales
have to be explicitly generated.
Without the "C.UTF-8" locale available, your PEP only affects
the FS encoding, AFAICT, unless other parts of the application
try to interpret the locale env settings as well and use their
own logic for the interpretation.
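Whether "C.UTF-8" is available on a given system can be probed from
Python. A minimal sketch, assuming it is acceptable to switch LC_CTYPE
briefly and restore it afterwards (the helper name locale_available is
made up; some C libraries accept locale names more liberally than
others, so treat the result as a hint):

```python
import locale


def locale_available(name):
    """Return True if the C lib accepts `name` for LC_CTYPE."""
    saved = locale.setlocale(locale.LC_CTYPE)  # query the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, name)
    except locale.Error:
        return False
    else:
        return True
    finally:
        # Always put the previous setting back.
        locale.setlocale(locale.LC_CTYPE, saved)


has_c = locale_available("C")
has_c_utf8 = locale_available("C.UTF-8")
print("C.UTF-8 available:", has_c_utf8)
```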
For the purpose of experimentation, I would find it better
to start with just fixing the FS encoding in 3.7 and
leaving the option to adjust the locale setting turned off
by default.
--
Marc-Andre Lemburg
eGenix.com