[Python-ideas] RFC: PEP 540 version 3 (Add a new UTF-8 mode)

Oleg Broytman phd at phdru.name
Thu Jan 12 12:18:39 EST 2017


On Thu, Jan 12, 2017 at 06:10:35PM +0100, Victor Stinner <victor.stinner at gmail.com> wrote:
> 2017-01-12 17:10 GMT+01:00 Oleg Broytman <phd at phdru.name>:
> >> Does it work to use a locale with encoding A for LC_CTYPE and a locale
> >> with encoding B for LC_MESSAGES (and others)? Is there a risk of
> >
> >    It does when B is a subset of A (ascii and koi8; ascii and utf8, e.g.)
> 
> My question is more when A and B encodings are not compatible.
[skip time example]
> Well, since we are talking about the POSIX locale which usually uses
> ASCII, it shouldn't be an issue in practice for the PEP 538. I was
> just curious :-)

   Of course you get mojibake. You can get mojibake even with compatible
encodings:

$ echo $LC_CTYPE
ru_RU.KOI8-R
$ LC_TIME=ru_RU.UTF-8 date
п╖я┌ я▐п╫п╡ 12 20:14:08 MSK 2017
^^^^^^^^^^^^^^^^^ mojibake!

$ echo $LC_CTYPE
ru_RU.UTF-8
$ LC_TIME=ru_RU.KOI8-R date
?? ??? 12 20:15:20 MSK 2017
^^^^^^ mojibake!
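
   For anyone who wants to reproduce both cases without switching locales, here is a rough Python sketch. It assumes that date writes bytes in the LC_TIME charset and that the terminal decodes them in the LC_CTYPE charset. The helper name as_displayed is made up for illustration, and the input string is the abbreviated date that the first transcript appears to garble. A real terminal may show "?" where Python produces U+FFFD.

```python
# Hypothetical helper: encode text the way the program would (LC_TIME charset),
# then decode the bytes the way the terminal would (LC_CTYPE charset).
def as_displayed(text, produced_in, displayed_as):
    raw = text.encode(produced_in)
    # errors="replace" substitutes U+FFFD for undecodable bytes, which is
    # roughly what a terminal does when it prints "?".
    return raw.decode(displayed_as, errors="replace")


# Abbreviated Russian "Thu Jan 12", the part of the date output that is localized.
stamp = "Чт янв 12"

# LC_TIME=ru_RU.UTF-8 on a KOI8-R terminal: every UTF-8 byte is a valid
# KOI8-R character, so you get box-drawing noise instead of an error.
utf8_on_koi8 = as_displayed(stamp, "utf-8", "koi8-r")

# LC_TIME=ru_RU.KOI8-R on a UTF-8 terminal: the KOI8-R bytes are not valid
# UTF-8, so each one is replaced.
koi8_on_utf8 = as_displayed(stamp, "koi8-r", "utf-8")
```

   Printing utf8_on_koi8 and koi8_on_utf8 should give the same two shapes of garbage as the transcripts above; the ASCII digits survive in both because ASCII is a subset of both encodings.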

> Victor

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

