[Python-Dev] Python3 "complexity"

Nick Coghlan ncoghlan at gmail.com
Fri Jan 10 16:35:38 CET 2014


On 10 January 2014 13:32, Lennart Regebro <regebro at gmail.com> wrote:
> On Thu, Jan 9, 2014 at 10:06 AM, Kristján Valur Jónsson
> <kristjan at ccpgames.com> wrote:
>> Do I speak Chinese to my grocer because China is a growing force in the world?  Or start every discussion with my children with a negotiation on what language to use?
>
> No, because your environment has a default language. And Python has a
> default encoding. You only get problems when some file doesn't use the
> default encoding.

Putting this here because I found out today it's not in any of the
PEPs and folks have to go digging in mailing list archives to find it.
I'll add it to my Python 3 Q&A at some point.

The reason Python 3 currently tries to rely on the POSIX locale
encoding is that during the Python 3 development process it was
pointed out that ShiftJIS, ISO-2022 and various CJK codecs are in
widespread use in Asia, since Asian users needed solutions to the
problem of representing kana, ideographs and other non-Latin
characters long before the Unicode Consortium existed.

This creates a problem for Python 3, as assuming utf-8 means we have a
high risk of corrupting users' data, at least in Asian locales, as well
as anywhere else where non-UTF-8 encodings are common (especially when
encodings that aren't ASCII compatible are involved).
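
To make the risk concrete, here is a minimal sketch (the sample string
is arbitrary and chosen only for illustration) of what happens when
bytes in two common Japanese encodings are decoded as if they were
UTF-8. ISO-2022-JP is a 7-bit encoding, so the wrong decode succeeds
and quietly hands back garbage, while Shift JIS happens to fail loudly
for this particular input:

```python
text = "日本語"  # arbitrary sample text, not taken from any real data

# ISO-2022-JP uses only 7-bit bytes plus escape sequences, so every
# byte is also valid ASCII (and therefore valid UTF-8).
iso_bytes = text.encode("iso2022_jp")
wrongly_decoded = iso_bytes.decode("utf-8")  # no exception is raised
print(wrongly_decoded == text)  # False: the text was silently misread

# Shift JIS uses high bytes; for this sample they are not valid UTF-8,
# so the strict decoder raises instead of returning wrong text.
sjis_bytes = text.encode("shift_jis")
try:
    sjis_bytes.decode("utf-8")
    sjis_failed = False
except UnicodeDecodeError:
    sjis_failed = True
print(sjis_failed)  # True
```

The first case is the dangerous one: nothing signals that the decoded
string is wrong, so it can be written back out and the original data
is lost.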

While the Python 3 status quo on POSIX systems certainly isn't ideal,
it at least means our most likely failure mode is an exception rather
than silent data corruption. One of the major culprits for those
exceptions is the antiquated POSIX/C locale, which reports ASCII as
the system encoding.
One idea we're considering for Python 3.5 is to have a report of
"ascii" on a POSIX OS imply the surrogateescape error handler (at
least for the standard streams, and perhaps in other contexts), since
the OS reporting the POSIX/C locale almost certainly indicates a
configuration error rather than intentional behaviour.
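
For anyone unfamiliar with surrogateescape, the following is a rough
sketch of what the error handler does at the codec level. The byte
string is a made-up example standing in for data that arrives from the
OS, and this only illustrates the handler itself, not how any future
Python version would wire it into the standard streams:

```python
# Hypothetical input: UTF-8 bytes arriving while the locale claims ASCII.
raw = "日本語.txt".encode("utf-8")

# Strict ASCII decoding rejects the non-ASCII bytes outright.
try:
    raw.decode("ascii")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate code
# point (U+DC80..U+DCFF) instead of raising.
smuggled = raw.decode("ascii", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = smuggled.encode("ascii", errors="surrogateescape")
assert round_tripped == raw
```

The decoded string is not meaningful text, but the original bytes
survive a round trip unchanged, which is the property that matters
when the reported encoding is wrong.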

Cheers,
Nick.




-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

