[Python-ideas] PEP 540: Add a new UTF-8 mode

Oleg Broytman phd at phdru.name
Thu Jan 5 12:16:38 EST 2017


Hi!

On Thu, Jan 05, 2017 at 04:38:22PM +0100, Victor Stinner <victor.stinner at gmail.com> wrote:
> Always use UTF-8
> ----------------
> 
> Python already always uses the UTF-8 encoding on Mac OS X, Android and Windows.
> Since UTF-8 became the de facto encoding, it makes sense to always use it on all
> platforms with any locale.

   Please don't! I use different locales and encodings; sometimes it's
utf-8, sometimes not. But I have properly configured LC_* settings, and
I prefer Python to follow my command. It'd be disgusting if Python
started to bend me to its preferences.
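
   As an illustration, here is a minimal sketch of how to check which
encodings Python currently derives from the LC_* settings. The helper
name locale_encodings is hypothetical and chosen only for this example;
the calls themselves come from the standard locale and sys modules.

```python
import locale
import sys


def locale_encodings():
    """Return the encodings Python picked up from the environment.

    Hypothetical helper for illustration only.
    """
    # Adopt the user's LC_* settings instead of the default "C" locale.
    try:
        locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        # The configured locale is not installed; keep the current one.
        pass
    return {
        # Encoding used by default for text files and the standard streams.
        "preferred": locale.getpreferredencoding(False),
        # Encoding used to decode file names and command-line arguments.
        "filesystem": sys.getfilesystemencoding(),
    }


print(locale_encodings())
```

   With a koi8-r locale both values would be expected to name koi8-r;
under an always-UTF-8 policy they would name utf-8 regardless of LC_*.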

> The risk is to introduce mojibake if the locale uses a different encoding,
> especially for locales other than the POSIX locale.

   There is no such risk for me, as I already have mojibake on my
systems. The two most notable sources of mojibake are:

1) FTP servers - people create files (both names and content) in
   different encodings; w32 FTP clients usually send file names and
   content in cp1251 (Russian Windows encoding), sometimes in cp866
   (Russian Windows OEM encoding).

2) MP3 tags and playlists - almost always cp1251.

   So whatever my personal encoding is - koi8-r or utf-8 - I have to
deal with file names and content in different encodings.
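
   To make the problem concrete, here is a minimal sketch of what happens
to a cp1251 file name when it is decoded with another encoding. The file
name is made up for the example, and the surrogateescape round trip at
the end is shown only as one way to carry such bytes through a program
without losing them, not as a fix for the mojibake itself.

```python
# A made-up file name, encoded the way a cp1251 client would send it.
name = "Привет.mp3"
raw = name.encode("cp1251")

# koi8-r assigns a character to every byte value, so decoding with the
# wrong single-byte encoding never fails; it silently yields mojibake.
wrong = raw.decode("koi8-r")
print(wrong == name)  # the decoded text differs from the original

# utf-8 is stricter: these particular bytes are not valid utf-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# The surrogateescape error handler maps undecodable bytes to lone
# surrogates, so the original bytes can be restored on encoding.
escaped = raw.decode("utf-8", "surrogateescape")
restored = escaped.encode("utf-8", "surrogateescape")

# Only decoding with the encoding the sender really used recovers the name.
recovered = raw.decode("cp1251")
```

   The bytes alone do not say which encoding they are in; that knowledge
has to come from outside, whatever the locale encoding is.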

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            phd at phdru.name
           Programmers don't die, they just GOSUB without RETURN.

