[Python-Dev] open(): set the default encoding to 'utf-8' in Python 3.3?

Terry Reedy tjreedy at udel.edu
Tue Jun 28 19:06:44 CEST 2011


On 6/28/2011 10:46 AM, Paul Moore wrote:

> I use Windows, and come from the UK, so 99% of my text files are
> ASCII. So the majority of my code will be unaffected. But in the
> occasional situation where I use a £ sign, I'll get encoding errors,

I do not understand this. With utf-8 you would never get a string 
encoding error.

> where currently things will "just work".

As long as you only use the machine-dependent restricted character set.
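
A minimal sketch of that restriction, assuming cp1252 stands in for the Windows "ANSI" code page (an assumption; the actual code page depends on the machine). The helper name can_encode and the sample characters are hypothetical, chosen only for illustration.

```python
def can_encode(text, encoding):
    """Return True if text can be encoded with the named codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

pound = "\u00a3"   # POUND SIGN
omega = "\u03a9"   # GREEK CAPITAL LETTER OMEGA

# Report, per codec, whether each sample character is representable.
for codec in ("ascii", "cp1252", "utf-8"):
    print(codec, can_encode(pound, codec), can_encode(omega, codec))
```

The pound sign fits in cp1252 but the omega does not, while utf-8 accepts both.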

> And the failures will be data dependent, and hence intermittent
> (the worst type of problem).

That is the situation now, with platform/machine dependencies added in.
Some people share code with other machines, even locally.
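
As a rough illustration of the machine dependency, the sketch below prints the locale encoding that open() uses when no encoding is given (in the Python 3 versions under discussion), then writes and reads a file with an explicit encoding so the result does not depend on the machine. The file name and sample text are hypothetical.

```python
import locale
import os
import tempfile

# Varies from machine to machine, e.g. 'cp1252' on one and 'UTF-8' on another.
print(locale.getpreferredencoding(False))

text = "caf\u00e9 costs \u00a35"
path = os.path.join(tempfile.mkdtemp(), "demo.txt")  # hypothetical file name

# Passing encoding= explicitly makes the bytes on disk the same everywhere.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)

with open(path, encoding="utf-8") as f:
    round_trip = f.read()

with open(path, "rb") as f:
    raw = f.read()

print(round_trip == text, raw)
```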

> So, in effect, you propose making the default favour writing
> multiplatform portable code at the expense of quick and dirty scripts?

Let us frame it another way. Should Python installations be compatible 
with other Python installations, or with the other apps on the same 
machine? Part of the purpose of Python is to cover up platform 
differences, to the extent possible (and perhaps sensible -- there is 
the argument). This was part of the purpose of writing our own io module 
instead of using the compiler stdlib. The evolution of floating point 
math has gone in the same direction. For instance, float now expects 
uniform platform-independent Python-dependent names for infinity and nan 
instead of compiler-dependent names.
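
A short sketch of the float names mentioned above; the variable names are only for illustration.

```python
import math

# float() accepts the same spellings on every platform, case-insensitively.
pos_inf = float("inf")
neg_inf = float("-Infinity")
not_a_number = float("NaN")

# The repr is uniform too, whatever the C library would have printed.
print(repr(pos_inf), repr(neg_inf), repr(not_a_number))

print(math.isinf(pos_inf), math.isnan(not_a_number))
```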

As for practicality: Notepad++ on Windows offers ANSI, utf-8 (with or
without BOM), and utf-16 (big or little endian). I believe that ODF
documents are utf-8 encoded xml (compressed or not). My original claim
for this proposal was, and is, that even Windows apps are moving to
utf-8 and that someday making that the default for Python everywhere
will be the obvious and sensible thing.
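
For reference, a small sketch of how the editor options listed above map onto Python codec names. The mapping to 'utf-8-sig', 'utf-16-le' and 'utf-16-be' is my reading of those options, not something any editor's documentation is quoted for here.

```python
import codecs

text = "\u00a3100"

plain = text.encode("utf-8")         # utf-8 without BOM
with_bom = text.encode("utf-8-sig")  # utf-8 with BOM prepended

print(with_bom == codecs.BOM_UTF8 + plain)

# 'utf-8-sig' strips a leading BOM if present and also accepts input without one.
decoded_with_bom = with_bom.decode("utf-8-sig")
decoded_plain = plain.decode("utf-8-sig")

# The explicit-endian utf-16 codecs write no BOM; only the byte order differs.
little = "A".encode("utf-16-le")
big = "A".encode("utf-16-be")
print(little, big)
```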

-- 
Terry Jan Reedy



