[Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

Greg Ewing greg.ewing at canterbury.ac.nz
Fri Dec 8 00:20:49 EST 2017


Victor Stinner wrote:
> Users don't use stdin and
> stdout as regular files, they are more used as pipes to pass data
> between programs with the Unix pipe in a shell like "producer |
> consumer". Sometimes stdout is redirected to a file, but I consider
> that it is expected to behave as a pipe and the regular TTY stdout.

It seems weird to me to make a distinction between stdin/stdout
connected to a file and accessing the file some other way.

It would be surprising, for example, if the following two
commands behaved differently with respect to encoding:

    cat foo | sort

    cat < foo | sort
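
As a rough illustration of what "behaving differently" could mean here, this
minimal Python sketch decodes the same bytes with the two error handlers under
discussion. The sample bytes are an assumption chosen only to be invalid UTF-8.

```python
# Sample input (an assumption for illustration): Latin-1 bytes, not valid UTF-8.
raw = b"caf\xe9\n"

# surrogateescape maps each undecodable byte to a lone surrogate,
# so the original bytes can be recovered when encoding again.
text = raw.decode("utf-8", errors="surrogateescape")
round_tripped = text.encode("utf-8", errors="surrogateescape")

# strict refuses the same input outright.
try:
    raw.decode("utf-8", errors="strict")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

If stdin used the first handler and open() the second, a program fed the same
file in two different ways would pass the data through in one case and raise
in the other.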

> But Naoki explained that open() is commonly misused to open binary
> files and Python should somehow fail badly to notify the developer of
> their mistake.

Maybe if you *explicitly* open the file in text mode it
should default to surrogateescape, but use strict if text
mode is only in effect because it is the default?

I.e.

    open("foo", "rt") --> surrogateescape
    open("foo")       --> strict

That way you can easily open a file in a way that's
compatible with the way stdin/stdout behave, but you
will get bitten if you mistakenly open a binary file
as text.
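
A minimal sketch of that rule, written as a wrapper rather than as a change to
open() itself. The name open_proposed is hypothetical, the UTF-8 encoding is
assumed because the thread is about UTF-8 mode, and none of this reflects how
the built-in open() actually behaves.

```python
import os
import tempfile


def open_proposed(path, mode="r", encoding="utf-8"):
    """Hypothetical helper emulating the suggested rule (not real open() behaviour)."""
    if "b" in mode:
        # Binary mode: no decoding, so no error handler applies.
        return open(path, mode)
    # An explicit "t" selects surrogateescape; implicit text mode stays strict.
    errors = "surrogateescape" if "t" in mode else "strict"
    return open(path, mode, encoding=encoding, errors=errors)


# Demonstration with a throwaway file holding bytes that are not valid UTF-8.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff\xfe not text")
    path = tmp.name

try:
    # Explicit text mode: undecodable bytes are smuggled through as surrogates.
    with open_proposed(path, "rt") as f:
        explicit = f.read()

    # Implicit text mode: the mistake is reported as a decoding error.
    try:
        with open_proposed(path) as f:
            f.read()
        implicit_failed = False
    except UnicodeDecodeError:
        implicit_failed = True
finally:
    os.remove(path)
```

The choice to key off the literal "t" in the mode string is just the simplest
way to tell "explicitly text" from "text by default" in a sketch like this.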

-- 
Greg


