[Python-Dev] PEP 540: Add a new UTF-8 mode (v2)

Guido van Rossum guido at python.org
Thu Dec 7 18:26:52 EST 2017

On Thu, Dec 7, 2017 at 3:02 PM, Victor Stinner <victor.stinner at gmail.com> wrote:

> 2017-12-06 5:07 GMT+01:00 INADA Naoki <songofacandy at gmail.com>:
> > And opening binary file without "b" option is very common mistake of new
> > developers.  If default error handler is surrogateescape, they lose a
> > chance to notice their bug.
> To come back to your original point, I didn't know that it was a
> common mistake to open binary files in text mode.

It probably is, because in Python 2 it makes no difference on UNIX, and on
Windows the only difference is that binary mode preserves \r.

> Honestly, I didn't try recently. How does Python behave when you do that?
> Is it possible to write a full binary parser using the text mode? You
> should quickly get issues pointing you to your mistake, no?

You will quickly get decoding errors, and that is INADA's point. (Unless
you use encoding='Latin-1'.) His worry is that the surrogateescape error
handler makes it so that you won't get decoding errors, and then the
failure mode is much harder to debug.
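
A minimal sketch of the difference, assuming a made-up byte string that
stands in for the contents of a binary file (the `data` value and variable
names are illustrative, not taken from the thread). It decodes the bytes
directly instead of opening a file, on the assumption that text-mode open()
applies the same error handlers:

```python
# Hypothetical binary payload: contains bytes that are not valid UTF-8.
data = b"\x89PNG\r\n\x1a\n\xff\xfe"

# Strict decoding (the usual default) fails immediately, which points
# the developer at the mistake.
try:
    data.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With surrogateescape, decoding succeeds: each undecodable byte becomes
# a lone surrogate code point (U+DC80..U+DCFF).
escaped = data.decode("utf-8", "surrogateescape")

# The original bytes can be recovered with the same handler...
round_trip = escaped.encode("utf-8", "surrogateescape")

# ...but a strict encode fails, possibly far from where the data was read.
try:
    escaped.encode("utf-8")
    late_failure = False
except UnicodeEncodeError:
    late_failure = True

# Latin-1 maps every byte to a code point, so it never raises either.
latin1 = data.decode("latin-1")

print(strict_failed, late_failure, round_trip == data)
```

In this sketch the strict decode raises at once, while the surrogateescape
version only fails later, at the strict encode step.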

--Guido van Rossum (python.org/~guido)