[Python-Dev] PEP 540: Add a new UTF-8 mode (v2)
Glenn Linderman
v+python at g.nevcal.com
Thu Dec 7 20:15:45 EST 2017
On 12/7/2017 4:48 PM, Victor Stinner wrote:
>
> Ok, now comes the real question, open().
>
> For open(), I used the example of a code snippet *writing* the content
> of a directory (os.listdir) into a text file. Another example is to
> read filenames from a text file but pass through undecodable bytes
> thanks to surrogateescape.
>
> But Naoki explained that open() is commonly misused to open binary
> files and Python should somehow fail badly to notify the developer of
> their mistake.
So the real problem here is that open() has a default mode of text.
Instead of forcing the user to specify either "text" or "binary" when
opening, text is the default, and binary is an option that must be
specified. I understand that this default has a long history in
Unix-land, dating at least as far back as 1977, when I first learned how
to use the Unix open() function.
And now it would be an incompatible change to change it.
The real question is whether or not it is a good idea to change it. At
this point in time, with Unicode and UTF-8 so prevalent, text and binary
modes differ far more than they did back in 1977, when the mode mostly
just documented that a binary file was being opened, and that one could
more likely expect to see read() than fgets() in the following code.
If it were to be changed, one could add a text-mode option in 3.7, say
"t" in the mode string, and a PendingDeprecationWarning for open calls
without the specification of either t or b in the mode string.
In 3.8, the warning would be changed to DeprecationWarning.
In 3.9, all open calls would need to have either t or b, or would fail.
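
As a rough illustration of the first step of that transition, here is a
minimal sketch of a wrapper that warns when the mode string names neither
"t" nor "b". The name strict_open and the warning text are hypothetical,
not part of any proposal or of the standard library. Note that current
Python 3 already accepts "t" in the mode string, so only the warning is
new here.

```python
import warnings


def strict_open(file, mode="r", *args, **kwargs):
    """Hypothetical wrapper: warn if mode names neither 't' nor 'b'."""
    if "t" not in mode and "b" not in mode:
        warnings.warn(
            "open() called without 't' or 'b' in the mode string; "
            "assuming text mode",
            PendingDeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller
        )
    # Delegate to the built-in open() unchanged.
    return open(file, mode, *args, **kwargs)
```

Under the schedule above, the warning category would become
DeprecationWarning in the following release, and the final step would
replace the warning with an exception.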
Meanwhile, back on the PEP 540 ranch, text mode open calls could
immediately use surrogateescape, binary mode open calls would not, and
unspecified open calls would not.
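
For readers who have not used surrogateescape, here is a small sketch of
what an explicit text-mode open with that error handler does today. The
filename and the temporary file are made up for the example. It mirrors
Victor's case of writing a filename with undecodable bytes into a text
file and reading it back.

```python
import os
import tempfile

# A filename as raw bytes; 0xff is not valid UTF-8.
raw_name = b"report-\xff.txt"

# surrogateescape maps the undecodable byte to a lone surrogate (U+DCFF).
name = raw_name.decode("utf-8", errors="surrogateescape")

with tempfile.TemporaryDirectory() as tmp:
    listing = os.path.join(tmp, "listing.txt")

    # Explicit text mode with surrogateescape: the original byte is
    # written back out unchanged. newline="" disables newline translation.
    with open(listing, "wt", encoding="utf-8",
              errors="surrogateescape", newline="") as f:
        f.write(name + "\n")

    with open(listing, "rb") as f:
        on_disk = f.read()

    with open(listing, "rt", encoding="utf-8",
              errors="surrogateescape", newline="") as f:
        round_tripped = f.read().rstrip("\n")

# Without surrogateescape, the same string cannot be encoded at all.
strict_fails = False
try:
    name.encode("utf-8")
except UnicodeEncodeError:
    strict_fails = True
```

The bytes on disk match the original filename, while the strict encoder
rejects the string, which is the "fail badly" behavior that unspecified
and binary-mode opens would keep.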