[Python-checkins] gh-85679: Recommend `encoding="utf-8"` in tutorial (GH-91778)

miss-islington webhook-mailer at python.org
Mon May 2 04:45:19 EDT 2022


https://github.com/python/cpython/commit/a7d3de7fe650d393d63039cb0ed7c1320d050345
commit: a7d3de7fe650d393d63039cb0ed7c1320d050345
branch: 3.10
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: miss-islington <31488909+miss-islington at users.noreply.github.com>
date: 2022-05-02T01:45:10-07:00
summary:

gh-85679: Recommend `encoding="utf-8"` in tutorial (GH-91778)

(cherry picked from commit 614420df9796c8a4f01e24052fc0128b4c20c5bf)

Co-authored-by: Inada Naoki <songofacandy at gmail.com>

files:
M Doc/tutorial/inputoutput.rst

diff --git a/Doc/tutorial/inputoutput.rst b/Doc/tutorial/inputoutput.rst
index 7f83c4d4612eb..b50063654e262 100644
--- a/Doc/tutorial/inputoutput.rst
+++ b/Doc/tutorial/inputoutput.rst
@@ -279,11 +279,12 @@ Reading and Writing Files
    object: file
 
 :func:`open` returns a :term:`file object`, and is most commonly used with
-two arguments: ``open(filename, mode)``.
+two positional arguments and one keyword argument:
+``open(filename, mode, encoding=None)``
 
 ::
 
-   >>> f = open('workfile', 'w')
+   >>> f = open('workfile', 'w', encoding="utf-8")
 
 .. XXX str(f) is <io.TextIOWrapper object at 0x82e8dc4>
 
@@ -300,11 +301,14 @@ writing. The *mode* argument is optional; ``'r'`` will be assumed if it's
 omitted.
 
 Normally, files are opened in :dfn:`text mode`, that means, you read and write
-strings from and to the file, which are encoded in a specific encoding. If
-encoding is not specified, the default is platform dependent (see
-:func:`open`). ``'b'`` appended to the mode opens the file in
-:dfn:`binary mode`: now the data is read and written in the form of bytes
-objects.  This mode should be used for all files that don't contain text.
+strings from and to the file, which are encoded in a specific *encoding*.
+If *encoding* is not specified, the default is platform dependent
+(see :func:`open`).
+Because UTF-8 is the modern de-facto standard, ``encoding="utf-8"`` is
+recommended unless you know that you need to use a different encoding.
+Appending a ``'b'`` to the mode opens the file in :dfn:`binary mode`.
+Binary mode data is read and written as :class:`bytes` objects.
+You cannot specify *encoding* when opening a file in binary mode.
 
 In text mode, the default when reading is to convert platform-specific line
 endings (``\n`` on Unix, ``\r\n`` on Windows) to just ``\n``.  When writing in
@@ -320,7 +324,7 @@ after its suite finishes, even if an exception is raised at some
 point.  Using :keyword:`!with` is also much shorter than writing
 equivalent :keyword:`try`\ -\ :keyword:`finally` blocks::
 
-    >>> with open('workfile') as f:
+    >>> with open('workfile', encoding="utf-8") as f:
     ...     read_data = f.read()
 
     >>> # We can check that the file has been automatically closed.
@@ -490,11 +494,15 @@ simply serializes the object to a :term:`text file`.  So if ``f`` is a
 
    json.dump(x, f)
 
-To decode the object again, if ``f`` is a :term:`text file` object which has
-been opened for reading::
+To decode the object again, if ``f`` is a :term:`binary file` or
+:term:`text file` object which has been opened for reading::
 
    x = json.load(f)
 
+.. note::
+   JSON files must be encoded in UTF-8. Use ``encoding="utf-8"`` when opening
+   a JSON file as a :term:`text file` for both reading and writing.
+
 This simple serialization technique can handle lists and dictionaries, but
 serializing arbitrary class instances in JSON requires a bit of extra effort.
 The reference for the :mod:`json` module contains an explanation of this.
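
Illustration (not part of the commit): the behaviour described in the changed
tutorial text can be sketched as follows. This is a minimal example under
stated assumptions, not code from the patch. The scratch directory, the file
name ``workfile.json`` and the sample ``data`` dictionary are hypothetical and
chosen only for demonstration.

```python
import json
import os
import tempfile

# Hypothetical scratch location; the file name is illustrative only.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "workfile.json")

# Sample data containing a non-ASCII character, so the encoding matters.
data = {"greeting": "h\u00e9llo", "values": [1, 2, 3]}

# Text mode: pass encoding="utf-8" explicitly instead of relying on the
# platform-dependent default.
with open(path, "w", encoding="utf-8") as f:
    json.dump(data, f, ensure_ascii=False)

# Reading back in text mode with the same explicit encoding.
with open(path, encoding="utf-8") as f:
    from_text = json.load(f)

# json.load() also accepts a binary file object opened for reading.
with open(path, "rb") as f:
    from_binary = json.load(f)

# Binary mode does not accept an encoding argument; open() raises ValueError.
try:
    open(path, "rb", encoding="utf-8")
except ValueError as exc:
    binary_error = str(exc)
else:
    binary_error = None
```

Both reads should yield a dictionary equal to ``data``, and ``binary_error``
should hold the message from the ``ValueError`` raised by the last ``open()``
call.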


