[issue10335] tokenize.open_python(): open a Python file with the right encoding

Mon Nov 8 16:28:16 CET 2010

Alexander Belopolsky <belopolsky at users.sourceforge.net> added the comment:

On Sun, Nov 7, 2010 at 4:24 AM, STINNER Victor <report at bugs.python.org> wrote:
..
> Ok, the new patch (tokenize_open-2.patch) uses tokenize.open() name and adds a
> test for BOM without coding cookie (test utf-8-sig encoding).

Here are my comments on the new patch:

1. A nit-pick: tokenize.py is already inconsistent in this respect,
but I believe it is conventional not to start a docsting with an empty
line:

+def open(filename):
+    """Open a Python script in read mode with the right encoding.

instead of

+def open(filename):
+    """
+    Open a Python script in read mode with the right encoding.

Also, I would prefer a more neutral description that will not imply
that the method is only useful for Python source code.  For example:

"""Detect encoding and open a file in read mode with the right encoding.

(optionally a more detailed explanation or a reference to detect_encoding.)
"""

2. Another nit-pick: detect_encoding() returns a list of 0-2 lines in
the second item, not a single line, so it is better to write

+    encoding, lines = detect_encoding(buffer.readline)

or even

+    encoding, _ = detect_encoding(buffer.readline)

instead of

+    encoding, line = detect_encoding(buffer.readline)

3. In test_open(self), I think you should use self.assertEqual()
instead of a plain assert.

4. Last nit-pick (feel free to ignore as prejudice to %-formatting :-):

instead of

+                print("# coding: %s" % encoding, file=fp)

you can write

+                print("# coding: " + encoding, file=fp)

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue10335>
_______________________________________