[issue10335] tokenize.open_python(): open a Python file with the right encoding
report at bugs.python.org
Sat Nov 6 11:49:39 CET 2010
New submission from STINNER Victor <victor.stinner at haypocalc.com>:
In Python3, the following pattern becomes common:
    with open(fullname, 'rb') as fp:
        coding, line = tokenize.detect_encoding(fp.readline)
    with open(fullname, 'r', encoding=coding) as fp:
        ...
The file is opened twice, which is unnecessary: it's possible to reuse the raw buffer to create a text file. I also don't like the detect_encoding() API: passing the readline function is not intuitive.
I propose creating a tokenize.open_python() function with a very simple API: just one argument, the filename. This function calls detect_encoding() and only opens the file once.
The attached patch adds the function with a unit test and a documentation patch. It also patches the functions currently using detect_encoding().
open_python() only supports read mode. I suppose that it is enough.
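For illustration, here is a minimal sketch of what such a function could look like. It is not the code from the attached patch: the body is an assumption based only on the description above (open the file once in binary mode, detect the encoding, rewind, and wrap the same buffer in a text layer).

```python
import io
import tokenize


def open_python(filename):
    """Hypothetical sketch: open a Python source file for reading as text,
    decoded with the encoding found by tokenize.detect_encoding()."""
    buffer = open(filename, 'rb')
    try:
        # detect_encoding() looks for a BOM or a coding cookie in the
        # first lines of the file and falls back to UTF-8.
        encoding, _lines = tokenize.detect_encoding(buffer.readline)
        # Rewind so the text wrapper starts from the beginning of the file.
        buffer.seek(0)
        # Reuse the already opened binary buffer instead of reopening the file.
        return io.TextIOWrapper(buffer, encoding=encoding)
    except BaseException:
        # Do not leak the file object if detection or wrapping fails.
        buffer.close()
        raise
```

With this sketch, the two-step pattern above reduces to:

```python
with open_python(fullname) as fp:
    source = fp.read()
```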
components: Library (Lib), Unicode
title: tokenize.open_python(): open a Python file with the right encoding
versions: Python 3.2
Added file: http://bugs.python.org/file19518/open_python.patch
Python tracker <report at bugs.python.org>