[New-bugs-announce] [issue39206] Modulefinder does not consider source file encoding
Nicholas Feix
report at bugs.python.org
Fri Jan 3 16:33:55 EST 2020
New submission from Nicholas Feix <snip3rnick at googlemail.com>:
The modulefinder._find_module(...) function returns file objects text mode for source modules using the system encoding.
ModuleFinder.load_module(...) can run into decoding issues when the source file encoding does not match the system default.
The prior implementation imp.find_module(...) detected the encoding correctly using the tokenize.detect_encoding(...) function.
With the following code segment the detection would work again with UTF-8 BOM and PEP 263 type cookies.
encoding = None
if 'b' not in mode:
with open(file_path, 'rb') as file:
encoding = tokenize.detect_encoding(file.readline)[0]
file = open(file_path, mode, encoding=encoding)
----------
components: Library (Lib)
messages: 359259
nosy: Nicholas Feix
priority: normal
severity: normal
status: open
title: Modulefinder does not consider source file encoding
type: behavior
versions: Python 3.8, Python 3.9
_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39206>
_______________________________________
More information about the New-bugs-announce
mailing list