[New-bugs-announce] [issue14990] detect_encoding should fail with SyntaxError on invalid encoding

Florent Xicluna report at bugs.python.org
Sun Jun 3 12:29:02 CEST 2012

New submission from Florent Xicluna <florent.xicluna at gmail.com>:

I've hit this issue while playing with tokenize for the pep8.py module.

tokenize.detect_encoding() should raise SyntaxError when the encoding declaration is invalid.

However, in some cases a LookupError is raised instead.

$ ./python -m tokenize Lib/test/bad_coding2.py 
unexpected error: unknown encoding: utf8-sig
Traceback (most recent call last):
  File "./Lib/runpy.py", line 162, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "./Lib/runpy.py", line 75, in _run_code
    exec(code, run_globals)
  File "./Lib/tokenize.py", line 686, in <module>
  File "./Lib/tokenize.py", line 656, in main
    tokens = list(tokenize(f.readline))
  File "./Lib/tokenize.py", line 489, in _tokenize
    line = line.decode(encoding)
LookupError: unknown encoding: utf8-sig
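
A minimal sketch of the behaviour requested above, not a patch for tokenize itself. The wrapper name detect_encoding_strict is hypothetical; it only relies on tokenize.detect_encoding() and codecs.lookup() from the standard library, and turns any LookupError into a SyntaxError:

```python
import codecs
import io
import tokenize


def detect_encoding_strict(readline):
    """Hypothetical helper: return a usable encoding name or raise SyntaxError."""
    try:
        encoding, _lines = tokenize.detect_encoding(readline)
        # Check that the reported name can really be used for decoding.
        codecs.lookup(encoding)
    except LookupError as exc:
        # Report a bad declaration the same way the compiler would.
        raise SyntaxError(str(exc)) from exc
    return encoding


# An invalid declaration surfaces as SyntaxError, never as LookupError.
bad = io.BytesIO(b"# -*- coding: no-such-codec -*-\nx = 1\n")
try:
    detect_encoding_strict(bad.readline)
except SyntaxError as exc:
    print("SyntaxError:", exc)
```

With a wrapper like this, a caller such as pep8.py would only need to handle SyntaxError when reading a source file's encoding declaration.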

components: Library (Lib)
messages: 162205
nosy: flox
priority: normal
severity: normal
stage: needs patch
status: open
title: detect_encoding should fail with SyntaxError on invalid encoding
type: behavior
versions: Python 3.1, Python 3.2, Python 3.3

Python tracker <report at bugs.python.org>
