[issue28923] Nonexisting encoding specified in Tix.py

Terry J. Reedy report at bugs.python.org
Tue Dec 13 14:05:06 EST 2016


Terry J. Reedy added the comment:

I reread
https://docs.python.org/27/reference/lexical_analysis.html#encoding-declarations
Either the first or the second line must be a comment matching the regular expression "coding[=:]\s*([-\w.]+)" (which IDLE also uses), and the captured name "must be recognized by Python".
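
A minimal sketch of how a tool might apply that pattern to the first two lines of a file. The helper name declared_encoding is hypothetical, and the sketch is simplified: it only checks that each candidate line is a comment, and does not reproduce every rule the real tokenizer applies.

```python
import re

# Pattern quoted from the language reference.
CODING_RE = re.compile(r"coding[=:]\s*([-\w.]+)")

def declared_encoding(source_lines):
    """Return the encoding name declared in the first two lines, or None.

    Hypothetical helper: only comment lines are considered, and only
    the first two lines of the source are examined.
    """
    for line in source_lines[:2]:
        stripped = line.lstrip()
        if stripped.startswith("#"):
            match = CODING_RE.search(stripped)
            if match:
                return match.group(1)
    return None

if __name__ == "__main__":
    print(declared_encoding(["# coding: iso-latin-1-unix\n", "x = 1\n"]))
```

Note that this only extracts the name; whether the name "is recognized by Python" is a separate question.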

I also did some experiments.  Apparently, "iso-latin-1-unix" is recognized by Python.  On Windows, from an IDLE editor,
  # coding: iso-latin-1-unix
runs, while 
  # coding: xiso-latin-1-unix
raises, during the compile(..., 'file', 'exec') call:
  SyntaxError: unknown encoding: xiso-latin-1-unix
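
The experiment can be repeated outside IDLE with something like the following sketch. It assumes Python 3, where compile() honors the coding line only when the source is passed as bytes; the helper name compiles_with_cookie is hypothetical.

```python
def compiles_with_cookie(cookie_name):
    """Return True if a source with the given coding line compiles.

    Hypothetical helper. The source is passed as bytes so that
    compile() processes the encoding declaration itself.
    """
    source = ("# coding: %s\n" % cookie_name).encode("ascii")
    try:
        compile(source, "file", "exec")
    except SyntaxError:
        return False
    return True

if __name__ == "__main__":
    for name in ("iso-latin-1-unix", "xiso-latin-1-unix"):
        print(name, compiles_with_cookie(name))
```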

Since codecs.lookup() raises the same error for both names, for example:
  LookupError: unknown encoding: iso-latin-1-unix
compile() must be doing something other than simply calling codecs.lookup().  I suspect it somehow recognizes 'iso', 'latin-1', and 'unix' as valid chunks of an encoding name.  (The last might even be an obsolete legacy item.)  Whatever it is, it is not obviously available to tools written in Python.
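
One possible explanation, offered as a sketch rather than a statement of what compile() does in every version: CPython's tokenizer appears to normalize the declared name before looking it up, treating names that begin with "latin-1-", "iso-8859-1-", or "iso-latin-1-" (for example, with an Emacs-style "-unix" suffix) as iso-8859-1. The function below is modeled on that behavior; the name normalize_cookie_name is hypothetical.

```python
import codecs

def normalize_cookie_name(name):
    """Approximate the name normalization applied to coding lines.

    Hypothetical sketch: only the first 12 characters are examined,
    case is ignored, and underscores are treated as hyphens.
    """
    enc = name[:12].lower().replace("_", "-")
    if enc == "utf-8" or enc.startswith("utf-8-"):
        return "utf-8"
    if enc in ("latin-1", "iso-8859-1", "iso-latin-1") or enc.startswith(
        ("latin-1-", "iso-8859-1-", "iso-latin-1-")
    ):
        return "iso-8859-1"
    return name

def is_available(name):
    """Return True if codecs.lookup() finds a codec for the name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True

if __name__ == "__main__":
    for raw in ("iso-latin-1-unix", "xiso-latin-1-unix"):
        normalized = normalize_cookie_name(raw)
        print(raw, is_available(raw), normalized, is_available(normalized))
```

Under this assumption, the raw name fails codecs.lookup() in both cases, but only the first one normalizes to a name that the lookup accepts.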

Note that 'recognized as a legitimate encoding name' and 'available on a particular installation' are different concepts. I believe codecs.lookup implements the latter.

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue28923>
_______________________________________

