[Python-Dev] What does a double coding cookie mean?

Glenn Linderman v+python at g.nevcal.com
Sat Mar 19 13:36:45 EDT 2016


On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:
> On 16.03.16 08:03, Serhiy Storchaka wrote:
>> On 15.03.16 22:30, Guido van Rossum wrote:
>>> I came across a file that had two different coding cookies -- one on
>>> the first line and one on the second. CPython uses the first, but mypy
>>> happens to use the second. I couldn't find anything in the spec or
>>> docs ruling out the second interpretation. Does anyone have a
>>> suggestion (apart from following CPython)?
>>>
>>> Reference: https://github.com/python/mypy/issues/1281
>>
>> There is a similar question. If a file has two different coding cookies on
>> the same line, which should win? Currently the last cookie wins in the
>> CPython parser, in the tokenize module, in IDLE, and in a number of other
>> places. I think this is a bug.
>
> I just tested with Emacs, and it looks like when different codings are
> specified on two different lines, the first coding wins, but when
> different codings are specified on the same line, the last coding wins.
>
> Therefore the current CPython behavior can be correct, and the regular
> expression in PEP 263 should be changed to use greedy repetition.

Just because emacs works that way (and even though I'm an emacs user), 
that doesn't mean CPython should act like emacs.

(1) CPython should not necessarily act like emacs unless its coding
syntax exactly matched emacs syntax. Instead, CPython interprets a
generic coding pattern that matches the emacs form, the vim form, and
other similar text that both emacs and vim would ignore.
(1a) Maybe if a similar test were run on vim with its syntax, and it 
also works the same way, then one might think it is a trend worth 
following, but it is not clear to this non-vim user that vim syntax 
allows more than one coding specification per line.

(2) emacs has no requirement that the coding be placed on the first two 
lines. It specifically looks at the second line only if the first line 
has a “ #! ” or a “ '\" ” (for troff). (according to docs, not 
experimentation)

(3) emacs also allows for Local Variables to be specified at the end of 
the file.  If CPython were really to act like emacs, then it would need 
to allow for that too.

(4) there is no benefit to specifying the coding twice on a line; it
only adds confusion, whether in CPython, emacs, or vim.
(4a) Here's an untested line that emacs would interpret as utf-8, and 
CPython with the greedy regular expression would interpret as latin-1, 
because emacs looks only between the -*- pair, and CPython ignores that.
   # -*- coding: utf-8 -*- this file does not use coding: latin-1
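
The difference on that line can be sketched with plain regular
expressions. The non-greedy pattern below follows the one given in
PEP 263; the greedy variant is simply that pattern with ".*?" changed
to ".*", as proposed above. The "emacs-like" pattern is only my rough
approximation of "look between the -*- pair" and is not how emacs
actually parses file variables. All the names are made up for
illustration.

```python
import re

# Pattern along the lines of PEP 263: non-greedy ".*?" stops at the
# first "coding:" (or "coding=") found in the comment.
PEP263_NON_GREEDY = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

# Hypothetical greedy variant: ".*" consumes as much as possible, so the
# last "coding:" on the line is the one that gets captured.
GREEDY = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

# Rough stand-in for emacs: only consider text between a "-*-" pair.
EMACS_LIKE = re.compile(r"-\*-.*?coding:[ \t]*([-_.a-zA-Z0-9]+).*?-\*-")

LINE = "   # -*- coding: utf-8 -*- this file does not use coding: latin-1"


def cookie(pattern, line):
    """Return the encoding name captured by pattern, or None if no match."""
    match = pattern.search(line)
    return match.group(1) if match else None


print("non-greedy:", cookie(PEP263_NON_GREEDY, LINE))
print("greedy:    ", cookie(GREEDY, LINE))
print("emacs-like:", cookie(EMACS_LIKE, LINE))
```

With the non-greedy pattern the result agrees with the emacs-like
reading (utf-8); with the greedy one it does not (latin-1).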