[Python-Dev] What does a double coding cookie mean?

Serhiy Storchaka storchaka at gmail.com
Sat Mar 19 17:37:49 EDT 2016


On 19.03.16 19:36, Glenn Linderman wrote:
> On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:
>> On 16.03.16 08:03, Serhiy Storchaka wrote:
>> I just tested with Emacs, and it looks like when different codings
>> are specified on two different lines, the first coding wins, but when
>> different codings are specified on the same line, the last coding wins.
>>
>> Therefore current CPython behavior can be correct, and the regular
>> expression in PEP 263 should be changed to use greedy repetition.
>
> Just because emacs works that way (and even though I'm an emacs user),
> that doesn't mean CPython should act like emacs.

Yes, but CPython currently works that way. Emacs's behavior is an
argument that this may not be a bug.
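
To make the difference concrete, here is a small sketch comparing lazy and
greedy repetition on a line that names two codings. The two patterns are
modeled on the regular expression in PEP 263 and are an approximation for
illustration, not the code CPython's tokenizer actually runs; the names
LAZY and GREEDY are made up for this example.

```python
import re

# Patterns modeled on the PEP 263 regular expression (an approximation,
# not CPython's tokenizer). They differ only in lazy (.*?) versus
# greedy (.*) repetition before "coding".
LAZY = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
GREEDY = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-_.a-zA-Z0-9]+)")

line = "# -*- coding: utf-8 -*- this file does not use coding: latin-1"

# Lazy repetition stops at the first "coding:" on the line.
first = LAZY.match(line).group(1)
# Greedy repetition consumes the whole line, then backtracks to the
# last "coding:" on the line.
last = GREEDY.match(line).group(1)

print(first, last)  # utf-8 latin-1
```

With the lazy pattern the first coding on the line wins; with the greedy
pattern the last one wins.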

> (4) there is no benefit to specifying the coding twice on a line, it
> only adds confusion, whether in CPython, emacs, or vim.
> (4a) Here's an untested line that emacs would interpret as utf-8, and
> CPython with the greedy regular expression would interpret as latin-1,
> because emacs looks only between the -*- pair, and CPython ignores that.
>    # -*- coding: utf-8 -*- this file does not use coding: latin-1

Since Emacs allows the coding to be specified twice on a line, which can
be ambiguous, and CPython already detects some ambiguous situations
(a UTF-8 BOM combined with a non-UTF-8 coding cookie), it may be worth
adding a check that the coding is specified only once on a line.
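
A rough sketch of what such a check could look like, written in Python
rather than in the C tokenizer. The helpers declared_codings,
check_single_declaration and bom_conflict are hypothetical names made up
for this example, and the patterns are simplified from PEP 263. The last
helper only demonstrates the existing BOM check through the standard
tokenize module.

```python
import codecs
import io
import re
import tokenize

# Hypothetical helpers, not part of CPython. Patterns simplified from PEP 263.
COMMENT = re.compile(r"^[ \t\f]*#")
DECLARATION = re.compile(r"coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_codings(line):
    """Return every coding name declared on a comment line, in order."""
    if not COMMENT.match(line):
        return []
    return DECLARATION.findall(line)


def check_single_declaration(line):
    """Return the declared coding or None; raise if there is more than one."""
    codings = declared_codings(line)
    if len(codings) > 1:
        raise SyntaxError(
            "coding specified more than once: " + ", ".join(codings))
    return codings[0] if codings else None


def bom_conflict(source_bytes):
    """Return True if tokenize.detect_encoding rejects the declaration."""
    try:
        tokenize.detect_encoding(io.BytesIO(source_bytes).readline)
    except SyntaxError:
        return True
    return False
```

For example, check_single_declaration("# -*- coding: utf-8 -*-") returns
"utf-8", while the two-coding line quoted above raises SyntaxError. In the
same spirit, bom_conflict(codecs.BOM_UTF8 + b"# -*- coding: latin-1 -*-\n")
returns True, because tokenize refuses a UTF-8 BOM followed by a non-UTF-8
cookie.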



