[Python-Dev] What does a double coding cookie mean?
Serhiy Storchaka
storchaka at gmail.com
Sat Mar 19 17:37:49 EDT 2016
On 19.03.16 19:36, Glenn Linderman wrote:
> On 3/19/2016 8:19 AM, Serhiy Storchaka wrote:
>> On 16.03.16 08:03, Serhiy Storchaka wrote:
>> I just tested with Emacs, and it looks like when different codings
>> are specified on two different lines, the first coding wins, but when
>> different codings are specified on the same line, the last coding wins.
>>
>> Therefore the current CPython behavior may be correct, and the regular
>> expression in PEP 263 should be changed to use greedy repetition.
>
> Just because emacs works that way (and even though I'm an emacs user),
> that doesn't mean CPython should act like emacs.
Yes, but CPython currently works that way. The behavior of Emacs is an
argument that this may not be a bug.
> (4) there is no benefit to specifying the coding twice on a line, it
> only adds confusion, whether in CPython, emacs, or vim.
> (4a) Here's an untested line that emacs would interpret as utf-8, and
> CPython with the greedy regular expression would interpret as latin-1,
> because emacs looks only between the -*- pair, and CPython ignores that.
> # -*- coding: utf-8 -*- this file does not use coding: latin-1
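To make the difference in (4a) concrete, here is a minimal sketch that runs both variants of the expression on that line. The non-greedy pattern is the PEP 263 expression as I recall it, and the greedy pattern differs only in using `.*` instead of `.*?`. The helper name `cookie` is made up for illustration; this is not how CPython's tokenizer is implemented.

```python
import re

# Non-greedy variant, as I recall it from PEP 263 (assumption).
LAZY = re.compile(r"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")
# Greedy variant: identical except that ".*?" becomes ".*".
GREEDY = re.compile(r"^[ \t\f]*#.*coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def cookie(pattern, line):
    """Return the coding name the pattern extracts from line, or None."""
    match = pattern.match(line)
    return match.group(1) if match else None


line = "# -*- coding: utf-8 -*- this file does not use coding: latin-1"
print(cookie(LAZY, line))    # utf-8   (first declaration on the line)
print(cookie(GREEDY, line))  # latin-1 (last declaration on the line)
```

Neither pattern pays attention to the -*- delimiters, so neither reproduces what Emacs does with this line.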
Since Emacs allows the coding to be specified twice on a line, and this
can be ambiguous, and CPython already detects some ambiguous situations
(a UTF-8 BOM combined with a non-UTF-8 coding cookie), it may be worth
adding a check that the coding is specified only once on a line.
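Such a check could look roughly like the sketch below. It is only an illustration in pure Python, not a patch against the tokenizer: the names `declared_codings` and `check_single_coding` are hypothetical, and raising SyntaxError is an assumption modelled on how a bad coding cookie is reported today.

```python
import re

# A coding cookie is only recognized on a comment line.
COMMENT = re.compile(r"^[ \t\f]*#")
# Every "coding:" or "coding=" declaration, using the same name
# character class as the PEP 263 expression.
DECLARATION = re.compile(r"coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def declared_codings(line):
    """Return all coding names declared on a comment line, in order."""
    if not COMMENT.match(line):
        return []
    return DECLARATION.findall(line)


def check_single_coding(line):
    """Return the single declared coding, or None if there is none.

    Raises SyntaxError when more than one declaration is found.
    """
    names = declared_codings(line)
    if len(names) > 1:
        raise SyntaxError(
            "coding specified more than once on a line: " + ", ".join(names)
        )
    return names[0] if names else None
```

With this, "# -*- coding: utf-8 -*-" yields utf-8, while the line from (4a) is rejected instead of being silently resolved one way or the other.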