[OT] Syntax highlighting [was Re: Too much code - slicing]

Steven D'Aprano steve at REMOVE-THIS-cybersource.com.au
Mon Sep 20 08:29:57 CEST 2010

On Sun, 19 Sep 2010 07:36:11 +0000, Seebs wrote:

> On 2010-09-19, Steven D'Aprano <steve at REMOVE-THIS-cybersource.com.au>
> wrote:
>> I'm not entirely sure I agree with you here... you can't ignore syntax
>> in order to understand the meaning of code.
> No, but the syntax should be invisible.  When I read English, I don't
> have to think about nouns and verbs and such unless something is very
> badly written.

That's almost certainly because you've been listening to, speaking, 
reading and writing English since you were a small child, and the syntax 
and grammar of English is buried deep in your brain.

And you certainly do think about nouns and verbs, you just don't 
*consciously* think about them. If I write:

"Susan blooged the mobblet."

you will probably recognise "bloog" as the verb and "mobblet" as the 
noun, even though you've almost certainly never seen those words before 
and have no idea what they mean. But if I write this:

"Susan is mobblet the blooged."

you'll probably give a double-take. The words don't look right for 
English grammar and syntax.

I've been reading, writing and thinking in Python for well over a decade. 
The syntax and grammar are almost entirely invisible to me too. No 
surprise there -- they are relatively close to those of the human 
language I'm used to (English). But if I were a native Chinese or Arabic 
speaker, I'd probably find Python much less "natural" and *would* need to 
explicitly think about the syntax more.

>> The term "syntax highlighting" for what editors I've seen do is
>> actually misleading -- they don't highlight *syntax*, they try to
>> highlight *semantics*.
> I've never seen this.  I've seen things highlight comments and keywords
> and operators and constants and identifiers differently.

Exactly. Things are highlighted because of *what* they are, not because 
of the syntax they use or because of the grammatical role they play.

In a Python expression like:

y = none or None

an editor might colour "None" green because it's a known keyword, but 
"none" black because it's a variable. If you change the syntax:

y = None if [none][0] is None else {None: none}[None]

the colours remain the same. None is coloured green not because of 
*where* it is in the syntax tree, but because of *what* it is. Calling 
this "syntax highlighting" is misleading, or at least incomplete.
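
A minimal sketch of the point, assuming a lexer-style highlighter and 
current Python 3 (where None is a true keyword). The helper name 
classify() is made up for illustration. It tags each name purely by 
*what* it is, using the standard tokenize and keyword modules, and never 
looks at the parse tree:

```python
import io
import keyword
import tokenize


def classify(source):
    """Return (text, category) pairs for every name-like token in source.

    The category depends only on the token's own text, never on where
    the token sits in the expression.
    """
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            if keyword.iskeyword(tok.string):
                pairs.append((tok.string, "keyword"))
            else:
                pairs.append((tok.string, "identifier"))
    return pairs


simple = classify("y = none or None\n")
nested = classify("y = None if [none][0] is None else {None: none}[None]\n")

for text, category in nested:
    print(text, category)
```

Every None comes back tagged "keyword" and every none tagged 
"identifier" in both expressions, whatever position they occupy.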

>> When your editor highlights the function len() in the expression "x =
>> len(y) + spam(z)" but not the function spam(), you know it has nothing
>> to do with syntax. len() is singled out because of its semantics,
>> namely the fact that it's a built-in.
> Eww.  (I had not yet gotten to the point of finding out that whether
> something was "built-in" or not substantially affected its semantics.)

In some languages, built-in functions truly are special, e.g. they are 
reserved words. That's not the case for Python. Nevertheless, the editors 
I've used treat built-ins as "pseudo-reserved words" and colourise them.
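
A rough sketch of the distinction, assuming Python 3. The helper names 
describe() and shadow_len() are hypothetical. The keyword module lists 
the real reserved words, while built-ins are just names in the builtins 
module that any scope is free to rebind:

```python
import builtins
import keyword


def describe(name):
    """Say how the language itself treats a name."""
    if keyword.iskeyword(name):
        return "reserved word"
    if hasattr(builtins, name):
        return "built-in (can be rebound)"
    return "ordinary name"


def shadow_len():
    # Perfectly legal: len is not reserved, so a local can hide it.
    len = lambda obj: 42
    return len("abc")


for name in ("if", "len", "spam"):
    print(name, "->", describe(name))
print(shadow_len())
```

An editor that colours len but not spam is therefore applying a list of 
known names, not a rule of the grammar.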

>> In English, the meaning of some sentences does benefit from syntax
>> highlighting, and in fact that's partly what punctuation is for:
>> English partly uses punctuation marks as tags to break the sentence
>> structure into sub-sentences, clauses and terms (particularly when the
>> sentence would otherwise be ambiguous).
> Punctuation is very different from highlighting, IMHO.  That said, I
> find punctuation very effective at being small and discrete, clearly not
> words, and easy to pick out.  Color cues are not nearly as good at being
> inobtrusive but automatically parsed.

Well that surely depends on the colour scheme you have. My editor is 
fairly restrained -- it uses a handful of colours (although of course you 
can customize it and go nuts), and I've made it even more subtle.

To my eyes, the feature of syntax highlighting that alone makes it 
worthwhile, its killer feature, is that I can set comments and docstrings 
to grey. When I'm scanning code, being able to slide my eyes over greyed-
out comments and docstrings and ignore them with essentially zero effort 
is a huge help. That's the thing I most miss, more than anything else, 
when using a dumb editor.

>> "Woman shoots man with crossbow"
>> Was it the man armed with a crossbow, or the woman? If we could somehow
>> group the clause "with crossbow" with "woman" or "man" by something
>> *other* than proximity, we could remove the ambiguity.
> Yes.  But syntax highlighting won't help you here -- at least, I've
> never yet seen any editor that showed precedence relations or anything
> similar in its coloring.

Just because nobody has done it yet doesn't mean that some sufficiently 
intelligent software in the future couldn't do it :)
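
As a rough illustration of what such a tool could draw on (a sketch 
only, assuming Python 3.9 or later for ast.unparse; the function names 
are made up), the standard ast module already knows how an expression 
groups, and that grouping can be made visible by something other than 
proximity, here with explicit parentheses:

```python
import ast


def parenthesize(node):
    """Render a boolean expression with its grouping made explicit."""
    if isinstance(node, ast.Expression):
        return parenthesize(node.body)
    if isinstance(node, ast.BoolOp):
        word = "and" if isinstance(node.op, ast.And) else "or"
        inner = f" {word} ".join(parenthesize(v) for v in node.values)
        return "(" + inner + ")"
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
        return "(not " + parenthesize(node.operand) + ")"
    # Anything else is left as the parser would print it.
    return ast.unparse(node)


def show_grouping(source):
    """Parse a single expression and return it fully grouped."""
    return parenthesize(ast.parse(source, mode="eval"))


print(show_grouping("a or b and not c"))
```

A highlighter could use the same tree to tint each group instead of 
adding brackets.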

