A Suggestion for Python Colon Syntax

Tim Peters tim.one at home.com
Fri Dec 22 22:57:50 EST 2000


[William Djaja Tjokroaminata]
> Exactly.

Good.  Let's stop there <wink>.

> I just regretted the initial study that used newbies and they
> preferred colons.  At the present time, correct me if I am wrong, I
> think it is still rare that Python is one's first programming
> language.  People who start to use Python may already use some other
> popular programming languages.  Probably if we do another survey now,
> the result may be different.  I don't think survey is a bad thing.
> Richard Stallman sometimes asked opinions on the emacs newsgroup
> before he decided whether to go one way or the other.

So does Guido, but it's rare, and in this case you're ignoring that he
*already decided*.

A survey is not a usability study, and Guido only asks people when he
doesn't care how it turns out.  For example, when complex numbers were
introduced, he asked the numeric people to pick one of "i" or "j" to
indicate the imaginary part.  He couldn't care less, and they couldn't stop
arguing about it.  They picked "j".  People complain about that a *lot* more
than they complain about the trailing colon (although both complaints add up
to "not much"), but they're not getting a choice of "i" at this stage
either.

> ...
> It seems to me that the colon is redundant, because the next-level
> indentation is required afterwards.

Yes; I agreed to that in my first msg.  It doesn't matter, though, in the
sense that Python is not about minimizing keystrokes (not even so much as
Perl is, let alone extreme one-liner languages like APL or J).  It's much
more about maximizing readability.  It may be that you're in a minority who
does not find that the trailing colon increases readability?  It increases
readability for me, and that matches the testimony of most people who've
chimed in on this issue over the years.

> Isn't it the basic premise of Python to take the best features of
> other programming languages?

It has never seemed so to me, and I'm not aware of Guido ever having said
so.  As a design principle (sometimes honored in the breach), yes, borrowing
from other languages is very apparent in Python, as it is in virtually all
language design efforts.  In this particular case, Guido borrowed the rule
from ABC because he *loved* that part of ABC; most of the rest of ABC he
dropped.  It's never been a popularity contest, though (if popularity or
"surveys" had any role to play, every language would use curly braces <0.3
wink>).

> Probably a decade ago ABC was around, but now I think many more
> people have encountered Tcl rather than ABC.

ABC was never popular.  And I'm not sure what more people being familiar
with Tcl has to do with this.  That *may* influence a decision to add
semantic features (like, say, file events) someday, but is unlikely to have
any influence on future syntax -- the languages are worlds apart
syntactically.

> I can see clearly the big difference (as I have experienced) between
> using indentation and curly braces; it makes really consistent coding
> style.  However, I don't see the colons as falling into the same
> category.

Guido does.  Now what?

> Yes, Guido wants everybody to use colons at the end of the clauses,
> but right now what prevents someone to put semicolon at the end
> of every statement?

Some people do.  When they do it in public, we ridicule them.  Then they
usually stop <wink>.

>  Just compare the two or three formats:
>
>     xxxx xxxx xxxx:
>         xxxx xxxx xxxx
>         xxxx xxxx xxxx
>
>     xxxx xxxx xxxx:
>         xxxx xxxx xxxx;
>         xxxx xxxx xxxx;
>
>     xxxx xxxx xxxx
>         xxxx xxxx xxxx
>         xxxx xxxx xxxx
>
> Which one do you think is the most consistent in layout?  To me, it
> is not the first one.

The most consistent would be this:

    xxxx xxxx xxxx
    xxxx xxxx xxxx
    xxxx xxxx xxxx

That is, consistency is a red herring.

> ...
> But the semicolon at the end of the statement break the Pythonic
> rule.

If you put them there, yes.  Happily, nobody does; for example, you won't
find a semicolon at the end of any code stmt in any .py file in the standard
distribution.  When conformity is voluntarily and universally achieved,
there's no need to legislate it.
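
A claim like this can be checked mechanically.  The following is a minimal
sketch, assuming the standard tokenize module is acceptable for the job; the
function name trailing_semicolon_lines is hypothetical.  It reports the line
numbers of statements that end in a semicolon, while leaving alone a semicolon
that merely separates two statements on one line.

```python
import io
import tokenize


def trailing_semicolon_lines(source):
    """Return line numbers of logical lines whose last token is ';'."""
    hits = []
    prev = None
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Comments and blank-line markers do not change what ended the stmt.
        if tok.type in (tokenize.COMMENT, tokenize.NL):
            continue
        if (tok.type == tokenize.NEWLINE and prev is not None
                and prev.type == tokenize.OP and prev.string == ";"):
            hits.append(prev.start[0])
        prev = tok
    return hits


sample = (
    "a = 1;\n"
    "b = 2\n"
    "c = 3; d = 4\n"
    "if a:\n"
    "    e = 5;  # note\n"
)
print(trailing_semicolon_lines(sample))  # prints [1, 5]
```

Running it over a directory of .py files is a matter of reading each file and
calling the function on its text.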

> ...
> I am not too familiar with the parsing stuff.  However, in my
> simplistic opinion, Python is not a free-form language like C
> or Perl.  Therefore, probably it is reasonable for any parser
> to break Python code first into lines, even in backward
> parsing, instead of parsing it token by token first.  In parsing
> backwards, can then it just detect first that the line is at
> different indentation level rather than try to detect the colon?

When you're writing code and hit the ENTER key, good editing environments
try to *suggest* sensible indentation for the new line.  In part, that
requires guessing whether the statement just ended opens a block (in which
case the new line should be indented more) or not (in *most* of which cases
the indentation should be duplicated from the statement that just ended).

Regular expressions don't suffice for this determination, so it's a lot of
painful character-at-a-time parsing.  The trailing colon is a great hint:
if the stmt that just ended doesn't end with a colon, there's no need to
endure the expense of further analysis (the line that just ended can't
possibly open a new block without that colon).  Speed is important here,
because the user expects the response to the ENTER key to appear
instantaneous, and the parsing code is usually written in an interpreted
language.  The Emacs Python mode is greatly helped by some parsing
primitives supplied by elisp and coded in C.  IDLE (and by inheritance, also
PythonWin, which shares IDLE's auto-indent code) has a much rougher time of
it, being coded in pure Python, and having the added speed burden of needing
to talk to a Tk text widget thru the Tkinter interface layer to find out
anything about what's in the buffer (even worse, that's indirected in Python
code too, so that PythonWin can talk to the Scintilla text widget instead).
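
The colon shortcut described above can be sketched in a few lines of Python.
This is an illustration only, not the Emacs or IDLE code: the names
strip_comment, may_open_block and suggest_indent are hypothetical, and the
string handling is deliberately simplistic (it ignores triple-quoted strings,
open brackets and backslash continuations).  A trailing colon only says that a
block *may* open; a real editor would follow a positive answer with the fuller
analysis, and skip that analysis entirely on a negative one.

```python
def strip_comment(line):
    """Drop a trailing comment, skipping '#' inside simple quoted strings."""
    quote = None
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == "\\":
                i += 1              # skip the escaped character
            elif ch == quote:
                quote = None        # string closed
        elif ch in "'\"":
            quote = ch              # string opened
        elif ch == "#":
            return line[:i]         # comment starts here
        i += 1
    return line


def may_open_block(line):
    """Cheap test: a statement that does not end in ':' cannot open a block."""
    return strip_comment(line).rstrip().endswith(":")


def suggest_indent(prev_line, width=4):
    """Suggest the indentation (in spaces) for the line after prev_line."""
    current = len(prev_line) - len(prev_line.lstrip(" "))
    if may_open_block(prev_line):
        # Simplification: treat the hint as the answer instead of analysing more.
        return current + width
    return current


print(suggest_indent("if x > 0:"))                       # prints 4
print(suggest_indent("    y = 1"))                       # prints 4
print(suggest_indent("    for i in range(3):  # loop"))  # prints 8
```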

Every quirk of the syntax is exploited mercilessly to reduce processing
time, and that's a hard job (I wrote both of those parsers, so I'm not just
guessing about that); the use of colons to open blocks is one of the quirks
that can be exploited a lot.  The IDLE code would need to be redone from
scratch without it (IDLE doesn't preserve any of the alphanumeric characters
now:  it squashes all runs of alphanumerics into a single "x" character,
because it can do that quickly, and then chew over far fewer total
characters at Python speed).
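
The squashing trick can be illustrated as follows.  This is a sketch under
assumptions, not IDLE's implementation: it uses a regular expression for
brevity, the name squash_alnum_runs is hypothetical, and strings and comments
get no special treatment.  The point is only that the punctuation that matters
for block structure survives while the text to scan gets shorter.

```python
import re

# One or more letters, digits or underscores in a row.
_ALNUM_RUN = re.compile(r"[A-Za-z0-9_]+")


def squash_alnum_runs(source):
    """Replace every run of alphanumerics with a single 'x'."""
    return _ALNUM_RUN.sub("x", source)


print(squash_alnum_runs("if total_count > limit2:"))      # prints x x > x:
print(squash_alnum_runs("result = compute(alpha, beta)"))  # prints x = x(x, x)
```

The colon, parentheses and operators are untouched, so a later pass can look
for them in far fewer characters.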


At this point, if you want to keep pushing this the way to do it is to open
a PEP:

    http://python.sourceforge.net/peps/

While a PEP needs an implementation before it can become final, the PEP
author need not write the implementation, and the PEP can be accepted before
an implementation is even started.  What you would need to do is make a
compelling (to Guido) case, and identify all the consequences and how
they'll be dealt with.  The process is covered in more detail in

    http://python.sourceforge.net/peps/pep-0001.html

I expect the PEP will be rejected, but that doesn't mean it will be.  Guido
did surprise me once, in about 1995 <wink>.

or-you-could-just-resolve-to-use-python-for-a-year-before-
    improving-it-ly y'rs  - tim




