On special syntax (was Re: Matlab vs Python ...)

Huaiyu Zhu huaiyu_zhu at yahoo.com
Tue Jul 18 20:21:04 EDT 2000


I sent this out this morning and it hasn't appeared.  Sorry if it
duplicates. 


On Tue, 18 Jul 2000, Moshe Zadka wrote:
[snip: a long python code and short perl equivalent]

> It's more verbose, and people are able to understand it without learning
> weird corners of the language. (BTW: as a former Perl hacker, I find the
> Perl code *extremely* readable, and I spend more time understanding the
> Python code. But that doesn't change the fact that Perl took me longer to
> master than Python, and even then I didn't really understand Perl. For
> example, could I lose the ";" in the Perl code? I think so, but I'm not
> sure)
> 

Ok, since the issue of additional binary operators has been repeatedly
compared with various features of Perl that people dislike, without much
detail, it might be worthwhile to give it some more thought here.  We've
already seen that the "it has a small userbase" argument is largely vacuous.
I hope to show that the "it is like Perl" argument is of a similar nature.

So let me first describe what I dislike about Perl.  As best I can
summarize, it comes down to these aspects:

1. There are many special syntax zones: 
- The formats
- The here document
- The regular expressions
- Things within ${...}, @{...}, %{...}, etc
- trailing if, unless, etc
- file handling
- BEGIN {...} END {...}
- some other things that don't come to mind :-)

I say special syntax zone instead of special syntax constructs because these
things do not really have their own scope.  They are not atomic.  For
example, you can have different levels of variable substitution within
regular expressions.

The disadvantage of this is that these things really all live in a larger
scope but they still play by their own rules.  You have to switch your
mental context going from one zone to the other.

In contrast, Python has very few special syntax zones.  For example, a
string literal follows the same rules whether it is used for formats,
multiline strings or regular expressions.  Strings interact with the rest
of the program only through a given interface, such as the arguments or
return values of a function.  If you can use a multiline string in a print
statement, you can also use it anywhere else a string is used.
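
A small sketch of this point, assuming nothing beyond the standard re
module (the variable names are made up for illustration): the same kind of
literal serves as a format, as a regular expression and as an ordinary
value, and the only thing that changes is which operation receives it.

```python
import re

# One literal syntax; what the string "means" is decided by the operation
# that receives it, not by a special zone in the grammar.
template = """name: %s
count: %d"""

pattern = r"count: (\d+)"

text = template % ("spam", 3)      # used as a format
match = re.search(pattern, text)   # used as a regular expression
lines = text.split("\n")           # used as a plain multiline string

print(text)
print(match.group(1))
print(lines)
```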

2. The scope, precedence and context of the special zones vary greatly:
- It is hard to know where $_, $1, etc. get their values.
- A regular expression can look like s/2+3/t+1/g, which is also a valid
  math formula.
- Evaluation of the rhs of an assignment also depends on the context coming
  from the lhs: list or scalar.  The whole thing can be part of an
  expression.
- Classes are defined with the package statement, with silly scoping rules.
- print <<EOF is a delimiter that doesn't work everywhere.
- You can't just substitute a name with its content. For example, @a is an
  array but its element is denoted $a[0]. Therefore only a is the name. But
  %a can be a completely different thing. 
- many other things that don't come to mind.

The disadvantage is that not only does it require switching mental context,
but it also makes it difficult to figure out exactly which one to switch
to, and when.

In contrast, in Python almost everything can be assigned to a name.  You can
substitute a name with its contents almost everywhere.  Scoping is simple.
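
As a rough illustration (the names double, f, make, root and table are
invented for this sketch), functions, classes and module attributes can all
be bound to plain names, and replacing a name with what it stands for gives
the same result:

```python
import math


def double(x):
    return 2 * x


f = double          # a function bound to another name
make = list         # a class bound to a name
root = math.sqrt    # a module attribute bound to a name

# Replacing a name with what it stands for does not change the result.
assert f(21) == double(21) == 42
assert make("ab") == list("ab") == ["a", "b"]
assert root(9.0) == math.sqrt(9.0) == 3.0

# The same objects can be stored in containers and used from there.
table = {"twice": double}
print(table["twice"](5))   # same call as double(5)
```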

3. There are too many magic variables and other hooks:
- $_, $\, etc. change all aspects of both syntax and semantics.  You have
  to keep them all in mind to understand the current code.
- Many magic things are invisible and automatic.  Even if you do not set
  them, you still need to remember all the defaults.
- name globbing using *var puts unrelated things that happen to share a
  name (array, hash, file handle, etc) together.
- many other things that I don't even know.

The disadvantage is that you also need to keep all the implicit things in
mind when reading code.

In contrast, Python has very few implicit things.  The only magic things
are almost always given as methods of the form __iamspecial__.
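
A minimal sketch of what that looks like in practice, using a hypothetical
Deck class that exists only for this example.  Each piece of "magic" is
spelled out as a visibly named method on the class, so a reader can find
where the behaviour comes from:

```python
class Deck:
    """Hypothetical container used only to illustrate special methods."""

    def __init__(self, cards):
        self.cards = list(cards)

    def __len__(self):                # called by len(deck)
        return len(self.cards)

    def __getitem__(self, index):     # called by deck[index]
        return self.cards[index]

    def __repr__(self):               # called by repr(deck)
        return "Deck(%r)" % (self.cards,)


deck = Deck(["A", "K", "Q"])
print(len(deck))
print(deck[0])
print(repr(deck))
```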


If I could summarize the above I would say that the main disadvantage of
Perl is that it has too many interdependencies, so you either understand the
whole program or you don't understand it at all.  It may be great for short
programs but terrible for large programs.

There is also a perception that Perl has too many special symbols, which
make Perl code look like line noise.  However, if we look carefully, the
fault of special symbols is not that they are non-alphanumeric, but that
they introduce scoping rules.  You have to think really hard to decipher the
scoping of things like @{$a->{$b}}[2].  (Sorry if it is wrong - I've
forgotten how to write it correctly.)


Now back to our topic.  Which aspects of the new operators look like Perl?
1. They do not introduce a special syntax zone.  Or if you insist, they
   introduce special zones of two characters.
2. They do not have any ambiguity of scoping, precedence or context. They
   obey exactly the same rules as the old operators.  They always appear
   between two objects, and produce a new object.  The operations do not
   depend on context other than precedence.
3. There is no more magic than the definition of __dotmul__ etc. that
   associates the symbol with a method name, just as for other operators.
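
To make points 2 and 3 concrete, here is a sketch that uses the existing +
and * operators as stand-ins for the proposed ones (the Traced class is
hypothetical, and __dotmul__ itself is not defined here).  Each operator
sits between two objects, dispatches to one method, and returns a new
object; the only non-local rule involved is the usual precedence.

```python
class Traced:
    """Hypothetical wrapper that records which operator method ran."""

    log = []

    def __init__(self, name):
        self.name = name

    def __add__(self, other):         # invoked by  self + other
        Traced.log.append("add")
        return Traced("(%s + %s)" % (self.name, other.name))

    def __mul__(self, other):         # invoked by  self * other
        Traced.log.append("mul")
        return Traced("(%s * %s)" % (self.name, other.name))


a, b, c = Traced("a"), Traced("b"), Traced("c")

# Precedence alone decides the grouping: b * c is evaluated first.
result = a + b * c
print(result.name)
print(Traced.log)
```

A new operator defined the same way would presumably slot into this scheme
with its own method name and an existing precedence level.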

There might be an argument that special characters are hard to read.  As I
said, they are hard to read mainly because of their magical effects.
Operator names have even fewer such effects than the familiar Python
constructs like quotes, brackets, colons and indentation.  Of course if they
get too long, like @#$%, that will cause a readability problem of its own.
I don't think we are going down that route if they are defined within each
module through __methodname__.

In contrast, several of the proposals for avoiding the new operators do look
to me like introducing special syntax zones delimited by various arcane
combinations of symbols.

I see the new operators as introducing absolutely no long range
interdependencies other than reusing existing operator precedence. 

I hope this clarifies some issues concerning just how special these new
operators are.  If there are examples comparing them with proposed special
syntax rules for xml, sql etc, please describe what they are, what they are
for, why they are not good, why the alternative is better, and what aspects
can summarize the comparison.  I'm sure we'd all be wiser after reading
those.

Huaiyu