[Python-3000] It's a statement! It's a function! It's BOTH!

Sun Apr 2 02:52:11 CEST 2006

This is about the print / writeln debate.

Let me say up front that I don't expect this posting in its original form to
be widely accepted. (In fact, I'll be disappointed if I don't get at least a
sum total of -5000 on responses to this.) At the same time, however, I think
that the issue that I am raising is legit, even if my solution is not.

Part of my issue is that I like both solutions. That is, the "professional
programmer" part of me likes the stream.writeln, as seen in many other
languages, in particular Java and C#. On the other hand, the "recreational
doodler" part of me remembers with fondness the simplicity and ease of
learning of my first experiences in BASIC, where it was completely intuitive
and natural to just say "print x" and not have to understand about the
complexities of line endings, function argument lists, and so on. And my
notion of "pythonic" includes the eschewing of needless delimiters and other
non-word characters when possible.

For the most part, the print statement isn't that much different from a
function, except that it doesn't have parentheses. (I'll discuss the
exceptions to this later.) That is, it takes an argument list which is
separated by commas, and each of those arguments is treated pretty much the
same.

However, Python doesn't recognize the general concept of a paren-less
function call, which is why print has to be a special case that's built
into the parser.

I've read a number of posts which all sort of lead up to the idea - why can't
we allow the interpreter to recognize paren-less functions? Now obviously we
wouldn't want every function to act this way, as it would lead to a nightmare
of ambiguities. So you'd have to have some means of telling the parser which
functions were to be parsed as "statements" and which should not.

The difficulty with that notion is that now you are asking the Python parser
to do something that it never did before, which is to make parsing decisions
based on semantics rather than just syntax. While there's no technical
difficulty with this, it goes against the Python tradition in a fairly
fundamental way.

But what the heck, let's forget tradition for a moment and see what would
happen if we were to go ahead and do it anyway.

We would start by defining a simple metalanguage for giving instructions to
the parser. I won't even try to suggest a syntax, but in essence this would
be something that has the same role as a macro language or preprocessor, in
that it is not part of the programming language itself, but instead describes
how the subsequent text is to be interpreted. (There's already some precedent
for this with the __future__ syntax.)

Specifically, one of the commands of this metalanguage would be a command to
parse an expression beginning with a specific keyword (such as "print") as a
statement rather than as an expression. A comma-separated list of arguments
would follow the initial identifier, and these would be treated as regular
function arguments.

Thus you could have:

  print a, b, c
  send a, b, c
  read a, b, c

...and so on. Yes, it's true - there is vast potential for abuse here, I don't
deny it. (With great power...etc.)
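
To make the idea concrete, here is a rough sketch of the rewriting step done
as a line-based preprocessor in ordinary Python. This is not a proposed
design: a real version would live in the parser, and the function name
rewrite_statements, the statement_names set (standing in for the
metalanguage declarations), and the send function are all made up for
illustration. It only handles one-line statements and deliberately skips
lines that look like assignments, calls, or attribute access.

```python
import re


def rewrite_statements(source, statement_names):
    """Rewrite lines like 'name a, b, c' into 'name(a, b, c)'.

    statement_names is a hypothetical stand-in for the metalanguage
    declarations: the set of identifiers to treat as statements.
    """
    if not statement_names:
        return source
    names = "|".join(re.escape(name) for name in sorted(statement_names))
    pattern = re.compile(
        r"^([ \t]*)(" + names + r")"   # indentation, then a declared name
        r"(?![ \t]*[=(.\[])"           # skip 'name = ...', 'name(...)', etc.
        r"[ \t]+(.+?)[ \t]*$",         # the comma-separated argument text
        re.MULTILINE,
    )
    return pattern.sub(r"\1\2(\3)", source)


calls = []


def send(*args):
    """Hypothetical function that just records what it was called with."""
    calls.append(args)


source = "a, b, c = 1, 2, 3\nsend a, b, c\n"
rewritten = rewrite_statements(source, {"send"})
exec(rewritten, {"send": send})
```

After the rewrite, the second line of the source reads send(a, b, c), and
executing it calls send with the three values. The first line is left alone
because it does not begin with a declared name.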

Now, about those exceptions:

One that's fairly easy to handle is the "print >> stream" syntax. We can
tweak the syntax for the "function as statement" so that instead of the rule
being:

  identifier [ arg, ... ]

it can be:

  expression [ arg, ... ]

Thus, the parser sees the word "print" which tells it that we're going to
have a paren-less function, but it still parses the text "print >> stream" as
an expression. Then it's simply a matter of overloading the ">>" operator to
return a closure function that prints to the given stream. The semantics are
equivalent to:

  (print >> stream)( arg, arg, arg )
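
The overloading part can already be tried out with an ordinary object. A
minimal sketch, assuming a callable class with a __rshift__ method; the
names Printer and emit are invented here (emit is used so as not to shadow
the real print), and a small class stands in for the closure:

```python
import io
import sys


class Printer:
    """Hypothetical stand-in for a 'print' object that supports '>>'."""

    def __init__(self, stream=None):
        # None means "look up sys.stdout at call time".
        self._stream = stream

    def __call__(self, *args):
        out = sys.stdout if self._stream is None else self._stream
        out.write(" ".join(str(arg) for arg in args) + "\n")

    def __rshift__(self, stream):
        # 'printer >> stream' returns a new callable bound to that stream.
        return Printer(stream)


emit = Printer()

buf = io.StringIO()
(emit >> buf)("a", 1, 2.5)   # writes to buf instead of sys.stdout
```

Here (emit >> buf) evaluates to a second Printer bound to buf, which is then
called with the argument list, matching the parenthesized form above.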

OK, so suppose you find this just too wide open for abuse. An alternative is
to dump the ">>" syntax and use a keyword argument:

  print stream=fh, arg, arg, arg

Since print is being executed as a normal function call, except without
parens, we would expect the normal keyword argument syntax to work, as well
as *args and **args. (Note that in this case the keyword argument is coming
before the non-keyword arguments, which is something that I hope will be
addressed in Python 3000.)

Now, what about the semantics of the "spaces between args"? There are a couple
of ideas:

 1) Define "print" as putting spaces between args, and "write" as not doing
    so.
 2) Have a keyword arg that allows specifying a separator char, where
    space is the default:

  # No space between args
  print sep="", arg, arg, arg

Heck, why not even let the separator be a function:

  # Insert enough spaces between args to align to 8-char boundaries
  print sep=tabToNext( 8 ), arg, arg, arg
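
Here is one way the separator-as-function idea might work, written as a
plain function call. Everything in it is an assumption made for the sake of
the example: the names emit and tab_to_next, the stream keyword, and the
convention that a callable separator receives the current output column and
returns the string to insert. The keywords come last because that is what
ordinary call syntax accepts, and column tracking assumes the arguments
contain no newlines.

```python
import io
import sys


def tab_to_next(width):
    """Return a separator function that pads to the next multiple of width."""
    def separator(column):
        # Always emits at least one space, the way a tab stop would.
        return " " * (width - column % width)
    return separator


def emit(*args, sep=" ", stream=None):
    """Write args on one line; sep is a string or a function of the column."""
    out = sys.stdout if stream is None else stream
    column = 0
    pieces = []
    for index, arg in enumerate(args):
        if index > 0:
            gap = sep(column) if callable(sep) else sep
            pieces.append(gap)
            column += len(gap)
        text = str(arg)
        pieces.append(text)
        column += len(text)
    out.write("".join(pieces) + "\n")


buf = io.StringIO()
emit("ab", "cde", "f", sep=tab_to_next(8), stream=buf)   # aligned columns
emit("ab", "cde", "f", sep="", stream=buf)               # no space between args
```

With tab_to_next(8), "cde" starts at column 8 and "f" at column 16.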

Finally, there is the issue of the trailing comma to suppress the final
newline. I must confess that I don't have a clever solution in this case (I
can think of lots of hacky solutions...). I suspect that the best compromise
is to have distinct "print" and "println" functions to cover this case.
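
A minimal sketch of that split, with hypothetical names emit and emitln and
a hypothetical stream keyword: the first writes the arguments with no
trailing newline, and the second is the same thing plus the newline.

```python
import io
import sys


def emit(*args, stream=None):
    """Write args separated by spaces, with no trailing newline."""
    out = sys.stdout if stream is None else stream
    out.write(" ".join(str(arg) for arg in args))


def emitln(*args, stream=None):
    """Same as emit, but finish the line."""
    out = sys.stdout if stream is None else stream
    emit(*args, stream=out)
    out.write("\n")


buf = io.StringIO()
emit("x =", stream=buf)      # stays on the current line
emitln(" 42", stream=buf)    # completes it
```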

-- Talin