a = b = 1 just syntactic sugar?

Mon Jun 9 00:02:06 EDT 2003

Quoth Ed Avis:
  [...]
> Still, wouldn't it work to parenthesize the whole lambda-expression to
> avoid ambiguity:
> 
>     swap = (lambda L,i,j: L[i], L[j] = L[j], L[i])
> 
> In this case a parse that took the RHS as a 3-tuple would not be
> possible, since the 'L[j] = L[j]' in the middle would be a statement
> and statements cannot appear inside tuples.

Thought you might suggest that.  :)

The problem with this plan is the amount of lookahead required for
the parser to decide among possible parse trees.

As noted, for backwards compatibility,
    (lambda L,i,j: L[i], L[j])
has to make a 2-tuple, not a function.  (The j in L[j] here must
be from the outer scope in this case, of course.)
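
A quick interpreter check of the 2-tuple reading, as a minimal sketch. The list contents and the outer `j` are made-up values for illustration, and the code assumes a present-day Python 3:

```python
# Illustrative values only; none of them come from the thread.
L = [10, 20, 30]
j = 2  # outer-scope j, the one the second tuple element sees

pair = (lambda L, i, j: L[i], L[j])

# The parenthesized expression is a 2-tuple: (function, L[2]).
func, value = pair
print(type(pair).__name__, len(pair))   # tuple 2
print(value)                            # 30
print(func(L, 0, 99))                   # 10
```

The lambda's own `j` parameter (99 in the call) plays no part in the second element, which was evaluated once, using the outer `j`, when the tuple was built.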

So, once the parser has read as far as
    lambda L,i,j: L[i],
it has to look ahead, past the L[j], to see if there's an '=' there,
before it can decide whether the simple_stmt being assembled
includes this comma or not.

This is a problem because, as Martin has pointed out, the current
parser and grammar are LL(1); that 1 means that at most one token
of lookahead is ever needed for disambiguation.  The reasoning you
sketch above requires arbitrary lookahead (since the L[j] could be
replaced with an arbitrarily complex target list), and so cannot
be implemented in such a parser.  (This is not to say the parser
couldn't be replaced with one of a different type, but that's
obviously a much bigger deal than just tweaking the grammar.)
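
To make the lookahead distance concrete, here is a rough sketch using the standard `tokenize` module. The helper `lookahead_needed` is hypothetical, written only for this illustration; it counts tokens and does not model the parser itself. It assumes Python 3:

```python
import io
import tokenize

OPEN, CLOSE = {"(", "[", "{"}, {")", "]", "}"}

def lookahead_needed(source):
    """Count the tokens between the first top-level comma in the lambda
    body and the next top-level '='.  Return None if no '=' follows."""
    strings = [tok.string
               for tok in tokenize.generate_tokens(io.StringIO(source).readline)
               if tok.type in (tokenize.NAME, tokenize.NUMBER, tokenize.OP)]
    start = strings.index(":") + 1   # first token of the lambda body
    depth = 0                        # bracket nesting depth
    comma_at = None
    for pos in range(start, len(strings)):
        s = strings[pos]
        if s in OPEN:
            depth += 1
        elif s in CLOSE:
            depth -= 1
        elif depth == 0 and s == "," and comma_at is None:
            comma_at = pos
        elif depth == 0 and s == "=" and comma_at is not None:
            return pos - comma_at - 1
    return None

print(lookahead_needed("lambda L,i,j: L[i], L[j] = L[j], L[i]\n"))   # 4
print(lookahead_needed("lambda L,i,j: L[i], L[j][0].x = 1\n"))       # 9
print(lookahead_needed("lambda L,i,j: L[i], L[j]\n"))                # None
```

The count grows with the complexity of the target after the comma, so no fixed amount of lookahead is enough.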

I actually think putting parentheses directly around the statement
is a more promising approach.

  [...]
> Others said that tuple-forming comma binds much more loosely than
> anything else, which suggests that this is not the case.

Not more loosely than =, for example, or statement delimiters such
as semicolons or newlines: e.g.,
    a, b = b, a
    a, b; print 3
are both fine.
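
The grouping in those two lines can be inspected with the standard `ast` module. This is a sketch under the assumption of a present-day Python 3, so `print 3` is written as `print(3)`:

```python
import ast

swap = ast.parse("a, b = b, a").body
two = ast.parse("a, b; print(3)").body

# One assignment whose target and value are both tuples:
# the commas were grouped before the '=' was applied.
assign = swap[0]
print(type(assign).__name__)              # Assign
print(type(assign.targets[0]).__name__)   # Tuple
print(type(assign.value).__name__)        # Tuple

# Two statements: the tuple 'a, b' ends at the semicolon.
print(len(two))                           # 2
print(type(two[0].value).__name__)        # Tuple
```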

In the grammar, precedence is implicit in the nesting of
productions, e.g., 'not' has a higher precedence than 'and' just
because not_test occurs in expansions of and_test and not
(directly) vice versa.  Likewise, commas making expression lists
and whatnot occur in the expansions of simple_stmt.
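
The effect of that nesting shows up in the parse tree. A small sketch, again with the standard `ast` module and assuming Python 3:

```python
import ast

# 'not a and b' groups as '(not a) and b': the 'not' node sits
# inside the 'and' node, mirroring not_test inside and_test.
tree = ast.parse("not a and b", mode="eval").body

print(type(tree).__name__, type(tree.op).__name__)   # BoolOp And
print(type(tree.values[0]).__name__)                 # UnaryOp
print(type(tree.values[0].op).__name__)              # Not
print(type(tree.values[1]).__name__)                 # Name
```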

> (Would 'x, 3' be a statement anyway?)

Sure.  An expression_stmt, in fact.  (As you noted in a previous
post, any expression can serve as a statement.)
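
A short check of this, assuming Python 3. The `ast` module reports an expression statement as an `Expr` node, and the binding of `x` below is made up for the example:

```python
import ast

stmt = ast.parse("x, 3").body[0]
print(type(stmt).__name__)         # Expr
print(type(stmt.value).__name__)   # Tuple

# It also runs as a statement; the tuple is built and discarded.
exec("x, 3", {"x": 1})
```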

  [...]
> If you are right, then this is a serious objection and a good enough
> reason to reject the proposal.

I'm pretty sure I'm right on this point (that simply allowing
simple_stmt in the lambda body would cause a backwards-
incompatible parse of "lambda x: x, 3"), though it would be nice
for an actual expert to weigh in.
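
For reference, the existing parse that would have to be preserved can be inspected like this (a sketch with the standard `ast` module, assuming Python 3):

```python
import ast

tree = ast.parse("lambda x: x, 3", mode="eval").body

# The whole expression is a tuple; the lambda is only its first element,
# and the lambda body is just the name 'x'.
print(type(tree).__name__, len(tree.elts))     # Tuple 2
print(type(tree.elts[0]).__name__)             # Lambda
print(type(tree.elts[0].body).__name__)        # Name
```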

Whether it's reason enough to reject the proposal...  well, as the
proposal stands, sure.  A variant might work, though.

> However - I have been using the grammar at
> <http://python.org/doc/current/ref/grammar.txt>, which seems different
> to the file you quote (presumably from the Python source code).  

Right, Grammar/Grammar is part of the source.  (That's one reason
I like it; the implementation is always right!)

> [...] I
> think they are different enough to cause confusion, in particular, in
> grammar.txt a simple_stmt is not a list of statements and cannot
> contain ';'.  Could we continue using that file as reference, unless
> you believe that it is incorrect wrt the current Python
> implementation, in which case I will switch to Grammar/Grammar instead.

The differences are minor afaik, but, yes, occasionally confusing.
The business of semicolons, for example, is treated verbally in
the Language Reference rather than in the BNF: "Several simple
statements may occur on a single line separated by semicolons."
What the Language Reference calls a simple_stmt, Grammar/Grammar
calls a small_stmt; the Language Reference seems not to give a
name to the semicolon-delimited list of simple statements.  (None
of this affects the point about commas above.)

I doubt there's a substantive difference between the two.  I
appealed to Grammar/Grammar just because I found what it said
about lambda to be slightly clearer than what the Language
Reference said.  I'll try to stick to the Language Reference from
here on in.

  [...]
> The point I would like to explore is how significant and how
> problematic these changes really are.  [...]

Quite.  I should admit here that my belief that there will be
significant problems is just an intuition.

  [...]
> My feeling is that as long as no existing programs change meaning or
> become invalid [...] then the change is worth considering. [...]

Absolutely.  I hope my criticisms here haven't given the contrary
impression; by all means, if you can get it working, with no
change to the meaning of presently correct programs, more power to
you -- write a PEP, get it discussed, maybe get it folded in.
It would be quite an achievement.

  [...]
> To do that I would need to understand a bit more about how a BNF
> grammar is interpreted; how the binding tightness of operators is
> decided, whether parsing at each level is greedy, and so on.

If you don't find anything adequate online, you might want to pick
up a copy of _Compiler Design in C_, by ... Allen Holub, if memory
serves.  I quite enjoyed it.

  [...]
-- 
Steven Taschuk    staschuk at telusplanet.net
"Tomorrow never happens."  -- Janis Joplin