grammar question

Tue Feb 26 08:25:28 EST 2002

On 2/26/02 4:32 AM, in article m3pu2sfv0m.fsf at mira.informatik.hu-berlin.de,
"Martin v. Loewis" <martin at v.loewis.de> wrote:

> Nick Collier <nick at src.uchicago.edu> writes:
> 
>> Anyway, I thought that the Python grammar was LL(1) compliant and
>> yet when I translate the grammar into the Coco/R LL(1) parser
>> generator I get lots of errors like:
> 
> [...]
>> Where X and Y are productions such as arglist, etc.
> 
> My guess is that your translation into Coco/R is incorrect. I cannot
> spot any ambiguity in arglist. Considering
> 
> arglist: (argument ',')* (argument [',']| '*' test [',' '**' test] | '**'
> test)
> 
> it may be that, given "argument "," argument ","', it finds itself
> incapable of determining whether this is the second argument belongs
> to the first subexpression or the second one.

This production cannot be disambiguated top-down with *any* lookahead, for
the reason you suggest. Thus, it doesn't seem Python's grammar is LL(1). In
general, languages with infix operators and complex nested structures like
lists are not LL(k), because a parser can only decide the role of a
subexpression after it sees the operator or terminator that follows it. But
such languages are easily parsed bottom-up (shift-reduce) with bounded
lookahead. That rule (in context) can be disambiguated bottom-up with two
tokens of lookahead: once an 'argument" is parsed and pushed onto the
shift-reduce stack and a "," is found, if the next token is ")" then the top
of the stack is the second occurrence of "argument", otherwise it's the
first.

-- F