[Python-Dev] one more thing for 2.2?
Thomas Wouters
thomas@xs4all.net
Fri, 13 Jul 2001 00:42:27 +0200
On Thu, Jul 12, 2001 at 05:28:42PM -0400, Guido van Rossum wrote:
> > I have to agree that a nice, clean, powerful parser that can deal
> > better with ambiguities (an LR parser, is that what it's called ?
> > :P)
> Yes, why the :P)?
Because I was guessing, as I know practically naught about parsers and
parsing techniques. For instance, I was not aware that a yacc-based parser
would be LR(x) (for some small value of x). ':P)' was tongue-in-cheek,
followed by a closing parenthesis.
> > is a much better solution, but in some cases, a hack is better than
> > nothing.
>
> Adopting this particular hack means you can never go back. It
> effectively "unreserves" most keywords most of the time, and that
> means that you can no longer use other parser technologies to parse
> Python. E.g. suppose someone has a Yacc-based parser for Python. It
> would be quite a feat to hack the Yacc driver to do the same retrying
> that his hack does. I bet it would also require a major effort to get
> tokenize.py to work correctly again.
[ and ]
> An approach that might work for this is to pick a FEW keywords
> (e.g. those that are not reserved words in C or Java or C++) and add
> those to a FEW places in the grammar. E.g. add a rule
>
>     extended_name: NAME | 'print'  # plus a few others
>
> and then use extended_name instead of NAME in the rules for attribute
> selection and function definition:
>
>     funcdef: 'def' extended_name parameters ':' suite
>     .
>     .
>     .
>     trailer: '(' [arglist] ')' | '[' subscriptlist ']' | '.' extended_name
>
> This would be unambiguous.
This has been discussed before. The main problem with this is that no one's
done it :) I've done a quick test-hack, but ran into so many unguarded
'STR(node)' calls in compile.c that expected a NAME, not an extended_name,
that I gave up. It also wouldn't really alleviate the tokenize.py problem --
if adding a few keywords-as-identifiers is doable, so is adding a lot of
them :) And there's the maintenance problem on the Grammar... when adding a
new keyword, you need to carefully consider where to allow it. However, it's
not like adding a new keyword is done more than once a lustrum ;)
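
To make the tokenize.py point a bit more concrete, here is a rough sketch,
not a patch. It assumes a Python where tokenize.generate_tokens() and
keyword.iskeyword() exist; EXTENDED_NAMES and classify_names() are names made
up for this illustration, and the keyword set is arbitrary. Since the
tokenizer hands back keywords as plain NAME tokens, any tool built on it has
to know which grammar positions (here: right after 'def' or '.') accept a
keyword as an identifier:

```python
import io
import keyword
import tokenize

# Hypothetical: keywords we pretend the grammar accepts via extended_name.
EXTENDED_NAMES = {"class", "import"}

# Token types that carry no meaning for "what came just before this name".
_SKIP = (tokenize.NL, tokenize.NEWLINE, tokenize.COMMENT,
         tokenize.INDENT, tokenize.DEDENT)


def classify_names(source):
    """Return (text, role) for each NAME token in source.

    role is 'keyword' or 'identifier'. A keyword counts as an identifier
    only in an extended_name position, i.e. right after 'def' or '.'.
    """
    result = []
    prev = None  # text of the previous significant token
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME:
            is_kw = keyword.iskeyword(tok.string)
            in_slot = prev in ("def", ".")
            if is_kw and not (in_slot and tok.string in EXTENDED_NAMES):
                role = "keyword"
            else:
                role = "identifier"
            result.append((tok.string, role))
        if tok.type not in _SKIP:
            prev = tok.string
    return result


if __name__ == "__main__":
    print(classify_names("obj.class\n"))
    print(classify_names("class Foo: pass\n"))
```

Every extra entry in EXTENDED_NAMES costs the same here, which is the point:
once the tool has to track context for a few keywords, it can do so for many.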
But I don't have any real need for keywords as identifiers, so I don't mind
if we keep the current limitations.
--
Thomas Wouters <thomas@xs4all.net>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!