[Python-Dev] Python syntax checker ?

James C. Ahlstrom jim@interet.com
Mon, 25 Sep 2000 09:55:56 -0400

Martin von Loewis wrote:
> > Would it be possible to write a Python syntax checker that doesn't
> > stop processing at the first error it finds but instead tries to
> > continue as far as possible (much like make -k) ?
> The common approach is to insert or remove tokens, using some
> heuristics. In YACC, it is possible to add error productions to the
> grammar. Whenever an error occurs, the parser assigns all tokens to
> the "error" non-terminal until it concludes that it can perform a
> reduce action.
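The same idea can be sketched in plain Python without touching the grammar at all: compile, and on a SyntaxError blank out the offending line and try again, collecting one diagnostic per pass. This is only a crude stand-in for real token-level recovery, and the function name and blanking heuristic below are mine, not anything in CPython:

```python
def check_syntax(source, max_errors=20):
    """Collect multiple SyntaxErrors from one source string.

    Crude recovery heuristic: when compile() fails, record the
    error, blank the offending line, and re-compile.  Bounded by
    max_errors so a stubborn error cannot loop forever.
    """
    lines = source.splitlines()
    errors = []
    for _ in range(max_errors):
        try:
            compile("\n".join(lines), "<input>", "exec")
            break                      # whole file now compiles
        except SyntaxError as e:
            errors.append((e.lineno, e.msg))
            if e.lineno is None or not (1 <= e.lineno <= len(lines)):
                break                  # cannot localize the error
            lines[e.lineno - 1] = ""   # "remove tokens" and retry
    return errors
```

Blanking a whole line is obviously cruder than inserting or removing individual tokens, but it demonstrates the make -k style of pressing on past the first error.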

The following is based on trying (a great learning experience)
to write a better Python lint.

There are IMHO two problems with the current Python
grammar file.  First, it is not possible to express operator
precedence, so deliberate shift/reduce conflicts are used
instead; that makes the parse tree complicated and
non-intuitive.  Second, there is no provision for error
productions.  YACC has both of these as built-in features.
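To see what a precedence table buys you, here is a precedence-climbing parser that drives the whole binary-operator hierarchy from one table, much as yacc's %left declarations do, and produces flat, intuitive trees instead of one grammar rule (and one tree level) per precedence tier. The PREC table and the tuple tree shape are illustrative choices of mine:

```python
# One table replaces a cascade of grammar rules (cf. yacc's %left).
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}

def parse(tokens):
    """Parse a flat token list into nested (op, lhs, rhs) tuples
    using precedence climbing.  Operands are taken verbatim."""
    pos = 0

    def expr(min_prec):
        nonlocal pos
        lhs = tokens[pos]              # operand
        pos += 1
        while (pos < len(tokens) and tokens[pos] in PREC
               and PREC[tokens[pos]] >= min_prec):
            op = tokens[pos]
            pos += 1
            rhs = expr(PREC[op] + 1)   # +1 gives left associativity
            lhs = (op, lhs, rhs)
        return lhs

    return expr(1)
```

The tree comes out the way a reader expects: "1 + 2 * 3" parses as ("+", "1", ("*", "2", "3")), with no artificial term/factor nesting to unwind.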

I also found speed problems with tokenize.py.  AFAIK,
it only exists because tokenizer.c does not provide
comments as tokens, but eats them instead.  We could
modify tokenizer.c and then make tokenize.py the
interface to the fast C tokenizer, which would also
eliminate the problem of keeping the two in sync.
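For what it's worth, the pure-Python tokenizer does hand comments through as ordinary tokens, which is exactly the capability tokenizer.c lacks. A quick check, using the generator API of later tokenize.py versions (so the exact call differs from 1.5.2's tokeneater interface):

```python
import io
import tokenize

src = "x = 1  # answer\n"

# generate_tokens() yields 5-tuples/TokenInfo records; COMMENT
# appears as its own token rather than being swallowed.
toks = [(tokenize.tok_name[t.type], t.string)
        for t in tokenize.generate_tokens(io.StringIO(src).readline)]
```

Scanning toks shows a ("COMMENT", "# answer") entry alongside the NAME, OP, and NUMBER tokens, so a lint tool built on tokenize.py can inspect comments directly.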

So how about re-writing the Python grammar in YACC in
order to use its more advanced features?  The simple
YACC grammar I wrote for 1.5.2, plus an altered tokenizer.c,
parsed all of Lib/*.py in a couple of seconds, vs. 30
seconds for the first file using Aaron Watters' Python
lint grammar written in Python.