Martin von Loewis wrote:
Would it be possible to write a Python syntax checker that doesn't stop processing at the first error it finds, but instead tries to continue as far as possible (much like make -k)?
In "Compilerbau" (German for compiler construction), this is referred to as "Fehlerstabilisierung" (error recovery). I suggest having a look at the dragon book (Aho, Sethi, Ullman).
The common approach is to insert or remove tokens, using some heuristics. In YACC, it is possible to add error productions to the grammar. Whenever an error occurs, the parser assigns all tokens to the "error" non-terminal until it concludes that it can perform a reduce action.
A similar approach might work for the Python Grammar. For each production, you'd define a set of stabilization tokens. If these are encountered, then the rule would be considered complete. Everything is consumed until a stabilization token is found.
For example, all expressions could be stabilized with a keyword. That is, if you encounter a syntax error inside an expression, you ignore all tokens until you see 'print', 'def', 'while', etc.
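To make the skipping idea concrete, here is a minimal sketch on a made-up toy grammar (not Python's real Grammar/Grammar). The grammar, the names check, parse_statement, parse_operand and ParseError, and the particular set of stabilization tokens are all assumptions chosen for illustration.

```python
# Hypothetical toy grammar, NOT the real Python grammar:
#   program   := statement*
#   statement := 'print' operand ('+' operand)*
#   operand   := any alphanumeric token that is not a keyword
SYNC_TOKENS = {"print", "def", "while"}  # assumed stabilization tokens


class ParseError(Exception):
    def __init__(self, pos, msg):
        super().__init__(msg)
        self.pos = pos
        self.msg = msg


def parse_operand(tokens, i):
    """Consume one operand starting at index i; return the next index."""
    if i >= len(tokens) or not tokens[i].isalnum() or tokens[i] in SYNC_TOKENS:
        raise ParseError(i, "expected an operand")
    return i + 1


def parse_statement(tokens, i):
    """Consume one 'print' statement starting at index i."""
    if tokens[i] != "print":
        raise ParseError(i, "expected 'print'")
    i = parse_operand(tokens, i + 1)
    while i < len(tokens) and tokens[i] == "+":
        i = parse_operand(tokens, i + 1)
    return i


def check(tokens):
    """Return (position, message) for every error, not just the first."""
    errors = []
    i = 0
    while i < len(tokens):
        try:
            i = parse_statement(tokens, i)
        except ParseError as exc:
            errors.append((exc.pos, exc.msg))
            # Drop the offending token, then ignore everything up to the
            # next stabilization token and resume parsing from there.
            i = exc.pos + 1
            while i < len(tokens) and tokens[i] not in SYNC_TOKENS:
                i += 1
    return errors


if __name__ == "__main__":
    source = "print a + + b print c print + print d + e"
    for pos, msg in check(source.split()):
        print("token %d: %s" % (pos, msg))
```

Always dropping at least the offending token before scanning for a stabilization token is what guarantees progress; without it, an error raised on a keyword itself would be reported forever.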
In some cases, it may be better to add input rather than remove it. For example, if you get an "inconsistent dedent" error, you could assume that this really was a consistent dedent, or you could assume it was not meant as a dedent at all. Likewise, if you get a single-quote start-of-string with no matching single quote before end-of-line, you should just assume there was one.
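The missing-quote case could be sketched roughly as follows. This is a simplified illustration, not a real tokenizer: the function name close_open_string is made up, it looks at one physical line at a time, and it assumes the line is not part of a multi-line triple-quoted string.

```python
def close_open_string(line):
    """Return (repaired_line, message); message is None if nothing was added.

    Assumption: 'line' is a single physical line without its newline and
    is not inside a multi-line triple-quoted string.
    """
    quote = None  # quote character of the string we are inside, if any
    i = 0
    while i < len(line):
        ch = line[i]
        if quote is None:
            if ch == "#":
                break  # the rest of the line is a comment
            if ch in "'\"":
                quote = ch  # a string starts here
        elif ch == "\\":
            i += 1  # skip the escaped character
        elif ch == quote:
            quote = None  # the string ends here
        i += 1
    if quote is None:
        return line, None
    # Recovery by insertion: pretend the closing quote was there.
    return line + quote, "unterminated string, assumed closing " + quote


if __name__ == "__main__":
    print(close_open_string("x = 'abc"))
    print(close_open_string("x = 'abc'"))
```

The checker would report the message and then hand the repaired line to the rest of the pipeline, so that one forgotten quote does not hide every later error.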
Adding error productions to ignore input until stabilization may be feasible on top of the existing parser. Adding tokens in the right place is probably harder; I'd personally go for a pure Python solution that operates on Grammar/Grammar.
I think I'd prefer a Python solution too -- perhaps I could start out with tokenize.py and muddle along that way. pylint from Aaron Watters should also provide some inspiration.
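As a very crude pure-Python starting point, and a different technique from the grammar-based recovery discussed above, one could lean on the built-in compile() and retry after neutralizing each offending line. The function name check_all is made up, and the sketch assumes the error really sits on the line the compiler reports, which is often not true for unbalanced brackets or a missing colon.

```python
def check_all(source, filename="<input>"):
    """Collect (lineno, message) for as many syntax errors as possible."""
    lines = source.splitlines()
    errors = []
    seen = set()  # line numbers already neutralized
    while True:
        try:
            compile("\n".join(lines) + "\n", filename, "exec")
            return errors
        except SyntaxError as exc:
            lineno = exc.lineno
            errors.append((lineno, exc.msg))
            if lineno is None or lineno in seen or not 1 <= lineno <= len(lines):
                return errors  # cannot make further progress
            seen.add(lineno)
            # Replace the offending line by 'pass' at the same indentation,
            # so that the surrounding block structure is preserved.
            line = lines[lineno - 1]
            indent = line[:len(line) - len(line.lstrip())]
            lines[lineno - 1] = indent + "pass"


if __name__ == "__main__":
    sample = "a = 1\nb = = 2\nc = 3\nd = 4 +\ne = 5\n"
    for lineno, msg in check_all(sample):
        print("line %s: %s" % (lineno, msg))
```

This recompiles the whole source once per error and can produce follow-on errors when the neutralized line opened a block, so it is only a baseline to compare a real recovering parser against.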