[Python-ideas] Hooking between lexer and parser
stefan at bytereef.org
Sat Jun 6 15:36:28 CEST 2015
Neil Girdhar <mistersheik at gmail.com> wrote:
> Along with the grammar, you also give it code that it can execute as it matches each symbol in a rule. In Python, for example, as it matches each argument passed to a function, it would keep track of the counts of *args, **kwargs, keyword arguments, and regular arguments, and then raise a syntax error if it encounters anything out of order. Right now that check is done in validate.c, which is really annoying.
Agreed. For Python 3.4 it was possible to encode these particular semantics into
the grammar itself, but the grammar would no longer have been LL(1).
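To make the check concrete: CPython rejects out-of-order call arguments at compile time (a positional argument after a keyword argument is a SyntaxError). A minimal sketch, using only compile() to probe the validation; the helper name is illustrative:

```python
def is_valid_call(source):
    """Return True if `source` compiles as an expression, False on SyntaxError."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False

# Arguments in the permitted order compile fine:
print(is_valid_call("f(1, x=2, *args, **kw)"))   # True
# A positional argument after a keyword argument is rejected:
print(is_valid_call("f(x=2, 1)"))                # False
```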
If I understood correctly, you wanted to handle lexing and parsing together. How
would the INDENT/DEDENT tokens be generated?
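For reference, Python's own tokenize module shows where these tokens come from: they are synthesized from leading whitespace rather than matched from characters, which is the bookkeeping a combined lexer/parser would have to absorb. A small illustration:

```python
import io
import tokenize

# INDENT/DEDENT are not present in the source text; the tokenizer
# synthesizes them by tracking the indentation stack.
source = "if x:\n    y = 1\n"
tokens = tokenize.generate_tokens(io.StringIO(source).readline)
names = [tokenize.tok_name[tok.type] for tok in tokens]
print(names)
```

Running this prints a token list containing explicit INDENT and DEDENT entries around the body of the if block.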
For my private ast generator, I did the opposite: I wanted to formalize the token
preprocessing step, so I have:
lexer -> parser1 (generates INDENT/DEDENT) -> parser2 (generates the ast directly)
It isn't slower than what is in Python right now, and you can hook into the token
stream at any point.
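A hypothetical sketch of such a hook, using the stdlib tokenize module as the first stage; the function names are illustrative, not from the actual generator:

```python
import io
import tokenize

def raw_tokens(source):
    """Stage 1: the plain lexer producing a token stream."""
    return tokenize.generate_tokens(io.StringIO(source).readline)

def strip_comments(tokens):
    """Example hook between stages: drop COMMENT and NL tokens
    before the next stage sees them."""
    for tok in tokens:
        if tok.type not in (tokenize.COMMENT, tokenize.NL):
            yield tok

source = "x = 1  # set x\n"
kept = [tokenize.tok_name[t.type] for t in strip_comments(raw_tokens(source))]
print(kept)
```

Because each stage is just a generator over tokens, additional filters (or an INDENT/DEDENT-generating stage) can be composed in the same way.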