[Python-ideas] Hooking between lexer and parser

Guido van Rossum guido at python.org
Sat Jun 6 06:27:14 CEST 2015


On Fri, Jun 5, 2015 at 7:57 PM, Neil Girdhar <mistersheik at gmail.com> wrote:

> I don't see why it makes anything simpler.  Your lexing rules just live
> alongside your parsing rules.  And I also don't see why it has to be faster
> to do the lexing in a separate part of the code.  Wouldn't the parser
> generator realize that some of the rules don't use the stack and so
> they would end up just as fast as any lexer?
>

You're putting a lot of faith in "modern" parsers. I don't know if PLY
qualifies as such, but it certainly is newer than Lex/Yacc, and it unifies
the lexer and parser. However, I don't think it would be much better for a
language the size of Python.
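
For concreteness, here's roughly what that looks like in PLY. This is a
minimal sketch with a toy expression grammar, just to show how the token
rules and the grammar rules live side by side in one module:

    # Toy PLY example: lexer and parser rules defined in the same module.
    import ply.lex as lex
    import ply.yacc as yacc

    # --- token (lexer) rules ---
    tokens = ('NUMBER', 'PLUS', 'TIMES')

    t_PLUS = r'\+'
    t_TIMES = r'\*'
    t_ignore = ' \t'

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        return t

    def t_error(t):
        t.lexer.skip(1)   # skip characters the lexer doesn't recognize

    # --- grammar (parser) rules, in the same file ---
    def p_expr_plus(p):
        'expr : expr PLUS term'
        p[0] = p[1] + p[3]

    def p_expr_term(p):
        'expr : term'
        p[0] = p[1]

    def p_term_times(p):
        'term : term TIMES factor'
        p[0] = p[1] * p[3]

    def p_term_factor(p):
        'term : factor'
        p[0] = p[1]

    def p_factor_number(p):
        'factor : NUMBER'
        p[0] = p[1]

    def p_error(p):
        pass

    lexer = lex.lex()
    parser = yacc.yacc()

    print(parser.parse('2 + 3 * 4'))   # prints 14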

We are using PLY at Dropbox to parse a medium-sized DSL, and while at the
beginning it was convenient to have the entire language definition in one
place, there were a fair number of subtle bugs in the earlier stages of the
project due to the mixing of lexing and parsing. In order to get this right
it seems you actually have to *think* about the lexing and parsing stages
differently, and combining them in one tool doesn't actually help you to
think more clearly.

Also, this approach doesn't really do much for the later stages -- you can
easily construct a parse tree but it's a fairly direct representation of
the grammar rules, and it offers no help in managing a symbol table or
generating code.
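
To make that concrete with the toy sketch above: if each rule's action
just packages its children into a tuple, the tree you get back is a
one-to-one image of the grammar, and symbol tables, scoping, and code
generation are still entirely up to you:

    # Hypothetical tree-building variants of the actions in the sketch
    # above (the node names 'add' and 'mul' are made up); the rest of
    # the module stays the same.
    def p_expr_plus(p):
        'expr : expr PLUS term'
        p[0] = ('add', p[1], p[3])   # node shape mirrors the grammar rule

    def p_term_times(p):
        'term : term TIMES factor'
        p[0] = ('mul', p[1], p[3])

    # parser.parse('2 + 3 * 4') now returns ('add', 2, ('mul', 3, 4)):
    # a direct transcription of the rules, with nothing attached to help
    # with symbols or code generation.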

-- 
--Guido van Rossum (python.org/~guido)

