RE: [Python-Dev] Re: Automatic flex interface for Python?

Aug. 22, 2002


      [Gordon McMillan]
...
mxTextTools lets you (encourages you to?) break all
the rules about lex -> parse. If you can (& want to)
put a good deal of the "parse" stuff into the scanning
rules, you can get a speed advantage. You're also
not constrained by the rules of BNF, if you choose
to see that as an advantage :-).

My one successful use of mxTextTools came after
using SPARK to figure out what I actually needed
in my AST, and realizing that the ambiguities in the
grammar didn't matter in practice, so I could produce
an almost-AST directly.

I don't expect anyone will have much luck writing a fast lexer using
mxTextTools *or* Python's regexp package unless they know quite a bit about
how each works under the covers, and about how fast lexing is accomplished
by DFAs.  If you know both, you can build a DFA by hand and painfully
instruct mxTextTools in the details of its construction, and get a very fast
tokenizer (compared to what's possible with re), regardless of the number of
token classes or the complexity of their definitions.  Writing to
mxTextTools directly is a lot like writing in an assembly language for a
character-matching machine, with all the pains and potential joys that
implies.  If I were Eric, I'd use Flex <wink>.
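
To make the DFA point concrete, here is a minimal sketch of a hand-built, table-driven tokenizer in plain Python. It is not mxTextTools code and it is not tuned for speed; the token classes (NAME, NUMBER, OP), the state names, and the helpers char_class and tokenize are all made up for illustration. What it shows is the shape of the technique: one table lookup per input character, no matter how many token classes there are.

```python
# Illustrative only: a tiny hand-built DFA. The states, token classes,
# and function names here are hypothetical, not from any library.

def char_class(ch):
    """Map a character to the coarse class the DFA switches on."""
    if ch.isalpha() or ch == "_":
        return "alpha"
    if ch.isdigit():
        return "digit"
    if ch in " \t\n":
        return "space"
    return "other"

START = "START"

# TRANSITIONS[state][char_class] -> next state. A missing entry means
# the DFA cannot move, so the current token ends there.
TRANSITIONS = {
    START:    {"alpha": "NAME", "digit": "NUMBER",
               "space": "SPACE", "other": "OP"},
    "NAME":   {"alpha": "NAME", "digit": "NAME"},
    "NUMBER": {"digit": "NUMBER"},
    "SPACE":  {"space": "SPACE"},
    "OP":     {},  # single-character operators only
}

def tokenize(text):
    """Return (token_class, lexeme) pairs, skipping whitespace.

    Every state other than START is accepting in this toy DFA, so the
    longest match is found by running until no transition exists; no
    backtracking is needed.
    """
    tokens = []
    pos = 0
    n = len(text)
    while pos < n:
        state = START
        end = pos
        while end < n:
            nxt = TRANSITIONS[state].get(char_class(text[end]))
            if nxt is None:
                break
            state = nxt
            end += 1
        if state != "SPACE":
            tokens.append((state, text[pos:end]))
        pos = end
    return tokens
```

Under these assumptions, tokenize("x1 = 42+y") yields NAME 'x1', OP '=', NUMBER '42', OP '+', NAME 'y'. A real lexer would also need non-accepting states and backup to the last accepting position, which is the bookkeeping a generator like Flex does for you.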

Tim Peters