[Python-Dev] Re: Automatic flex interface for Python?

Tim Peters tim.one@comcast.net
Thu, 22 Aug 2002 23:17:08 -0400


[Greg Ewing]
> Not necessarily! Plex manages to do it without any
> of that.
>
> The trick is to leave all the characters in the input
> buffer and just *count* how many characters make up
> the next token. Once you've decided where the token
> ends, one slice gives it to you.

Plex is very nice!  It doesn't pass my "convenient and fast" test only because
the DFA at the end still runs at Python speed, and one character at a time
is still mounds slower than it could be in C.  Hmm.  But you can also
generate pretty reasonable C code from Python source now too!  You're going
to solve this yet, Greg.
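
For the archives, a minimal sketch of the count-then-slice trick Greg describes.  This is not Plex's code: the scan_tokens name and the three token kinds are made up for illustration, and a hand-written loop stands in for a generated DFA.  It does show both halves of the story: each token costs exactly one slice, and the inner loops still walk the buffer one character at a time at Python speed.

```python
def scan_tokens(buf):
    """Yield (kind, text) pairs; each token is produced by a single slice.

    Hypothetical helper for illustration only, not part of Plex.
    """
    pos, n = 0, len(buf)
    while pos < n:
        ch = buf[pos]
        if ch.isspace():
            pos += 1          # skip whitespace without building anything
            continue
        end = pos + 1         # 'end' only counts characters; nothing is copied
        if ch.isdigit():
            kind = "NUMBER"
            while end < n and buf[end].isdigit():
                end += 1
        elif ch.isalpha() or ch == "_":
            kind = "NAME"
            while end < n and (buf[end].isalnum() or buf[end] == "_"):
                end += 1
        else:
            kind = "OP"       # any other single character
        yield kind, buf[pos:end]   # the one slice per token
        pos = end
```

Calling list(scan_tokens("x1 = 42+y")) walks the buffer once and never concatenates characters into a partial token.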

Note that mxTextTools also computes slice indices for "tagging", rather than
building up new string objects.  Heck, that's also why Guido (from the start)
gave the regexp and string match+search gimmicks optional start-index and
end-index arguments too, and why one of the "where did this group match?"
flavors returns slice indices.  I think Eric has spent too much time
debugging C lately <wink>.
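
A rough sketch of the same idea using the re module as it stands in current Python: a compiled pattern's match() takes optional pos and endpos arguments, and a match object's span() returns slice indices.  The token pattern and the token_spans helper are assumptions invented for this example, not anything from Plex or mxTextTools.

```python
import re

# Hypothetical token grammar, for illustration only.
TOKEN = re.compile(r"(?P<NUMBER>\d+)|(?P<NAME>[A-Za-z_]\w*)|(?P<OP>[^\s\w])")
WS = re.compile(r"\s*")

def token_spans(buf, start=0, end=None):
    """Return (kind, start, end) triples without creating any substrings."""
    if end is None:
        end = len(buf)
    pos = start
    spans = []
    while True:
        # pos/endpos let the matcher work in place on the original buffer.
        pos = WS.match(buf, pos, end).end()
        if pos >= end:
            break
        m = TOKEN.match(buf, pos, end)
        if m is None:
            raise ValueError("no token at index %d" % pos)
        tok_start, tok_end = m.span()   # slice indices, not a new string
        spans.append((m.lastgroup, tok_start, tok_end))
        pos = tok_end
    return spans
```

The caller slices buf[s:e] only for the tokens it actually wants as strings, and the matching loop itself runs in C.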