Where regexs listed for Python language's tokenizer/lexer?
robert.kern at gmail.com
Sun Sep 13 01:07:05 CEST 2009
Dennis Lee Bieber wrote:
> On Fri, 11 Sep 2009 23:10:39 -0700 (PDT), Chris Seberino
> <cseberino at gmail.com> declaimed the following in
>> Where regexs listed for Python language's tokenizer/lexer?
>> If I'm not mistaken, the grammar is not sufficient to specify the
>> language; you also need to specify the regexs that define the tokens
>> right?..where is that?
> Pardon... I've been out of the "market", but I don't recall EVER
> seeing a "regex" used in a textbook for compiler/interpreter design.
> BNF (or Pascal's bubble diagram equivalent) has always been used to
> define the syntactical components in those books in my possession, and
> parsers (tokenizers) were written using those implied algorithms (if the
> first character is numeric or "." it starts a number, otherwise treat it
> as an identifier, etc.),
In actual implementations of lexers and the lexical analysis components of
parsers, regexes are fairly common. For example, from ply:
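
To illustrate the one-regex-per-token style, here is a minimal sketch written
with only the standard library's re module. It is not ply's API, and it is not
how CPython's own tokenizer is written; the token names, the patterns, and the
tokenize() helper are all assumptions made up for this example.

```python
import re

# Hypothetical token set for a tiny arithmetic language. Each token is
# defined by a single regex, in the spirit of ply's t_NAME = r'...' rules.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d*)?|\.\d+"),      # starts with a digit or "."
    ("NAME",   r"[A-Za-z_][A-Za-z0-9_]*"),   # identifier
    ("PLUS",   r"\+"),
    ("MINUS",  r"-"),
    ("TIMES",  r"\*"),
    ("DIVIDE", r"/"),
    ("EQUALS", r"="),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("SKIP",   r"[ \t]+"),                   # whitespace, discarded below
]

# Combine the rules into one alternation of named groups; the name of the
# group that matched tells us the token type.
MASTER_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(text):
    """Yield (token_type, lexeme) pairs, raising on unrecognised input."""
    pos = 0
    while pos < len(text):
        match = MASTER_RE.match(text, pos)
        if match is None:
            raise SyntaxError(
                f"illegal character {text[pos]!r} at position {pos}"
            )
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield match.lastgroup, match.group()
```

Calling list(tokenize("x = 3.5 + .25")) yields NAME, EQUALS, NUMBER, PLUS and
NUMBER tokens. The NUMBER pattern encodes the same "starts with a digit or a
dot" rule that a hand-written lexer would test character by character.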
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco