Where regexs listed for Python language's tokenizer/lexer?
Duncan Booth
duncan.booth at invalid.invalid
Sat Sep 12 17:12:48 EDT 2009
Paul McGuire <ptmcg at austin.rr.com> wrote:
> On Sep 12, 1:10 am, Chris Seberino <cseber... at gmail.com> wrote:
>> Where regexs listed for Python language's tokenizer/lexer?
>>
>> If I'm not mistaken, the grammar is not sufficient to specify the
>> language....
>> you also need to specify the regexs that define the tokens
>> right?..where is that?
>
> I think the OP is asking for the regexs that define the terminals
> referenced in the Python grammar, similar to those found in yacc token
> definitions. He's not implying that there are regexs that implement
> the whole grammar.
>
The OP should read
http://www.python.org/doc/current/reference/lexical_analysis.html
That page gives the BNF for those tokens that can be defined by BNF,
such as identifiers, numbers and strings. The tricky bit is that the INDENT
and DEDENT tokens depend on context: see section 2.1.8.
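
To see the context dependence in practice, the standard library's tokenize
module can be run over a small snippet. This is only a minimal sketch, assuming
Python 3; SOURCE and token_names are hypothetical names chosen for the example,
not anything from the reference manual. It prints the token type names so you
can see where INDENT and DEDENT appear relative to the leading whitespace.

```python
import io
import tokenize

# Hypothetical sample input: one indented block followed by a dedented line.
SOURCE = "if x:\n    y = 1\nz = 2\n"


def token_names(source):
    """Return the token type names the stdlib tokenizer yields for source."""
    readline = io.StringIO(source).readline
    return [tokenize.tok_name[tok.type]
            for tok in tokenize.generate_tokens(readline)]


names = token_names(SOURCE)
print(names)
```

The INDENT shows up only after the NEWLINE that ends the "if x:" line, and the
DEDENT only when the tokenizer sees that "z = 2" starts at a shallower column.
No regex that looks at a single token in isolation could decide that, which is
why the tokenizer has to keep a stack of indentation levels.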