[Types-sig] Re: [String-SIG] Python parser in Python?

John Aycock aycock@csc.UVic.CA
Mon, 20 Dec 1999 09:33:20 -0800

| From: Paul Prescod <paul@prescod.net>
| Tim Peters wrote:
| > 
| > John Aycock's extremely general (any CF grammar, ambiguous or not) parsing
| > framework comes with a Python grammar.  
| It depends on Python's built-in lexer:
| import tokenize

The tokenize module doesn't interface with the lexer inside Python -- it
does its work using a set of ugly-looking regular expressions.

| Doesn't that remove the possibility for new keywords?

Not at all.  If the new keywords (here I'm assuming reserved words) are
of the same form as identifiers, as would most likely be the case, then
you can easily pick them out after tokenize splits them apart.  That's
what my Python lexer does: piggybacks on tokenize, then flags reserved
words.  (Some people advocate splitting lexical analysis tasks
this way, into a scanner (tokenize) and a screener (postprocessing of

Of course, if you want odd-looking keywords, you could always modify
a private copy of tokenize :-)
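
For the identifier-shaped case above, a scanner/screener split can be
sketched roughly as follows.  This is only an illustration, not the lexer
that ships with the framework: it assumes the Python 3 tokenize API
(generate_tokens), and the keyword set, the "KEYWORD" label, and the
screen() helper are all made up for the example.

```python
import io
import tokenize

# Hypothetical new reserved words, chosen only for illustration.
NEW_KEYWORDS = {"unless", "until"}


def screen(source, keywords=NEW_KEYWORDS):
    """Yield (kind, text) pairs for each token in source.

    tokenize acts as the scanner; this function is the screener that
    relabels NAME tokens whose text is a reserved word.
    """
    readline = io.StringIO(source).readline
    for tok in tokenize.generate_tokens(readline):
        kind = tokenize.tok_name[tok.type]
        if tok.type == tokenize.NAME and tok.string in keywords:
            kind = "KEYWORD"  # assumed label, not a real token type
        yield kind, tok.string


tokens = list(screen("unless x: y = 1\n"))
```

Since tokenize only splits the text and never parses it, the made-up
"unless" comes through as an ordinary NAME, and the screener can flag it
before the tokens reach the parser.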