Newbie design problem

MartinRinehart at gmail.com
Fri Dec 14 11:02:05 EST 2007


Jonathan Gardner said:

> Well, if using something like PLY ( http://www.dabeaz.com/ply/ ) is
> considered more Pythonic than writing your own parser and lexer...

Lex is very crude. I've found that it takes about half a day to
organize your token definitions, which you have to do either way, and
another half day to write a tokenizer by hand. So what does that second
half-day's work buy you?

My hand-written tokenizer returns everything (whitespace tokens,
comment tokens), while Lex leaves these out. With a full token set you
can use the tokenizer to color-highlight text in an editor, to emit an
HTML version of the source, to process doc comments, and so on.
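
To make that concrete, here is a minimal sketch of a tokenizer that
keeps everything. The token set, the names tokenize_all and to_html,
and the "#" comment syntax are all made up for illustration; they are
not Decaf's definitions. It returns plain (kind, text) pairs only to
keep the sketch short.

```python
import re
from html import escape

# Hypothetical token definitions, for illustration only.
# Order matters: earlier patterns win when several could match.
TOKEN_SPEC = [
    ("COMMENT", r"#[^\n]*"),
    ("WHITESPACE", r"\s+"),
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("NAME", r"[A-Za-z_]\w*"),
    ("STRING", r'"[^"\n]*"'),
    ("OP", r"[-+*/=<>!(){}\[\],.:;]"),
    ("UNKNOWN", r"."),  # catch-all so no character is ever dropped
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize_all(source):
    """Yield (kind, text) pairs that together cover every character."""
    for match in MASTER.finditer(source):
        yield match.lastgroup, match.group()


def to_html(source):
    """Wrap each token, whitespace and comments included, in a span."""
    return "".join(
        f'<span class="{kind.lower()}">{escape(text)}</span>'
        for kind, text in tokenize_all(source)
    )
```

Because nothing is thrown away, joining the token texts gives back the
original source exactly, which is what makes highlighting and HTML
output straightforward.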

Python sports a tokenizer module (tokenize,
http://docs.python.org/lib/module-tokenize.html), but it's
Python-specific. I'm working on a language for beginners, defined at
http://www.MartinRinehart.com/posters/decaf.html (an 11"x17"
poster-like display). Decaf, designed before I'd even looked at
Python, is surprisingly Pythonic.

But not totally Pythonic. I want an array of Token objects, not a list
of tuples, for example.
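
As a rough sketch of the difference, the following wraps the tuples
from Python's own tokenize module in Token objects. The Token class and
the python_tokens helper are hypothetical names of mine, and the code
assumes a current Python 3, where tokenize.generate_tokens yields named
tuples with type, string and start fields.

```python
import io
import tokenize
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """Hypothetical token object; field names are illustrative."""
    kind: str    # symbolic name such as "NAME" or "COMMENT"
    text: str    # exact source text of the token
    line: int    # 1-based line number
    column: int  # 0-based column offset

    def is_comment(self):
        return self.kind == "COMMENT"


def python_tokens(source):
    """Return a list of Token objects built from tokenize's tuples."""
    readline = io.StringIO(source).readline
    return [
        Token(tokenize.tok_name[tok.type], tok.string, tok.start[0], tok.start[1])
        for tok in tokenize.generate_tokens(readline)
    ]
```

With objects, callers write token.kind or token.is_comment() instead of
indexing into a tuple. Note that tokenize reports comments but does not
emit separate whitespace tokens, so it is still not a full token set in
the sense described above.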


