[Tutor] token parser

Kent Johnson kent37 at tds.net
Sun Feb 11 13:54:30 CET 2007

Dj Gilcrease wrote:
> How would I go about writing a fast token parser to parse a string like
> "[4d6.takeHighest(3)+(2d6*3)-5.5]"
> and get a list like
> ['+',
>     ['takeHighest',
>         ['d',
>             4,
>             6
>         ],
>         3
>     ],
>     ['-',
>         ['*',
>             ['d',
>                 2,
>                 6
>             ],
>             3
>         ],
>         5.5
>     ]
> ]
> back? ( I put it all separated and indented like that so it is easier
> to read, it is for me anyways )

If your input is valid Python (which the above is not: 4d6 and 2d6 are
not valid identifiers) then perhaps the compiler.parse() function would
be a good starting point. It generates an abstract syntax tree which you
could perhaps transform into the format you want:

In [13]: import compiler

In [19]: compiler.parse("[d6.takeHighest(3)+(d6*3)-5.5]")

Out[19]: Module(None, Stmt([Discard(List([Sub((Add((CallFunc(Getattr(Name('d6'),
'takeHighest'), [Const(3)], None, None), Mul((Name('d6'), Const(3))))
), Const(5.5)))]))]))
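
The compiler module only exists in Python 2; in Python 3 the standard ast
module does the same job. As a rough sketch of the "transform into the
format you want" step, assuming the input has already been made valid
Python (d6 instead of 4d6, and without the outer brackets), something like
the following walks the tree and builds nested lists. The names to_list and
_BINOPS are made up for this illustration, and only the node types needed
for this one expression are handled:

```python
import ast

# Map ast operator node types to the symbols used in the nested-list format.
_BINOPS = {ast.Add: '+', ast.Sub: '-', ast.Mult: '*', ast.Div: '/'}

def to_list(node):
    """Convert an ast expression node into nested [op, arg, ...] lists."""
    if isinstance(node, ast.Expression):
        return to_list(node.body)
    if isinstance(node, ast.BinOp):
        return [_BINOPS[type(node.op)], to_list(node.left), to_list(node.right)]
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute):
        # x.method(args) becomes ['method', x, args...]
        return ([node.func.attr, to_list(node.func.value)]
                + [to_list(arg) for arg in node.args])
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError("unsupported syntax: %s" % type(node).__name__)

tree = ast.parse("d6.takeHighest(3)+(d6*3)-5.5", mode="eval")
result = to_list(tree)
print(result)
```

Python's own precedence rules group the expression as (a + b) - 5.5, so the
subtraction ends up at the top of the result rather than the addition.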

If this doesn't work for you, then I would look to one of the many 
parser-generator packages available for Python. I don't know which is 
fastest; I have found pyparsing and PLY to be fairly easy to use. 
pyparsing comes with a lot of examples which might help you get started. 
Here are some summaries of the options:

Here is an article that gives some examples:
and the references in the above
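
For a grammar this small, a hand-written parser is also an option. The
sketch below is not pyparsing or PLY; it is a plain recursive-descent parser
using only the re module, shown to give an idea of what any of these tools
has to do. All names in it (tokenize, Parser, parse) are hypothetical, the
grammar is my guess at what the dice notation is meant to allow, and
operators are treated as left-associative, so the nesting differs slightly
from the list in the original question:

```python
import re

TOKEN_RE = re.compile(r"""
    \s*(?:
        (?P<dice>\d+d\d+)          # dice roll such as 4d6
      | (?P<num>\d+\.\d+|\d+)      # float or int literal
      | (?P<name>[A-Za-z_]\w*)     # method name such as takeHighest
      | (?P<op>[-+*/().])          # operators, parentheses, method dot
    )""", re.VERBOSE)

def tokenize(text):
    """Split text into (kind, value) pairs."""
    text = text.strip()
    pos, tokens = 0, []
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise ValueError("unexpected character %r at %d" % (text[pos], pos))
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else (None, None)

    def take(self, value=None):
        kind, text = self.peek()
        if kind is None or (value is not None and text != value):
            raise ValueError("expected %r, got %r" % (value, text))
        self.i += 1
        return kind, text

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (('op', '+'), ('op', '-')):
            op = self.take()[1]
            node = [op, node, self.term()]
        return node

    def term(self):
        # term := postfix (('*' | '/') postfix)*
        node = self.postfix()
        while self.peek() in (('op', '*'), ('op', '/')):
            op = self.take()[1]
            node = [op, node, self.postfix()]
        return node

    def postfix(self):
        # postfix := atom ('.' NAME '(' expr ')')*
        node = self.atom()
        while self.peek() == ('op', '.'):
            self.take('.')
            kind, name = self.take()
            if kind != 'name':
                raise ValueError("expected a method name, got %r" % name)
            self.take('(')
            node = [name, node, self.expr()]
            self.take(')')
        return node

    def atom(self):
        # atom := DICE | NUMBER | '(' expr ')'
        kind, text = self.take()
        if kind == 'dice':
            count, sides = text.split('d')
            return ['d', int(count), int(sides)]
        if kind == 'num':
            return float(text) if '.' in text else int(text)
        if (kind, text) == ('op', '('):
            node = self.expr()
            self.take(')')
            return node
        raise ValueError("unexpected token %r" % text)

def parse(text):
    """Parse a dice expression, with or without the surrounding [ ]."""
    text = text.strip()
    if text.startswith('[') and text.endswith(']'):
        text = text[1:-1]
    parser = Parser(tokenize(text))
    node = parser.expr()
    if parser.peek()[0] is not None:
        raise ValueError("trailing input: %r" % parser.peek()[1])
    return node

result = parse("[4d6.takeHighest(3)+(2d6*3)-5.5]")
print(result)
```

Whether this is fast enough depends on how many strings you need to parse;
I have not timed it against pyparsing or PLY.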

