Recommendation of a parser generator

Thu Aug 14 14:37:26 EDT 2003

Hi Andrew (and others who replied) - thanks for the extensive tips.
However, I ended up using SimpleParse, because (1) I read in the
Charming Python column that SPARK is *very* slow (it uses the Earley
algorithm), and (2) SimpleParse turns out to be not that outdated - the
latest version (2.0, in alpha) was released in 2002.

I've since finished my implementation in SimpleParse and am quite
satisfied with the setup. The only thing I found a bit lacking is the
documentation - the tutorials on their site were very helpful in
getting me up and running, but for the other details I had to peek
through the source.

Hope this also helps others in deciding which parser generator to use
(at least for formula-like texts).

"Andrew Dalke" <adalke at mindspring.com> wrote in message news:<bh1mjl$f6i$1 at slb5.atl.mindspring.net>...
> Fortepianissimo:
> > I'm about to start a new project which will be mostly written in
> > Python. The first task is to parse some formula-like expressions into
> > an internal data structure so they can be evaluated.
> 
> How close is this formula language to Python's?  For other projects
> I've punted the heavy work to Python's own parser, then filled in
> the bits I needed.  For example, suppose you have the expression
> 
>    a.b + c + s.find('d')
> 
> >>> import compiler
> >>> from compiler import visitor
> >>> s = "a.b + c + s.find('d')"
> >>> class GetNames(visitor.ASTVisitor):
> ...    def __init__(self):
> ...        self.names = {}
> ...    def visitName(self, obj):
> ...        self.names[obj.name] = 1
> ...
> 
> >>> a = compiler.parse(s)
> >>> names = compiler.walk(a, GetNames()).names.keys()
> >>> names
>  ['a', 'c', 's']
> >>>
> 
> Then get the values for a, c, and s, put them into a dict, and
> 
> >>> class A:
> ...     b = 5
> ...
> >>> eval(s, {"a": A, "c": 3, "s": ""})
>  7
> >>>
> 
> (Assuming I didn't make any mistakes - it's modified from an earlier
> exchange Alex and I had in c.l.py, titled "classes derived from dict
> and eval" and I didn't test all the changes.)
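
For readers on Python 3: the `compiler` module used in the session above
was removed, and the standard `ast` module covers the same ground. The
following is only a rough sketch of the same idea, not Andrew's code.
The helper name `free_names` is made up for illustration, and the names
are collected into a set rather than a dict.

```python
import ast


class GetNames(ast.NodeVisitor):
    """Collect every bare identifier that appears in an expression."""

    def __init__(self):
        self.names = set()

    def visit_Name(self, node):
        # Called once per Name node, e.g. 'a' in 'a.b' or 's' in 's.find'.
        self.names.add(node.id)


def free_names(expr):
    # Hypothetical helper: parse a single expression and return the
    # identifiers it refers to, sorted for a stable result.
    tree = ast.parse(expr, mode="eval")
    collector = GetNames()
    collector.visit(tree)
    return sorted(collector.names)


class A:
    b = 5


expr = "a.b + c + s.find('d')"
names = free_names(expr)                        # ['a', 'c', 's']
result = eval(expr, {"a": A, "c": 3, "s": ""})  # 5 + 3 + (-1) == 7
```

As in the original session, the collected names tell you which values
to look up before handing the expression and a dict to eval().
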
> 
> Failing that, I've been happy with SPARK as a parser generator,
> but as you read in the paper, it's slow compared to the other parsers
> that were benchmarked.
> 
> > This parser would be run extensively in the future, so speed is a
> > consideration,
> 
> Why is the parser performance the problem?  Most of the time
> is spent evaluating the result, right?  That's post-parsing.
> 
> The only time to worry about parsing performance is if you have a
> lot of different expressions coming in.  Otherwise, just cache the
> results, as Python does with .pyc files.
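
The caching suggestion above could look roughly like this when the
formulas are plain Python expressions. This is a minimal sketch under
that assumption: it uses the built-in compile() in place of a parser
generator, and the names `compile_formula` and `evaluate` are invented
for the example.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def compile_formula(expr):
    # Parse and compile each distinct expression string only once;
    # repeated calls with the same string return the cached code object.
    return compile(expr, "<formula>", "eval")


def evaluate(expr, variables):
    # Hypothetical helper: evaluate a cached formula against a mapping
    # of variable names to values. Emptying __builtins__ only keeps
    # stray built-in names out of the formula; it is not a sandbox.
    return eval(compile_formula(expr), {"__builtins__": {}}, dict(variables))


first = evaluate("x * 2 + y", {"x": 3, "y": 1})    # parsed here
second = evaluate("x * 2 + y", {"x": 10, "y": 0})  # served from the cache
hits = compile_formula.cache_info().hits
```

With this arrangement the parsing cost is paid once per distinct
formula, so it only matters when many different expressions come in.
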
> 
> > I'd appreciate very much some expert suggestions from the group, like
> > on the speed, flexibility, portability, and the future prospect (like
> > to be adopted as the standard etc.).
> 
> I too would like a standard parser generator for Python.  I don't know
> the status of that activity.  As it is, SPARK is small enough that it's
> easy for me to include in my projects.
> 
>                     Andrew
>                     dalke at dalkescientific.com