lexing nested parenthesis

François Pinard pinard at iro.umontreal.ca
Fri Aug 2 18:13:34 CEST 2002


[Kristian Ovaska]

> [PLY]
> >1) it looks like not supported anymore

> Good open source never dies, though. :)

Someone will come up, one day, with a mix using the best of SPARK and PLY.
If only I had more free time and energy!  I can foresee only the tip of
all those wonderful things I would do. :-) We are all alike, aren't we?! :-)

So I guess PLY will live long, especially through such a sibling.

> > 3) I had to fight it so it accepts a scanner of my choice, yet I do
> > not remember the details as I write.

I remembered that PLY forces all token attributes to be packed into the
single `.value' attribute if the user wants them to survive.  PLY
considers all other token attributes to be reserved for its own use.
How egotistic! :-) PLY also uses a special mandatory end-of-file token.

Of course, PLY parsers rely on these characteristics of the scanner they
use, so if you want to use another scanner, that scanner has to comply with
the above.  Moreover, at parse time, all token attributes besides `.value'
are dropped, so you cannot access them to build syntax trees, for example.
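
A minimal sketch of that packing, in plain Python and without using PLY
itself.  Every name here (RichToken, PlainToken, pack, adapt) is
hypothetical, and "EOF" is only a placeholder for whatever end-of-file
token type the parser actually expects:

```python
# Hypothetical sketch: none of these names come from PLY.
from dataclasses import dataclass


@dataclass
class RichToken:
    """Token as an independent scanner might produce it."""
    type: str
    text: str
    line: int
    column: int


class PlainToken:
    """Token shape assumed for the parser: only .type and .value survive."""

    def __init__(self, type, value):
        self.type = type
        self.value = value


def pack(token):
    """Fold every attribute worth keeping into the single .value slot."""
    return PlainToken(token.type, (token.text, token.line, token.column))


def adapt(tokens, eof_type="EOF"):
    """Yield packed tokens, then one end-of-file token (assumed name)."""
    for token in tokens:
        yield pack(token)
    yield PlainToken(eof_type, None)
```

Grammar actions would then have to unpack the tuple found in `.value'
to recover the text, line and column.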

> But it should have Parser and Scanner base classes which you extend.
> Maybe the author thought it would be easy to use if everything is
> top-level, so to say.

In my `Topy' project, I wanted parsers to be either SPARK parsers or PLY
parsers, depending on the language to parse, all derived from a common
parser base class.  If I remember well, it was a breeze with SPARK, but
I had to fight against PLY just to get a package hierarchy that works.
PLY assumes that your modules are laid out flat in a single directory.

Another thing that bothered me in `Topy' is that you cannot easily have
many parsers derived from the same PLY grammar using different axioms
(start symbols), as the `ytab' file gets rebuilt whenever you change the
axiom, which is a fairly lengthy computation.  (Not to speak of the flurry
of diagnostics about unused rules that results, and the little control you
have over how or where diagnostics are produced.)  SPARK was pretty easy:
I could cache parsers according to their axiom, building each only once.

Another detail that bothered me with PLY is that error positioning is
simplistically reduced to a line number, both in the scanner and the parser.
In the context of source files including one another, this is weak,
especially as I like pinpointing tokens to a precise column.  Scanners
generated through PLEX (which I learned only recently, for a project I
try to push forward in the little free time left to me) are much nicer on
these points.  Still, going from PLEX to PLY means dropping some
positioning information.
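
To make concrete what gets lost, here is a small sketch of a richer
position record.  It is plain Python, not the PLEX or PLY interface, and
the names Position and locate are hypothetical:

```python
# Hypothetical sketch of a position carrying file, line and column.
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    filename: str
    line: int    # 1-based
    column: int  # 1-based


def locate(filename, text, offset):
    """Turn a character offset within text into a Position."""
    line = text.count("\n", 0, offset) + 1
    last_newline = text.rfind("\n", 0, offset)  # -1 when on the first line
    column = offset - last_newline
    return Position(filename, line, column)
```

Keeping only the `line' field of such a record is the reduction
described above: the file name and the column are gone.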

-- 
François Pinard   http://www.iro.umontreal.ca/~pinard



