Is pyparsing really a recursive descent parser?
Just Another Victim of the Ambient Morality
ihatespam at hotmail.com
Sun Nov 4 21:05:10 EST 2007
"Kay Schluehr" <kay.schluehr at gmx.net> wrote in message
news:1194223837.104424.242360 at o3g2000hsb.googlegroups.com...
> On Nov 4, 10:44 pm, "Just Another Victim of the Ambient Morality"
> <ihates... at hotmail.com>
>
>> I believe there is a cure and it's called recursive descent parsing.
>> It's slow, obviously, but it's correct and, sometimes (arguably, often),
>> that's more important the execution speed.
>
> Recursive decendent parsing is not necessarily slow but from your
> remarks above I infer you want a general RD parser with backtracking:
> when one rule doesn't match, try another one to derive the current
> symbol in the input stream.
I think I've just discovered a major hurdle in my understand of the
problem.
You keep saying "with backtracking." Why? Isn't "backtracking"
inherent in recursion? So, why can't these alleged "recursive descent
parsers" find valid parsings? How are they not already backtracking? What
was the point of being recursive if not to take advantage of the inherent
backtracking in it?
Obviously, these parsers aren't recursing through what I think they
should be recursing. The question is "why not?"
Correct me if I'm wrong but I'm beginning to think that pyparsing
doesn't typically use recursion, at all. It only employs it if you create
one, using the Forward class. Otherwise, it does everything iteratively,
hence the lack of "backtracking."
> I'm not sure one needs to start again with a naive approach just to
> avoid any parser theory. For a user of a parser it is quite important
> whether she has to wait 50 seconds for a parse to run or 50
> milliseconds. I don't like to compromise speed for implementation
> simplicity here.
This attitude is all too prevalent among computer professionals... Of
course it's a useful thing to shield users from the intricacies of parser
theory! Just as much as it is useful to shield drivers from needing
automotive engineering or software users from programing. How many people
have come to this newsgroup asking about anomalous pyparsing behaviour,
despite their grammars being mathematically correct.
Think of it this way. You can force all the clients of pyparsing to
duplicate work on figuring out how to massage pyparsing to their grammars,
or you can do the work of getting pyparsing to solve people's problems,
once. That's what a library is supposed to do...
Finally, I can't believe you complain about potential speed problems.
First, depending on the size of the string, it's likely to be the difference
between 2ms and 200ms. Secondly, if speed were an issue, you wouldn't go
with a recursive descent parser. You'd go with LALR or the many other
parsing techniques available. Recursive descent parsing is for those
situations where you need correctness, regardless of execution time. These
situations happen...
I've said this before, albeit for a different language, but it applies
to Python just as well. I don't use Python to write fast code, I use it to
write code fast.
If _you_ "don't like to compromise speed for implementation simplicity"
then you have a plethora choices available to you. What about the guy who
needs to parse correctly and is unconcerned about speed?
More information about the Python-list
mailing list