[Python-Dev] Re: CML2 compiler slowness

Mon, 12 Mar 2001 16:13:01 -0800

| From esr@snark.thyrsus.com Mon Mar 12 15:14:33 2001
| It's the expression parser I generated with John
| Aycock's SPARK toolkit -- that's taking up an average of 26 seconds
| out of an average 28-second runtime.
|
| While I was at PC9 last week somebody mumbled something about Aycock's
| code being cubic in time.  I should have heard ominous Jaws-style
| theme music at that point, because that damn Earley-algorithm parser
| has just swum up from the deeps and bitten me on the ass.

Eric:

You were partially correctly informed.  The time complexity of Earley's
algorithm is O(n^3) in the worst case, that being the meanest, nastiest,
most ambiguous context-free grammar you could possibly think of.  Unless
you're parsing natural language, this won't happen.  For any unambiguous
grammar, the worst case drops to O(n^2), and for a set of grammars which
loosely coincides with the LR(k) grammars, the complexity drops to O(n).

In other words, it's linear for most programming language grammars.  Now
the overhead for a general parsing algorithm like Earley's is of course
greater than that of a much more specialized algorithm, like LALR(1).

The next version of SPARK uses some of my research work into Earley's
algorithm and improves the speed quite dramatically.  It's not all
ready to go yet, but I can send you my working version which will give
you some idea of how fast it'll be for CML2.  Also, I assume you're
supplying a typestring() method to the parser class?  That speeds things
up as well.

John