I'm in favor of PLY going into the stdlib, with the caveat that there are some things about it that should probably be cleaned up and modernized. For instance, the way it writes the cached parsing tables needs to be reworked. I still think putting the LALR(1) generator code into a common library usable by both PLY and RPLY would be a useful thing to do. That code is really hairy and non-trivial to understand without something like the Dragon book nearby (and even then it's not easy).

So, if I were to make any kind of proposal, it would be this: make a standard library module for just the LALR(1) generator code. If the PLY interface is needed to add pycparser or cffi to the standard library, that can be added too, but as a separate module that uses the more general LALR(1) library.


On Jul 13, 2013, at 8:12 AM, Brett Cannon wrote:

On Sat, Jul 13, 2013 at 4:24 AM, Michael Foord <fuzzyman@voidspace.org.uk> wrote:

On 13 Jul 2013, at 07:41, Terry Reedy <tjreedy@udel.edu> wrote:

> On 7/13/2013 12:10 AM, Eric Snow wrote:
>> On Feb 27, 2013 4:31 AM, "Michael Foord" <fuzzyman@voidspace.org.uk> wrote:
>> > +1 PLY is capable and well tried-and-tested. We used it in Resolver
>> One to implement a pretty large grammar and it is (in my opinion) best
>> of breed in the Python parser generator world. Being stable and widely
>> used, with an "available maintainer", makes it an ideal candidate for
>> standard library inclusion.
>> Is this still on the table?
> Who is the maintainer and what is his opinion?

The maintainer is David Beazley and as far as I recall he has not expressed an opinion on this particular question. It would obviously need his agreement (and maintenance commitment) if it is to fly.

Just because we have now had two conflicting replies on this: David is down with PLY being added, but Alex Gaynor was working on a cleanup called RPLY for RPython. Basically David said the two of them should work together to clean up PLY, and then it should be good to propose for the stdlib (e.g. there are some backwards-compatibility hacks which should be removed).

Below is David's original email on the topic from Feb 27:


Regarding the inclusion of PLY or some subcomponent of it in the standard library, it's not an entirely crazy idea in my opinion. LALR(1) parsers have been around for a long time, are generally known to anyone who's used yacc/bison, and would be useful outside the context of cffi or pycparser. PLY has also been around for about 12 years and is what I would call stable. It gets an update about every year or two, but that's about it. PLY is also relatively small--just two files and about 4300 lines of code (much of which could probably be scaled down a bit).

The only downside to including PLY might be the fact that there are very few people walking around who've actually had to *implement* an LALR(1) parser generator. Some of the code for that is extremely hairy and mathematical. At this time, I don't think there are any bugs in it, but it's not the sort of thing that one wants to wander into casually. Also, there are some horrible hacks in PLY that I'd really like to get rid of, but am currently stuck with due to backwards compatibility issues.

Alex Gaynor has been working on a PLY variant (RPLY) that is geared toward RPython and has a slightly different programming interface. I'd say if we were to go down this route, he and I should work together to put together some kind of more general "parsing.lalr" package (or similar) that cleans up the generator code and makes it more suitable as a library for building different kinds of parsing tools on top of.