[soc2008-general] Proposal for a PEG(parsing expression grammar) parser generator for Python

Robert Bradshaw robertwb at math.washington.edu
Tue Mar 18 13:55:28 CET 2008


If you're interested in processing Python code you might want to  
consider writing up a project for http://www.cython.org/ . We already  
have a parser though, and we are not looking to completely replace  
it, but there are still lots of interesting project ideas.

I can't speak for everyone, though; perhaps there are other people
who would be interested in an actual PEG for Python.

- Robert


On Mar 18, 2008, at 5:23 AM, Chiyuan Zhang wrote:

> Hello,
>
> I'm interested in participating in GSoC 2008. I'm a student from
> Zhejiang University of China. I'm majoring in Computer Science and
> Technology. I'm taking a compilers course this term, in which we use
> the classical LALR (look-ahead, left-to-right parse, rightmost
> derivation)[1] approach. But I've heard of another way of parsing:
> parsing expression grammars, or PEGs[2].
>
> Parsing expression grammars look similar to regular expressions or
> context-free grammars (CFG) in Backus-Naur form (BNF) notation, but
> have a different interpretation: the choice operator always takes the
> first alternative that matches. Unlike CFGs, PEGs cannot be ambiguous;
> if a string parses, it has exactly one valid parse tree. This makes
> PEGs well suited to parsing computer languages, but not natural
> languages.
>
> There's a PEG parser generator for Ruby named Treetop[3]. It takes a
> neat DSL approach. Here's part of an example taken from the Treetop
> homepage:
>
>     grammar Arithmetic
>       rule additive
>         multitive '+' additive {
>           def value
>             multitive.value + additive.value
>           end
>         }
>         /
>         multitive
>       end
>
>       # other rules below ...
>     end
>
> But there seems to be no Python tool for PEGs (except PyPy's rlib
> parsing[4], a packrat parser generator). So I'd like to implement a
> Treetop-like PEG parser generator for Python. I'd like this to be a
> project of GSoC for the Python Software Foundation. Is there anyone
> interested in being my mentor for this project?
>
> As for myself, I have experience working with open source people. I
> have reported bugs and provided patches to open source communities. I
> also have some projects of my own. Here are two examples:
>
>  * RMMSeg[5]: An implementation of the MMSEG maximum-matching Chinese
>    word segmentation algorithm for Ruby.
>
>  * YASnippet[6]: Yet another snippet extension for Emacs. It is a
>    (much better) replacement for smart-snippet[7] (also my work). It
>    provides a simple but powerful template facility like the one
>    present in TextMate. If you are an Emacser, you should check it
>    out! :D
>
> I have read your expectations for Google Summer of Code students on
> the Python wiki. I think I satisfy the expectations, except that it is
> sometimes difficult for me to use IRC, mainly because of the time zone
> difference.
>
> So, would you consider this proposal? If yes, I'd be very happy. If
> no, I'm also interested in applying to some other projects (of the PSF
> or other mentoring organizations).
>
> --------------
> References:
>  [1] LALR on Wikipedia: http://en.wikipedia.org/wiki/LALR_parser
>  [2] PEG on Wikipedia:
>      http://en.wikipedia.org/wiki/Parsing_expression_grammar
>  [3] Treetop project homepage: http://treetop.rubyforge.org/
>  [4] PyPy rlib parsing document:
>      http://codespeak.net/pypy/dist/pypy/doc/rlib.html#parsing
>  [5] RMMSeg project homepage: http://rmmseg.rubyforge.org/
>  [6] YASnippet project homepage: http://yasnippet.googlecode.com/
>  [7] smart-snippet project homepage:
>      http://smart-snippet.googlecode.com/
>
> --------------
> Other links:
>  * My Blog (mainly Chinese): http://pluskid.lifegoo.com/
>  * My Email address: pluskid at gmail.com
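
For readers unfamiliar with PEG ordered choice, the quoted Treetop rule
"multitive '+' additive / multitive" can be approximated by hand in plain
Python. This is only a rough sketch, not Treetop's API and not the proposed
generator: the function names and the (value, position) return convention
are illustrative assumptions, and since the original example elides the
multitive rule, a stand-in that matches a run of decimal digits is used.

```python
# Hand-written sketch of PEG ordered choice (not generated code).
# Each rule takes (text, pos) and returns (value, new_pos) on success,
# or None on failure.

DIGITS = "0123456789"


def multitive(text, pos):
    # Hypothetical stand-in: the quoted example does not show this rule.
    # Here it simply matches one or more decimal digits.
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    if end == pos:
        return None
    return int(text[pos:end]), end


def additive(text, pos):
    # First alternative: multitive '+' additive
    left = multitive(text, pos)
    if left is not None:
        left_value, after_left = left
        if text.startswith("+", after_left):
            right = additive(text, after_left + 1)
            if right is not None:
                right_value, after_right = right
                return left_value + right_value, after_right
    # Ordered choice: the second alternative is tried only after the
    # first one has failed, restarting from the same position.
    return multitive(text, pos)


def parse(text):
    # Require the start rule to consume the whole input.
    result = additive(text, 0)
    if result is None or result[1] != len(text):
        raise ValueError("no parse: %r" % (text,))
    return result[0]
```

Calling parse("1+2+3") evaluates the sum while parsing, much as the value
method does in the Treetop rule. Because alternatives are tried strictly in
order, a given input can only ever be parsed one way, which is the
unambiguity property described in the quoted message.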


