PEP/GSoC idea: built-in parser generator module for Python?
pmawhorter at gmail.com
Fri Mar 14 19:51:14 CET 2014
First of all, hi everyone, I'm new to this list.
I'm a grad student who's worked on and off with Python on various
projects for 8ish years now. I recently wanted to construct a parser
for another programming language in Python and was disappointed that
Python doesn't have a built-in module for building parsers, which
seems like a common-enough task. There are plenty of different
3rd-party parsing libraries available, specialized in lots of
different ways (see e.g., ). I happened to pick one that seemed
suitable for my needs but turned out not to support the recursive
structures that I needed to parse. Rather than pick a different one, I
just built my own parser generator module, and used that to build my
parser: problem solved.
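To make "recursive structures" concrete, here is a minimal hand-written
sketch of parsing arbitrarily nested input by recursive descent. It is not
the module mentioned above; the function name parse_nested and the
parenthesized input format are assumptions chosen purely for illustration.

```python
def parse_nested(text):
    """Parse a string such as "(a (b c) d)" into nested Python lists.

    Hypothetical example: the input format and names are illustrative only.
    """
    # Pad parentheses with spaces so that split() yields them as tokens.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def parse_expr():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            items = []
            # Each element may itself be a parenthesized group, so recurse.
            while pos < len(tokens) and tokens[pos] != ")":
                items.append(parse_expr())
            if pos >= len(tokens):
                raise ValueError("missing closing parenthesis")
            pos += 1  # consume the ")"
            return items
        if token == ")":
            raise ValueError("unexpected closing parenthesis")
        return token  # a plain atom

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("trailing input after expression")
    return result
```

Calling parse_nested("(a (b c) d)") gives a list whose second element is
itself a list, which is the sort of nesting a non-recursive grammar cannot
express to arbitrary depth.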
It would have been much nicer if there were a fully-featured built-in
parser generator module in Python, however, and the purpose of this
email is to test the waters a bit: is this something that other people
in the Python community would be interested in? I imagine the route to
providing a built-in parser generator module would be to first canvass
the community to figure out what third-party libraries they use, and
then contact the developers of some of the top libraries to see if
they'd be happy integrating as a built-in module. At that point
someone would need to work to integrate the chosen third-party library
as a built-in module (ideally with its developers).
From what I've looked at, PyParsing and PLY seem to be the standout parser
generators for Python, and PyParsing has a somewhat more Pythonic syntax from
what I've seen. One important issue would be speed, though: an
implementation mostly written in C for low-level parsing tasks would
probably be much preferable to one written in pure Python, since a
built-in module should be geared towards efficiency, but I don't
actually know exactly how that would work (I've both extended and
embedded Python with/in C before, but I'm not sure how that kind of
project relates to writing a built-in module in C).
Sorry if this is a bit rambly, but I'm interested in feedback from the
community on this idea: is a built-in parser generator module
desirable? If so, would integrating PyParsing as a built-in module be a
good solution? What 3rd-party parsing module do you think would serve
best for this purpose?