Python parser

Kay Schluehr kay.schluehr at gmx.net
Tue Mar 3 23:41:34 EST 2009


On 2 Mar., 23:14, Clarendon <jine... at hotmail.com> wrote:
> Thank you, Lie and Andrew for your help.
>
> I have studied NLTK quite closely but its parsers seem to be only for
> demo. It has a very limited grammar set, and even a parser that is
> supposed to be "large" does not have enough grammar to cover common
> words like "I".
>
> I need to parse a large amount of text collected from the web (around
> a couple hundred sentences at a time) very quickly, so I need a parser
> with a broad scope of grammar, enough to cover all these texts. This
> is what I mean by 'random'.
>
> An advanced programmer has advised me that Python is rather slow in
> processing large data, and so there are not many parsers written in
> Python. He recommends that I use Jython to use parsers written in
> Java. What are your views about this?
>
> Thank you very much.

You'll most likely need a GLR parser, since grammars for natural
language are ambiguous and a plain LALR parser cannot cope with that.
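
A minimal sketch of what ambiguity means in practice. This is not a
GLR parser, just a CYK-style chart that counts parse trees, and the
toy grammar and lexicon are made up for illustration. A grammar like
this produces conflicts in a deterministic LR parser generator, while
a GLR parser pursues the alternatives instead.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
# Each entry is (parent, left child, right child).
BINARY = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb phrase ...
    ("NP", "NP", "PP"),   # ... or to the noun phrase: the source of ambiguity
    ("NP", "Det", "N"),
    ("PP", "P", "NP"),
]
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "man": ["N"],
    "telescope": ["N"],
    "with": ["P"],
}


def count_parses(words, start="S"):
    """Return the number of distinct parse trees for words under the toy grammar."""
    n = len(words)
    # chart[(i, j)][A] = number of trees deriving words[i:j] from category A
    chart = defaultdict(lambda: defaultdict(int))
    for i, word in enumerate(words):
        for cat in LEXICON.get(word, []):
            chart[(i, i + 1)][cat] += 1
    # Build longer spans from every split point of shorter ones.
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for parent, left, right in BINARY:
                    n_left = chart[(i, k)].get(left, 0)
                    n_right = chart[(k, j)].get(right, 0)
                    if n_left and n_right:
                        chart[(i, j)][parent] += n_left * n_right
    return chart[(0, n)].get(start, 0)


if __name__ == "__main__":
    print(count_parses("I saw the man with the telescope".split()))
```

The sentence in the usage line has two trees under this grammar: the
prepositional phrase can attach either to "saw" or to "the man".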

There is PyGgy

http://www.lava.net/~newsham/pyggy/

which I tried once and found to be broken.

Then there is the Spark toolkit

http://pages.cpsc.ucalgary.ca/~aycock/spark/

I checked it out years ago and found it was very slow.

Then there is Bison, which can be used with a %glr-parser declaration
and the PyBison bindings

http://www.freenet.org.nz/python/pybison/

Bison might be solid and fast. I can't say anything about the quality
of the bindings though.


