kay.schluehr at gmx.net
Wed Mar 4 05:41:34 CET 2009
On 2 Mar., 23:14, Clarendon <jine... at hotmail.com> wrote:
> Thank you, Lie and Andrew for your help.
> I have studied NLTK quite closely but its parsers seem to be only for
> demo. It has a very limited grammar set, and even a parser that is
> supposed to be "large" does not have enough grammar to cover common
> words like "I".
> I need to parse a large amount of texts collected from the web (around
> a couple hundred sentences at a time) very quickly, so I need a parser
> with a broad scope of grammar, enough to cover all these texts. This
> is what I mean by 'random'.
> An advanced programmer has advised me that Python is rather slow in
> processing large data, and so there are not many parsers written in
> Python. He recommends that I use Jython to use parsers written in
> Java. What are your views about this?
> Thank you very much.
You'll most likely need a GLR parser.
There is one for Python, which I tried once and found to be broken.
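To see why a generalized (GLR- or chart-style) parser matters for natural language at all: deterministic LALR-style parsers choke on ambiguous grammars, while tabular methods simply enumerate the readings. Here is a minimal sketch in the CYK style; the grammar and lexicon are toy illustrations of my own, not taken from NLTK or any real parser.

```python
# Toy grammar in Chomsky normal form (illustrative only, not from NLTK):
# each rule maps a pair of nonterminals to the nonterminals deriving them.
GRAMMAR = {
    ("NP", "VP"): ["S"],
    ("V", "NP"): ["VP"],
    ("VP", "PP"): ["VP"],
    ("P", "NP"): ["PP"],
    ("NP", "PP"): ["NP"],
    ("Det", "N"): ["NP"],
}
LEXICON = {
    "I": ["NP"], "saw": ["V"], "a": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}

def cyk_counts(words):
    """CYK chart: table[i][j] maps each nonterminal covering words[i..j]
    to its number of distinct derivations, so ambiguity becomes visible."""
    n = len(words)
    table = [[{} for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):                     # fill the diagonal
        for nt in LEXICON.get(w, []):
            table[i][i][nt] = table[i][i].get(nt, 0) + 1
    for span in range(2, n + 1):                      # grow longer spans
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                     # split point
                for a, ca in table[i][k].items():
                    for b, cb in table[k + 1][j].items():
                        for nt in GRAMMAR.get((a, b), []):
                            table[i][j][nt] = table[i][j].get(nt, 0) + ca * cb
    return table[0][n - 1]

# The classic PP-attachment ambiguity: the PP can modify the verb
# or the noun, so there are two derivations of S.
print(cyk_counts("I saw a man with a telescope".split()).get("S", 0))  # -> 2
```

Note the O(n^3) table fill; that cubic cost on every sentence is exactly why speed becomes a concern once you parse a couple hundred sentences at a time.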
Then there is the Spark toolkit, which I checked out years ago and
found to be very slow.
Then there is Bison, which can generate a GLR parser via a
%glr-parser declaration and can be driven from Python through the
PyBison bindings. Bison itself should be solid and fast; I can't say
anything about the quality of the bindings, though.
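For reference, the %glr-parser declaration is a one-line switch in the grammar file. The sketch below is an untested toy of my own (rule and token names are made up, and it says nothing about the PyBison side):

```yacc
/* toy.y -- illustrative sketch only.
   %glr-parser makes Bison emit a GLR parser: at a conflict it splits
   the parse stack and pursues every alternative in parallel, so the
   conflicts a plain LALR(1) build would reject become tolerable. */
%glr-parser
%token WORD
%%
/* A deliberately ambiguous grammar. Inputs that remain ambiguous at
   runtime must be resolved with %dprec/%merge, otherwise the
   generated parser reports them as ambiguous. */
phrase
  : phrase phrase
  | WORD
  ;
%%
```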