Python, search engine and NLP...

Ype Kingma ykingma at accessforall.nl
Tue Apr 23 15:41:16 EDT 2002


Shagshag wrote:
> 
> Hello,
> 
> I'm interested in Python but not sure it can fit my needs :
> 
> I have a huge search engine to write with many XML input/output, it
> must be able to index and process something like 3 GB of data, many
> text files, parse text and other natural language processing, not be
> too slow, and so on...
> 
> - Do you think Python can help me ???

Yes, definitely.
For starters, there's an XML parser available in Python.
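
To give an idea of what that looks like, here is a minimal sketch using
xml.dom.minidom from the standard library. The sample input and the
element names (docs, doc, title) are invented for illustration and are
not taken from your data.

```python
from xml.dom import minidom

# Hypothetical input: the element and attribute names are made up
# for this example.
SAMPLE = """<docs>
  <doc id="1"><title>Python and XML</title></doc>
  <doc id="2"><title>Search engines</title></doc>
</docs>"""


def extract_titles(xml_text):
    """Return a list of (id, title) pairs from the sample layout."""
    dom = minidom.parseString(xml_text)
    pairs = []
    for doc in dom.getElementsByTagName("doc"):
        title = doc.getElementsByTagName("title")[0]
        # The title text is the first child node of the <title> element.
        pairs.append((doc.getAttribute("id"), title.firstChild.data))
    return pairs


print(extract_titles(SAMPLE))
```

Note that minidom builds the whole tree in memory, so for very large
files a streaming parser such as xml.sax is probably the better choice.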

> 
> - As by now i don't know Python, do you think that it will not be a
> waste of time even if I had to learn it ???

You'll easily recover your Python learning time by writing the text
indexing scripts in Python. These are not normally time critical, and they
need some experimentation, for which scripting in Python is a good fit.
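
As a rough sketch of the kind of indexing script I mean, the following
builds a small in-memory inverted index (word -> set of document names)
and answers AND queries against it. The function names, the tokenizer and
the sample documents are all assumptions made up for this example; a real
index for gigabytes of text would need to live on disk.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words (a deliberately naive tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    # Copy the first posting set so the index itself is never modified.
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample documents, keyed by file name.
docs = {
    "a.txt": "Python is a scripting language",
    "b.txt": "Lucene is a search engine written in Java",
    "c.txt": "Jython runs Python in a JVM",
}
index = build_index(docs)
print(sorted(search(index, "python")))
print(sorted(search(index, "python jvm")))
```

Swapping in a different tokenizer or adding stop words is a one-line
change, which is the sort of experimentation I had in mind.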

> - Do you know about some Python's natural language processing module
> ???

There are several; I'd guess others will let you know.
 
> - No, as there are many Java NLP libraries, I should definitely do it
> with Java...

I have some experience using the Lucene search engine
with Jython (i.e. Python running in a JVM).
This combination will probably do what you need.
Looking around for other combinations never hurts, though.

> Thanks in advance,

My pleasure.
Ype

http://jakarta.apache.org/lucene
http://www.jython.org

-- 
email at xs4all.nl


