Looking for lots of words in lots of files

Diez B. Roggisch deets at nospam.web.de
Wed Jun 18 16:29:43 CEST 2008


brad wrote:

> Just wondering if anyone has ever solved this efficiently... not looking
> for specific solutions tho... just ideas.
> 
> I have one thousand words and one thousand files. I need to read the
> files to see if some of the words are in the files. I can stop reading a
> file once I find 10 of the words in it. It's easy for me to do this with
> a few dozen words, but a thousand words is too large for an RE and too
> inefficient to loop, etc. Any suggestions?

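Use an indexer such as Lucene (available from Python as PyLucene), or a
database that offers word indices. Each file is then read once to build
the index, and each of the thousand words becomes a lookup rather than a
scan.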
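
To make the idea concrete, here is a minimal pure-Python sketch of an
inverted index. It is not PyLucene's API and not a database, just an
illustration of what such tools do internally. The names build_index and
docs_with_enough_hits, the \w+ tokenisation, and the lowercasing are all
assumptions made for the example, and it takes the file contents as an
already-loaded dict of name to text.

```python
import re
from collections import defaultdict

# Assumption: a "word" is a run of \w characters, compared case-insensitively.
WORD_RE = re.compile(r"\w+")


def build_index(docs):
    """Map each lowercased word to the set of document names containing it.

    `docs` is a dict of {name: text}; each text is scanned exactly once.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in set(WORD_RE.findall(text.lower())):
            index[word].add(name)
    return index


def docs_with_enough_hits(index, wanted, threshold=10):
    """Return the names of documents containing at least `threshold`
    distinct words from `wanted`."""
    hits = defaultdict(int)
    for word in {w.lower() for w in wanted}:
        # .get() avoids creating empty entries for words that never occur.
        for name in index.get(word, ()):
            hits[name] += 1
    return {name for name, count in hits.items() if count >= threshold}
```

With a thousand files you would fill docs by reading each file once, then
call docs_with_enough_hits(index, words) with the thousand words. The
default threshold of 10 matches the "stop after 10 words" condition in
the question, although this sketch always indexes the whole file rather
than stopping early.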

Diez


