Looking for lots of words in lots of files
francis.girardpython at gmail.com
Wed Jun 18 17:10:15 CEST 2008
Use a suffix tree. First build a suffix tree of your thousand files,
then use it to look up the words.
This is a classical problem for that kind of structure.
Just search "suffix tree" or "suffix tree python" on Google to find a
definition and an implementation.
(Also, Jon Bentley's "Programming Pearls" is a great book to read.)
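
To give a rough idea of how such an index gets used, here is a minimal sketch. It is not a real suffix tree: it uses a suffix array (the sorted list of suffix start positions), which is a simpler relative that answers the same "does this string occur?" question by binary search. The construction is naive and only suitable for small inputs, and the names build_suffix_array, contains and find_words are made up for this illustration.

```python
# Sketch only: a suffix array stands in for a real suffix tree.
# Naive construction; fine for small texts, too slow for large ones.

def build_suffix_array(text):
    """Return the start positions of all suffixes of text, in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(text, suffix_array, word):
    """Binary-search the suffix array for a suffix that starts with word."""
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        # Compare only the first len(word) characters of this suffix.
        if text[start:start + len(word)] < word:
            lo = mid + 1
        else:
            hi = mid
    if lo == len(suffix_array):
        return False
    start = suffix_array[lo]
    return text[start:start + len(word)] == word


def find_words(text, words, limit=10):
    """Return up to `limit` of the given words that occur in text."""
    suffix_array = build_suffix_array(text)
    found = []
    for word in words:
        if contains(text, suffix_array, word):
            found.append(word)
            if len(found) >= limit:
                break  # stop early, as in the original question
    return found
```

Calling find_words(open(path).read(), words) once per file would do the job. Note that this matches substrings, so "row" is found inside "brown"; if you need whole words only, you would have to check the neighbouring characters as well.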
2008/6/18 brad <byte8bits at gmail.com>:
> Just wondering if anyone has ever solved this efficiently... not looking
> for specific solutions tho... just ideas.
> I have one thousand words and one thousand files. I need to read the files
> to see if some of the words are in the files. I can stop reading a file once
> I find 10 of the words in it. It's easy for me to do this with a few dozen
> words, but a thousand words is too large for an RE and too inefficient to
> loop, etc. Any suggestions?