gandalf at geochemsource.com
Mon Feb 16 07:40:44 CET 2004
Oh, I'm sorry for the dumb e-mail. I think I found the solution. It is
called DoxIndexer. It uses the Lupy search engine.
> Hi All! I need to create a "site search" feature for a website. I
> would like to create a service which could be
> pointed at a directory. It should go over all subfolders, read all
> HTML, ASP, PHP, TXT and PDF files, and
> create a table indexed by words. The most important requirements would be:
> 1. It should index PDF files too. (The site contains many datasheets
> so this is crucial.)
> 2. It should not index special keywords inside HTML and PDF files (so
> if somebody searches for "green", it should only look up "green
> cables" and "green grass", but not <FONT COLOR="GREEN">).
> Is there a library out there that can do the task for me? I can easily
> do all parts except parsing a file and gathering keywords.
> Thanks in advance.
> Laci 2.0
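
For anyone reading the archive: the indexing step described in the quoted
message can be sketched with the standard library alone. This is only a rough
illustration under stated assumptions, not how DoxIndexer or Lupy work. The
names TextExtractor, visible_words and build_index are made up for this
example, files are assumed to be readable as UTF-8, and PDF files are skipped
because extracting their text needs a third-party tool.

```python
import os
import re
from collections import defaultdict
from html.parser import HTMLParser

# Assumption: only these text-like extensions are indexed; PDF is left out.
TEXT_EXTENSIONS = {".html", ".htm", ".asp", ".php", ".txt"}
WORD_RE = re.compile(r"[a-z0-9]+")


class TextExtractor(HTMLParser):
    """Collect text nodes only, so tag names and attribute values are ignored."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # nesting depth inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def visible_words(markup):
    """Return the lowercased words found in the visible text of `markup`."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return WORD_RE.findall(" ".join(parser.chunks).lower())


def build_index(root):
    """Walk `root` and map each word to the set of file paths containing it."""
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if os.path.splitext(name)[1].lower() not in TEXT_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as fh:
                for word in visible_words(fh.read()):
                    index[word].add(path)
    return index
```

With this sketch, a page containing <FONT COLOR="GREEN">green grass</FONT>
contributes "green" and "grass" from its text, while the attribute value
GREEN is never indexed. A lookup is then index.get("green", set()).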