Site search

Gandalf gandalf at geochemsource.com
Mon Feb 16 07:40:44 CET 2004


Oh, I'm sorry for the dumb e-mail. I think I found the solution. It is 
called DocIndexer. It uses the Lupy search engine.

http://www.methods.co.nz/docindexer/

Thanks anyway

Gandalf wrote:

> Hi All! I need to create a "site search" feature for a website. I 
> would like to create a service which could be
> pointed to a directory. It should go over all subfolders, read all 
> HTML, ASP, PHP, TXT and PDF files, and
> create a table indexed by words. The most important requirements are:
>
> 1. It should index PDF files too. (The site contains many datasheets 
> so this is crucial.)
> 2. It should not index special keywords inside HTML and PDF files (so 
> if somebody searches for "green" then it should only find "green 
> cables" and "green grass", but not <FONT COLOR="GREEN">)
>
> Is there a library out there that can do the task for me? I can easily 
> do all parts except parsing a file and gathering keywords.
>
> Thanks in advance.
>
>   Laci 2.0
>
>
>
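
For anyone reading this in the archive: the indexing step described in the
quoted request can be sketched with the Python standard library alone. This
is only a rough illustration, not how DocIndexer or Lupy work. The names
TextExtractor, visible_text and build_index are made up for this example.
It assumes UTF-8 files, skips PDF entirely (that needs a third-party text
extractor), and does nothing special about <script> or <style> bodies or
about server-side ASP code, which would all need extra handling.

```python
import os
import re
from collections import defaultdict
from html.parser import HTMLParser

WORD_RE = re.compile(r"[a-z0-9]+")
MARKUP_EXTENSIONS = {".html", ".htm", ".asp", ".php"}
PLAIN_EXTENSIONS = {".txt"}


class TextExtractor(HTMLParser):
    """Keeps only the text between tags; tag names and attributes are dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def visible_text(markup):
    """Return the text content of an HTML-like string, without the tags."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()
    return " ".join(parser.chunks)


def build_index(root):
    """Walk root recursively and map each lowercase word to the files containing it."""
    index = defaultdict(set)
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            extension = os.path.splitext(name)[1].lower()
            if extension not in MARKUP_EXTENSIONS | PLAIN_EXTENSIONS:
                continue  # PDF and everything else is ignored in this sketch
            path = os.path.join(folder, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                text = handle.read()
            if extension in MARKUP_EXTENSIONS:
                text = visible_text(text)
            for word in WORD_RE.findall(text.lower()):
                index[word].add(path)
    return index
```

Looking up a word is then just index.get("green", set()). Because only the
text between tags is collected, a page containing <FONT COLOR="GREEN"> is
not listed under "green" unless the word also appears in the visible text.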

More information about the Python-list mailing list