[DB-SIG] indexing with python ?

chas sweeting@neuronet.com.my
Fri, 22 Aug 1997 04:22:23 +0800

Sorry to bother you all (anybody ?),

Before I end up reinventing the wheel (and undoubtedly making it
a shoddy, square wheel), are there any resources or scripts that
deal with indexing documents / databases with Python?

(Or, for that matter, the concepts/algorithms behind indexing
in general. The material I have found so far on the web is rather
weak... not that I expect Oracle to be giving away any secrets.)

Anyway, the goal is to create my own search utility for documents.
Sure, I can use the free Excite utility, but I want to understand
how this works, both out of curiosity and because I may need to
build this functionality into applications.

Data would be stored in plain text/HTML files or, better still,
also in word processor docs. For indexing in databases, I guess
we should really use the internal functions provided by that database.
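
As a rough illustration of the general idea for the text/HTML case, one
common approach is an inverted index: a table mapping each word to the
documents that contain it. The sketch below is only that, a sketch. The
names tokenize, build_index and search, the crude regex-based tag
stripping, and the two sample documents are all assumptions made up for
the example, not part of any existing tool.

```python
import re
from collections import defaultdict

# Crude patterns (assumptions for this sketch): anything in <...> is
# treated as an HTML tag, and a word is a run of letters or digits.
TAG_RE = re.compile(r"<[^>]+>")
WORD_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Strip tag-like markup, lowercase the text, and split it into words."""
    return WORD_RE.findall(TAG_RE.sub(" ", text).lower())


def build_index(docs):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


# Hypothetical sample data: in practice the text would be read from files.
docs = {
    "a.txt": "Indexing documents with Python",
    "b.html": "<html><body>Searching <b>documents</b> quickly</body></html>",
}
index = build_index(docs)
print(search(index, "python documents"))
```

A real utility would also need stop words, stemming, ranking, and an
on-disk format for the index, none of which this sketch attempts.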

Thanks for any help / advice,


PS. A long time ago I asked about converting from
    several word processor formats to HTML. Some people asked
    me to report back if I found anything. Well, sorry to say
    that we ended up going with a great off-the-shelf Java
    solution: www.net-it.com
    It did the job perfectly, so we went with it... the client
    had the budget.

