little question

Shagshag shagshag13 at yahoo.fr
Mon May 13 10:56:25 EDT 2002


> 
> "Inverted index" is an older term for a file with a list of words in a
> document and their offset into the document.  Exactly what you implemented
> with a dictionary, but normally implemented with a btree type index because
> full-text searching often needs things like stemming and wildcards (e.g.
> "tax*" getting hits on tax, taxes, taxation, ...)  The term "inverted index"
> has historical roots in bibliographic indexing and hasn't really had any
> consistent meaning in the database world.
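
To make the "tax*" case concrete, here is a rough sketch of a prefix lookup over sorted words. It is only an illustration: a sorted list searched with the standard bisect module stands in for the btree mentioned above, and the name prefix_matches and the sample words are made up.

```python
import bisect

def prefix_matches(sorted_words, prefix):
    """Return every word in sorted_words that starts with prefix."""
    # Binary search for the first word that is >= prefix.
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    # Words sharing a prefix sit next to each other in sorted order,
    # so we can stop at the first word that no longer matches.
    for word in sorted_words[start:]:
        if not word.startswith(prefix):
            break
        matches.append(word)
    return matches

words = sorted(["income", "tax", "taxation", "taxes", "text"])
print(prefix_matches(words, "tax"))
```

A plain dictionary cannot answer this kind of query without scanning every key, which is presumably why an ordered structure is the usual choice.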

Yes, that is what I mean by "inverted index" or "inverted file":
"An index into a set of texts of the words in the texts. The index is
accessed by some search method. Each index entry gives the word and a
list of texts, possibly with locations within the text, where the word
occurs."
That definition is from http://www.nist.gov/dads/HTML/invertedIndex.html,
which also gives an example.
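
A minimal sketch of that definition in Python, assuming whitespace tokenization and lowercase folding. The names build_index and texts are made up for illustration and are not taken from the NIST page:

```python
from collections import defaultdict

def build_index(texts):
    """Map each word to {text_id: [word positions within that text]}."""
    index = defaultdict(dict)
    for text_id, text in texts.items():
        # Assumption: words are whatever split() yields, lowercased.
        for position, word in enumerate(text.lower().split()):
            index[word].setdefault(text_id, []).append(position)
    return index

texts = {
    "a": "the tax on income",
    "b": "income taxes and the tax code",
}
index = build_index(texts)
print(sorted(index["tax"].items()))
```

Each entry gives the word, the texts it occurs in, and the word offsets inside each text. Stemming ("taxes" matching "tax") is not handled here.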

> As for Mr. shagshag, I think this link has what you're looking for :)
> http://gnosis.cx/publish/programming/charming_python_15.txt

Thanks, I'll go and check.

S13.


