[Tutor] Inverted Index

Kent Johnson kent37 at tds.net
Sat Nov 3 12:48:42 CET 2007


Dinesh B Vadhia wrote:
> A NumPy matrix (because we have to perform a dot matrix multiplication 
> prior to creating an inverted index).

Maybe something like this?

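from collections import defaultdict
from numpy import array, ndenumerate

a = array(...)  # your matrix goes here
index = defaultdict(list)
# ndenumerate yields (index tuple, value) pairs for every element
for i, x in ndenumerate(a):
    index[x].append(i)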

This creates a dict whose keys are the values in the array and whose
values are lists of the index tuples where each value appears.
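
To make that concrete, here is a small sketch on a made-up 2 x 2 array (the sample data is only an assumption for illustration, not your real matrix):

```python
from collections import defaultdict
from numpy import array, ndenumerate

# Made-up sample data, just to show the shape of the result.
a = array([[0, 1],
           [1, 0]])

index = defaultdict(list)
for i, x in ndenumerate(a):
    # i is a tuple like (row, col); x is the value stored there
    index[x].append(i)

ones = index[1]    # [(0, 1), (1, 0)]
zeros = index[0]   # [(0, 0), (1, 1)]
```

Note that for a 2-D array each recorded index is a (row, column) tuple, in row-major order.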

I don't know how your sparse matrix is represented but maybe you need 
some kind of filter in the loop to only record meaningful values.
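
One possible filter, as a rough sketch: skip the zeros so only the non-zero entries are indexed. I am assuming here that zero means "nothing there" in your data, which may not hold for every matrix you have, and the sample array is made up:

```python
from collections import defaultdict
from numpy import array, ndenumerate

# Made-up 0/1 sample data standing in for the real matrix.
a = array([[0, 1, 0],
           [1, 0, 1]])

index = defaultdict(list)
for i, x in ndenumerate(a):
    if x != 0:   # assumption: zeros carry no information, so skip them
        index[x].append(i)

hits = index[1]   # [(0, 1), (1, 0), (1, 2)]
```

With a mostly-zero matrix this keeps the dict much smaller, though the loop still visits every element, so it could be slow for a matrix with millions of rows.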

http://www.scipy.org/Numpy_Example_List_With_Doc#ndenumerate

Kent



>  
> ----- Original Message -----
> 
> Dinesh B Vadhia wrote:
>  > Sure!  To create an inverted index of a very large matrix (M x N with
>  > M<>N and M>10m rows).  Most times the matrix will be sparse but
>  > sometimes it won't be.  Most times the matrix will consist of 0's and
>  > 1's but sometimes it won't.
> 
> How is the matrix represented? Is it in a numpy array? a dict? or...
> 
> Kent
>  > Dinesh B Vadhia wrote:
>  >  > Hello!  Anyone know of any example/cookbook code for implementing
>  >  > inverted indexes?
>  >
>  > Can you say more about what you are trying to do?
>

