[SciPy-User] Subclassing sparse matrices: problems with repr

Edward Grefenstette egrefen at gmail.com
Mon Jan 24 11:07:27 EST 2011

For a project I'm working on I'm building vectors with a large number of 
zero values, and thought it would be cool (and efficient) to use scipy's 
sparse tools.

The basic deal is that these vector objects subclass lil_matrix (or some 
other sparse matrix class), set themselves up as an empty sparse matrix of 
dimensions based on the data they are being built from, and then fill 
themselves with values based on the data provided.

Here is the constructor:


from scipy.sparse import lil_matrix as sparsematrix

class nounVector(sparsematrix):

    def __init__(self, noun, corpusReader, basisMap, wordFilter=None,
                 relFilter=None, basisList=None):
        # Some definitions
        self.noun = noun
        self.processedChunks = corpusReader
        self.basisMap = basisMap.getBasisMap()
        self.dimensions = len(basisMap.getBasisMap())
        self.relFilter = relFilter
        self.basisList = basisList
        # End of defs
        # Initialise self as sparse matrix
        sparsematrix.__init__(self, (1,self.dimensions))
        # Fill values based


The filling is done in self.buildVector(). No need to go into the details, 
but it basically does some processing of the data given to it by the corpus 
reader, calculates the count a particular item in the matrix needs to be 
incremented by, and then performs:

self[0,index] += count
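
A minimal, self-contained sketch of that fill step, assuming a plain 
lil_matrix rather than the subclass. The helper fill_counts and the 
(index, count) pairs are hypothetical stand-ins for the corpus processing, 
which is not shown here:

```python
from scipy.sparse import lil_matrix


def fill_counts(dimensions, index_counts):
    """Build a 1 x dimensions sparse row and accumulate counts into it.

    index_counts is a hypothetical iterable of (index, count) pairs.
    """
    vec = lil_matrix((1, dimensions))
    for index, count in index_counts:
        # Same operation as in buildVector: read the entry, add, write back.
        vec[0, index] += count
    return vec


# Index 1 appears twice, so its counts are summed.
vec = fill_counts(5, [(1, 2), (3, 1), (1, 4)])
```

Only the entries that were actually incremented are stored; every other 
position stays an implicit zero.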

So far, so good. In fact, everything works like I want it to (as far as I 
can tell). However, the problem arises when I try to print a list, dict, 
etc. containing an instance of this class. As far as I can tell, it's all 
down to the __repr__ function. For example:

>>> a = nounVector(...)  # constructor arguments omitted
>>> print a
... (Correct output)
>>> print str(a)
... (All good)
>>> print repr(a) # or print [a], etc...
Traceback (most recent call last):
  File "/Users/Edward/Workspace/CoSemantic Vectors/src/testVectors.py", line 35, in <module>
  File "/Users/Edward/Workspace/CoSemantic Vectors/src/testVectors.py", line 28, in main
    print repr(nounv)
line 158, in __repr__
    (self.shape + (self.dtype.type, nnz, _formats[format][1]))
KeyError: 'nou'

I have no clue what's gone wrong here or what this error means, and poking 
around in the source hasn't brought me much joy. Am I doing something wrong 
in my way of subclassing lil_matrix (etc)? Is something missing? Everything 
works on the functional side of things, but I'm afraid this sort of error is 
the tip of the iceberg and that other problems may crop up.

Any help, suggestions, criticism welcome. Thanks for reading, and thanks in 
advance for any info.


PS: I think I posted this earlier, but it didn't show up. Apologies if I 
double post.