[SciPy-user] scipy.sparse: coo_matrix ignores sum_duplicates=False

Nathan Bell wnbell at gmail.com
Mon Oct 13 10:35:23 EDT 2008


On Mon, Oct 13, 2008 at 10:04 AM, James Philbin <philbinj at gmail.com> wrote:
> I've filed this as trac #754, repeated here for visibility.
>
> ---
> Running scipy version 0.7.0.dev4763
>
> coo_matrix.tocsr + tocsc both ignore the sum_duplicates parameter:
>
> In [1]: from numpy import *
> In [2]: from scipy.sparse import *
> In [3]: data = array([1,1,1,1,1,1,1])
> In [4]: row  = array([0,0,1,3,1,0,0])
> In [5]: col  = array([0,2,1,3,1,0,0])
> In [6]: A = coo_matrix( (data,(row,col)), shape=(4,4))
> In [8]: A.tocsr(sum_duplicates=False).todense()
> Out[8]:
> matrix([[3, 0, 1, 0],
>        [0, 2, 0, 0],
>        [0, 0, 0, 0],
>        [0, 0, 0, 1]])
> In [9]: A.tocsc(sum_duplicates=False).todense()
> Out[9]:
> matrix([[3, 0, 1, 0],
>        [0, 2, 0, 0],
>        [0, 0, 0, 0],
>        [0, 0, 0, 1]])

Hi James,

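Note that todense() on a CSR matrix implicitly sums duplicate entries
(it's essentially D = zeros((M,N)); D += A), so the dense output looks
the same whether or not the duplicates were summed during conversion.
You should find that the CSR representation *does* contain the
duplicate entries: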

In [1]: from numpy import *
In [2]: from scipy.sparse import *
In [3]: data = array([1,1,1,1,1,1,1])
In [4]: row  = array([0,0,1,3,1,0,0])
In [5]: col  = array([0,2,1,3,1,0,0])
In [6]: A = coo_matrix( (data,(row,col)), shape=(4,4))
In [7]: B = A.tocsr(sum_duplicates=False)
In [8]: B.indptr
Out[8]: array([0, 4, 6, 6, 7], dtype=int32)
In [9]: B.indices
Out[9]: array([0, 2, 0, 0, 1, 1, 3], dtype=int32)
In [10]: B.data
Out[10]: array([1, 1, 1, 1, 1, 1, 1])
In [11]: B
Out[11]:
<4x4 sparse matrix of type '<type 'numpy.int64'>'
	with 7 stored elements in Compressed Sparse Row format>
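
To see why the dense output hides the duplicates, here is a minimal
pure-Python sketch of that accumulation. The helper name csr_to_dense is
hypothetical and this is not SciPy's actual implementation; it only
mimics the "+=" behaviour described above, using the indptr, indices and
data arrays from the session.

```python
def csr_to_dense(indptr, indices, data, n_cols):
    """Expand CSR arrays into a dense list of lists, accumulating with +=.

    Hypothetical helper for illustration only; not part of scipy.sparse.
    """
    n_rows = len(indptr) - 1
    dense = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Stored entries of row i occupy positions indptr[i] .. indptr[i+1]-1.
        for k in range(indptr[i], indptr[i + 1]):
            # A repeated (row, col) pair adds to the same cell,
            # so duplicates are summed in the dense result.
            dense[i][indices[k]] += data[k]
    return dense


# CSR arrays copied from B in the session above.
indptr = [0, 4, 6, 6, 7]
indices = [0, 2, 0, 0, 1, 1, 3]
data = [1, 1, 1, 1, 1, 1, 1]

dense = csr_to_dense(indptr, indices, data, 4)
stored = len(data)  # number of stored entries, duplicates included

for dense_row in dense:
    print(dense_row)
print("stored entries:", stored)
```

The stored-entry count stays at 7 while the dense rows show the summed
values 3 and 2, which matches what todense() printed in the original
report.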

-- 
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/


