[SciPy-User] csr_matrix rows remove
David Warde-Farley
d.warde.farley at gmail.com
Sun Oct 7 01:04:26 EDT 2012
On Thu, Oct 4, 2012 at 9:05 AM, Pavel Lurye <pavel.lurye at gmail.com> wrote:
> Hi,
> I'm using scipy csr_matrix and I'm trying to figure out what is the
> simple and fast way to remove a row from such matrix?
> For example, I have a tuple of rows, that should be deleted. The only
> way I see, is to generate a tuple of matrix parts and vstack it.
> Please, help me out with this.
Unfortunately, CSR/CSC do not admit terribly efficient row deletion.
To do it semi-efficiently, you would determine how many non-zero
elements live in those rows (call this number k) and allocate 3 vectors
(new_data, new_indices, new_indptr), mirroring the .data, .indices and
.indptr attributes of the sparse matrix object. new_data and
new_indices each have length nnz - k (where nnz is the number of
non-zero elements in the original matrix); new_indptr has one entry per
remaining row, plus one. First, copy the contents of mycsrmatrix.data
and mycsrmatrix.indices into new_data and new_indices, omitting the
entries in the deleted rows. Then things become tricky: you need to
rebuild indptr so that its offsets account for the now missing rows.
This requires reading up on the CSR format, and is relatively
complicated but not impossible.
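For what it's worth, here is a rough sketch of that bookkeeping using
NumPy. The function name delete_rows_csr and its arguments are made up
for illustration; it assumes the input is (or can be converted to) a
csr_matrix and that `rows` holds valid row numbers. In CSR the column
indices only need filtering, while indptr is rebuilt from the per-row
entry counts of the rows that remain.

```python
import numpy as np
from scipy.sparse import csr_matrix


def delete_rows_csr(mat, rows):
    """Return a new CSR matrix without the rows listed in `rows` (hypothetical helper)."""
    mat = csr_matrix(mat)
    n_rows, n_cols = mat.shape

    # Boolean mask over rows: True for rows we keep.
    keep = np.ones(n_rows, dtype=bool)
    keep[np.asarray(rows, dtype=int)] = False

    # Stored entries per row are the differences of consecutive indptr values.
    counts = np.diff(mat.indptr)

    # Expand the row mask to one flag per stored entry.
    entry_keep = np.repeat(keep, counts)

    new_data = mat.data[entry_keep]
    new_indices = mat.indices[entry_keep]  # column indices are unchanged

    # indptr of the result: running total of the kept rows' entry counts.
    new_indptr = np.concatenate(([0], np.cumsum(counts[keep])))

    return csr_matrix((new_data, new_indices, new_indptr),
                      shape=(int(keep.sum()), n_cols))
```

Something like delete_rows_csr(mycsrmatrix, (1, 3)) would then return a
matrix with two fewer rows and leave the original untouched.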
A simpler (but less efficient) implementation could convert to COO
format first, select the subsets of the row/col/data vectors that
belong to the rows being kept, then shift the remaining row indices
down to account for the rows that are no longer there, and then create
another COO matrix with the (data, (row, col)) constructor form;
finally, convert back to CSR with .tocsr().
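A minimal sketch of the COO route might look like the following. Again,
delete_rows_via_coo is only an illustrative name, and `rows` is assumed
to contain valid row numbers. The renumbering uses a cumulative sum over
the keep mask, so each surviving row gets its position among the rows
that remain.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix


def delete_rows_via_coo(mat, rows):
    """Drop the given rows by round-tripping through COO (hypothetical helper)."""
    coo = csr_matrix(mat).tocoo()
    n_rows, n_cols = coo.shape

    # Boolean mask over rows: True for rows we keep.
    keep = np.ones(n_rows, dtype=bool)
    keep[np.asarray(rows, dtype=int)] = False

    # Select the stored entries whose row survives.
    entry_keep = keep[coo.row]

    # Map old row numbers to new ones: a kept row's new number is the
    # count of kept rows before it.
    new_number = np.cumsum(keep) - 1

    new_row = new_number[coo.row[entry_keep]]
    new_col = coo.col[entry_keep]
    new_data = coo.data[entry_keep]

    out = coo_matrix((new_data, (new_row, new_col)),
                     shape=(int(keep.sum()), n_cols))
    return out.tocsr()
```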