[Numpy-discussion] Sparse Arrays in NumPy?

Sat Jan 11 01:12:03 EST 2003

Hello all,

I have been trying, to no avail, to find a package/add-on that provides a
sparse array class for NumPy, or that at least tricks NumPy into using a
sparse array as a regular array.

By sparse array here, I do not mean a sparse matrix equation solver, but an
array class that accepts a "default value".  In other words, I would like to
instantiate a 1000x1000x1000 (1e9 elements) array in which at most 5-10% of
the elements are populated (i.e. non-zero).  The current NumPy will
instantiate the entire 1e9-element array (roughly 8 GB for 8-byte floats),
which is a non-starter if you would like to evaluate an expression involving,
say, 4-5 such arrays.  Instead, I'd like a class that stores only the
populated cells and returns the default value for the others (ideally while
doing some smart disk I/O to conserve memory).
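
To make the idea concrete, here is a minimal sketch of the kind of class I
have in mind.  The name SparseArray and its methods are hypothetical (this is
not an existing NumPy or SciPy API), and a plain dict is only an assumption
for illustration: its per-entry overhead would be too high for 5-10% of 1e9
cells, and it does no disk I/O at all.

```python
class SparseArray:
    """Hypothetical N-d array that stores only cells differing from a default."""

    def __init__(self, shape, default=0.0):
        self.shape = tuple(shape)
        self.default = default
        self._cells = {}  # maps index tuple -> explicitly stored value

    def _normalize(self, index):
        # Accept a[i] as well as a[i, j, k], then bounds-check every axis.
        if not isinstance(index, tuple):
            index = (index,)
        if len(index) != len(self.shape):
            raise IndexError("expected %d indices" % len(self.shape))
        for i, n in zip(index, self.shape):
            if not 0 <= i < n:
                raise IndexError("index out of bounds: %r" % (index,))
        return index

    def __getitem__(self, index):
        # Unpopulated cells fall back to the default value.
        return self._cells.get(self._normalize(index), self.default)

    def __setitem__(self, index, value):
        index = self._normalize(index)
        if value == self.default:
            # Writing the default frees the cell instead of storing it.
            self._cells.pop(index, None)
        else:
            self._cells[index] = value

    def populated(self):
        """Number of cells that are actually stored."""
        return len(self._cells)


a = SparseArray((1000, 1000, 1000), default=0.0)
a[1, 2, 3] = 4.5   # only this one cell is stored
```

Here a[1, 2, 3] returns the stored value, while any other in-bounds index
such as a[0, 0, 0] returns the default without allocating anything.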

I've tried SciPy, Scientific Python, and a few other modules floating
around; none seem to do the trick, yet I can't help but think that this is
not an uncommon need in a lot of problem domains.  Is there a package out
there?  If there isn't, where should I start looking to create one?  From
its description, I think SparseLib++ at least would be a good starting
point as a base library.

As a secondary issue, is anyone aware of a package that can handle storage
of such arrays?  netCDF and HDF do not seem to fit the bill; a B-Tree
library seems a more natural fit...
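
As a rough illustration of the B-Tree idea, the sketch below keeps only the
populated cells in a table keyed by (i, j, k).  It assumes Python's sqlite3
module purely as a stand-in for a B-Tree library (SQLite stores its tables
and indexes as B-trees); the table layout and the helper names open_store,
put and get are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a store holding one row per populated cell."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cells ("
        " i INTEGER, j INTEGER, k INTEGER, value REAL,"
        " PRIMARY KEY (i, j, k))"
    )
    return conn


def put(conn, index, value):
    """Store one populated cell, overwriting any previous value."""
    i, j, k = index
    conn.execute(
        "INSERT OR REPLACE INTO cells VALUES (?, ?, ?, ?)", (i, j, k, value)
    )


def get(conn, index, default=0.0):
    """Look up a cell by its key; missing keys yield the default value."""
    row = conn.execute(
        "SELECT value FROM cells WHERE i = ? AND j = ? AND k = ?", index
    ).fetchone()
    return default if row is None else row[0]


store = open_store()        # in-memory here; pass a file path to persist
put(store, (1, 2, 3), 4.5)
store.commit()
```

A lookup such as get(store, (1, 2, 3)) goes through the primary-key index,
and any key that was never stored simply comes back as the default.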

Thanks in advance --any and all input appreciated,

Costas