Hello,

I was wondering what the fastest format is for multiplying a sparse matrix with a dense numpy array. Intuitively, a CSR matrix multiplied with a Fortran-contiguous numpy array should be the fastest, but I have run a few benchmarks and they suggest otherwise. The documentation at
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.sparse.csc_matrix.html also mentions that using csr matrices "may" be faster.
In [5]: X
Out[5]:
<11314x130107 sparse matrix of type '<type 'numpy.float64'>'
        with 1787565 stored elements in Compressed Sparse Row format>

In [6]: _, n_features = X.shape

In [9]: w_c = np.random.rand(n_features, 10)

In [10]: w_f = np.asarray(w_c, order='f')

In [13]: csc = sparse.csc_matrix(X)

In [30]: %timeit X * w_f
10 loops, best of 3: 40.5 ms per loop

In [31]: %timeit X * w_c
10 loops, best of 3: 37.3 ms per loop

In [32]: %timeit csc * w_c
10 loops, best of 3: 24.3 ms per loop

In [33]: %timeit csc * w_f
10 loops, best of 3: 27.3 ms per loop
Here, the CSC matrix multiplied with a C-contiguous numpy array is the fastest combination, which is completely counterintuitive to me. Are there any hard rules for this, or is it data dependent?
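To make the comparison reproducible without my data, here is a minimal sketch that times all four format/layout combinations on a synthetic matrix. The function name `bench_sparse_dense`, the default shape and the density are assumptions chosen for illustration only; they do not match the matrix above, so the timings may come out differently.

```python
import timeit

import numpy as np
from scipy import sparse


def bench_sparse_dense(n_rows=2000, n_cols=20000, density=0.001, n_out=10,
                       repeats=3, number=5, seed=0):
    """Time sparse.dot(dense) for CSR/CSC against C- and F-ordered arrays.

    Returns a dict mapping (format, order) to the best time per call in seconds.
    All sizes are illustrative assumptions, not the poster's data.
    """
    rng = np.random.RandomState(seed)
    # Synthetic sparse matrix with uniformly scattered nonzeros.
    X_csr = sparse.random(n_rows, n_cols, density=density, format="csr",
                          random_state=rng, dtype=np.float64)
    X_csc = X_csr.tocsc()

    # Same dense values in row-major (C) and column-major (Fortran) layout.
    w_c = np.ascontiguousarray(rng.rand(n_cols, n_out))
    w_f = np.asfortranarray(w_c)

    results = {}
    for fmt_name, mat in (("csr", X_csr), ("csc", X_csc)):
        for order_name, w in (("C", w_c), ("F", w_f)):
            timer = timeit.Timer(lambda mat=mat, w=w: mat.dot(w))
            best = min(timer.repeat(repeat=repeats, number=number))
            results[(fmt_name, order_name)] = best / number
    return results


if __name__ == "__main__":
    for (fmt_name, order_name), seconds in sorted(bench_sparse_dense().items()):
        print("%s x %s-ordered: %.3f ms" % (fmt_name, order_name, seconds * 1e3))
```

Varying `n_rows`, `n_cols`, `density` and `n_out` should give a rough idea of whether the ordering of the four timings depends on the data.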
Sorry for my noobish questions!