Re: [SciPy-Dev] scipy.sparse vs. pysparse
Hi, Here is some optimization reducing the runtime overhead of scipy.sparse matrix-vector multiplication by a factor of 5. https://github.com/pv/scipy-work/compare/master...enh/sparse-speedup And a patch against Scipy 0.9.0 (@Tony: maybe you want to try it out?): https://github.com/pv/scipy-work/compare/v0.9.0...enh/sparse-speedup-0.9.pat... *** Quick benchmark: http://dl.dropbox.com/u/5453551/bench_sparse.py (Multiply vector with 1-D CSR Laplacian operator.) After: % N scipy.sparse pysparse 32 7.169e-06 1.048e-06 64 7.367e-06 1.787e-06 128 7.814e-06 3.284e-06 256 8.633e-06 6.336e-06 512 1.025e-05 1.241e-05 1024 1.435e-05 2.455e-05 2048 1.989e-05 4.882e-05 4096 3.384e-05 9.798e-05 8192 6.098e-05 0.0001959 Before: % N scipy.sparse pysparse 32 3.708e-05 1.032e-06 64 3.736e-05 1.803e-06 128 3.777e-05 3.368e-06 256 3.95e-05 6.324e-06 512 4.116e-05 1.267e-05 1024 4.661e-05 2.455e-05 2048 5.38e-05 4.873e-05 4096 6.946e-05 9.763e-05 8192 9.563e-05 0.0001959 The cross-over occurs around N ~ 300 instead of around N ~ 3000. The main reason for the overhead is that multiplication with a sparse Laplacian is a pretty lightweight operation, so the fact that some of scipy.sparse is written in pure Python starts to matter. -- Pauli Virtanen
Hello, I apologize for not responding earlier. Thanks for taking the time to work on this, Pauli. I took the liberty of sending this to the mailing list since I believe that's where you meant to send your previous email, instead of directly to me. I managed to apply the patch (first time doing that :) ) and ran the same test script you supplied. I get similar speed-ups, but on my system pysparse seems to be faster than on your system, while scipy.sparse is slower. I'm inclined to believe that that has something to do with how I built scipy, though I can't think of any reason straight away. I was even a little less kind towards pysparse in my test, in that I used the high-level matrix class and A*b rather than the low-level A.matvec(b, x) - see below. Results of Pauli's test script on my Ubuntu system: After: % N scipy.sparse pysparse 32 9.585e-06 4.476e-07 64 1.021e-05 6.214e-07 128 1.054e-05 1.009e-06 256 1.158e-05 1.651e-06 512 1.328e-05 3.177e-06 1024 1.619e-05 6.085e-06 2048 2.251e-05 1.18e-05 4096 3.483e-05 2.314e-05 8192 5.997e-05 4.585e-05 16384 0.0001373 9.185e-05 32768 0.0002084 0.0001839 Before: % N scipy.sparse pysparse 32 5.375e-05 4.51e-07 64 5.47e-05 6.221e-07 128 5.563e-05 9.973e-07 256 5.579e-05 1.681e-06 512 5.723e-05 3.166e-06 1024 6.057e-05 6.074e-06 2048 6.736e-05 1.177e-05 4096 7.966e-05 3.063e-05 8192 0.0001047 4.587e-05 16384 0.0001538 9.193e-05 32768 0.0002546 0.0001832 I also tried my own benchmark that you can get here: http://dl.dropbox.com/u/2349184/pysparse_vs_scipy_dev.py The results (in micro-seconds): 1D: N SciPy pysparse 50 1.60e+01 5.99e+00 100 1.01e+01 6.31e+00 200 1.10e+01 7.39e+00 300 1.16e+01 8.25e+00 500 1.29e+01 1.02e+01 1000 1.71e+01 1.36e+01 2500 2.51e+01 2.47e+01 5000 4.03e+01 4.26e+01 10000 7.07e+01 8.15e+01 25000 1.62e+02 1.90e+02 50000 3.67e+02 4.90e+02 100000 1.14e+03 1.37e+03 2D: N=M^2 SciPy pysparse 100 1.05e+01 6.86e+00 625 1.58e+01 1.42e+01 2500 3.21e+01 3.82e+01 10000 9.81e+01 1.36e+02 40000 5.15e+02 9.93e+02 90000 1.40e+03 2.56e+03 250000 3.99e+03 7.79e+03 1000000 1.55e+04 3.36e+04 Comparing to my original email one can see that the 2D results are even more satisfying. The same factor-5 speedup at the smallest size and a significant decrease also at N=10000 (almost 50%). When I get around to it I will try this out with some more "realistic" matrices. Regards, Tony Stillfjord On Sat, Oct 1, 2011 at 5:37 PM, Pauli Virtanen <pav@iki.fi> wrote:
Hi,
Here is some optimization reducing the runtime overhead of scipy.sparse matrix-vector multiplication by a factor of 5.
https://github.com/pv/scipy-**work/compare/master...enh/** sparse-speedup<https://github.com/pv/scipy-work/compare/master...enh/sparse-speedup>
And a patch against Scipy 0.9.0 (@Tony: maybe you want to try it out?):
https://github.com/pv/scipy-**work/compare/v0.9.0...enh/** sparse-speedup-0.9.patch<https://github.com/pv/scipy-work/compare/v0.9.0...enh/sparse-speedup-0.9.pat...>
***
Quick benchmark: http://dl.dropbox.com/u/**5453551/bench_sparse.py<http://dl.dropbox.com/u/5453551/bench_sparse.py> (Multiply vector with 1-D CSR Laplacian operator.)
After:
% N scipy.sparse pysparse 32 7.169e-06 1.048e-06 64 7.367e-06 1.787e-06 128 7.814e-06 3.284e-06 256 8.633e-06 6.336e-06 512 1.025e-05 1.241e-05 1024 1.435e-05 2.455e-05 2048 1.989e-05 4.882e-05 4096 3.384e-05 9.798e-05 8192 6.098e-05 0.0001959
Before:
% N scipy.sparse pysparse 32 3.708e-05 1.032e-06 64 3.736e-05 1.803e-06 128 3.777e-05 3.368e-06 256 3.95e-05 6.324e-06 512 4.116e-05 1.267e-05 1024 4.661e-05 2.455e-05 2048 5.38e-05 4.873e-05 4096 6.946e-05 9.763e-05 8192 9.563e-05 0.0001959
The cross-over occurs around N ~ 300 instead of around N ~ 3000. The main reason for the overhead is that multiplication with a sparse Laplacian is a pretty lightweight operation, so the fact that some of scipy.sparse is written in pure Python starts to matter.
-- Pauli Virtanen
participants (2)
-
Pauli Virtanen
-
Tony Stillfjord