
Hi, I did some profiling and testing of my data-fitting code. One of its core parts does some linear least squares; until now I used np.linalg.lstsq. Most of the time the size of a is (250, 7) and of b is (250, 800).

Today I compared it to using pinv manually and, to my surprise, it is much faster. I thought both are SVD based? To check another computer I also ran my test on wakari: https://www.wakari.io/nb/tillsten/linear_least_squares

Also, using scipy.linalg instead of np.linalg is slower in both cases. My numpy and scipy are both from C. Gohlke's website. If my result is valid in general, maybe the lstsq function should be changed, or a hint should be added to the documentation.

Greetings, Till
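[Editor's note: a minimal sketch of the comparison under discussion, using the shapes from the post and modern NumPy spelling (rcond=None, the @ operator). Both routes are SVD-based and return the same solution for a full-rank a; only the extra bookkeeping differs.]

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((250, 7))    # design matrix, shape from the post
b = rng.standard_normal((250, 800))  # many right-hand sides

# Route 1: LAPACK's SVD-based least-squares driver.
x_lstsq = np.linalg.lstsq(a, b, rcond=None)[0]

# Route 2: explicit Moore-Penrose pseudoinverse (also SVD-based).
x_pinv = np.linalg.pinv(a) @ b

# For a full-rank, well-conditioned a, the two agree.
assert np.allclose(x_lstsq, x_pinv)
```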

On Tue, Jan 8, 2013 at 1:17 PM, Till Stensitz <mail.till@gmx.de> wrote:
Hi, I did some profiling and testing of my data-fitting code. One of its core parts does some linear least squares; until now I used np.linalg.lstsq. Most of the time the size of a is (250, 7) and of b is (250, 800).
My guess is that this depends a lot on the shape. Try a is (10000, 7) and b is (10000, 1).

Josef
Today I compared it to using pinv manually and, to my surprise, it is much faster. I thought both are SVD based? To check another computer I also ran my test on wakari:
https://www.wakari.io/nb/tillsten/linear_least_squares
Also, using scipy.linalg instead of np.linalg is slower in both cases. My numpy and scipy are both from C. Gohlke's website. If my result is valid in general, maybe the lstsq function should be changed, or a hint should be added to the documentation.
greetings Till
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

On Tue, Jan 8, 2013 at 6:17 PM, Till Stensitz <mail.till@gmx.de> wrote:
Hi, I did some profiling and testing of my data-fitting code. One of its core parts does some linear least squares; until now I used np.linalg.lstsq. Most of the time the size of a is (250, 7) and of b is (250, 800).
Today I compared it to using pinv manually and, to my surprise, it is much faster. I thought both are SVD based?
np.linalg.lstsq is written in Python (calling LAPACK for the SVD), so you could run line_profiler over it and see where the slowdown is. An obvious thing is that it always computes residuals, which could be costly; if your pinv code isn't doing that, then it's not really comparable. (Though it might still be well suited to your actual problem.) Depending on how well conditioned your problems are, and how much speed you need, there are faster ways than pinv as well. (Going via QR might or might not be faster; going via Cholesky almost certainly will be.)

-n
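[Editor's note: a sketch of the Cholesky route Nathaniel mentions, via SciPy's cho_factor/cho_solve on the normal equations. It is the cheapest option, but it squares the condition number of a, so it is only appropriate for well-conditioned problems.]

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
a = rng.standard_normal((250, 7))
b = rng.standard_normal((250, 800))

# Normal equations (a.T a) x = a.T b, solved with a Cholesky
# factorization of the small 7x7 Gram matrix.
c_and_lower = cho_factor(a.T @ a)
x_chol = cho_solve(c_and_lower, a.T @ b)

# For a well-conditioned a this matches the SVD-based answer.
x_ref = np.linalg.lstsq(a, b, rcond=None)[0]
assert np.allclose(x_chol, x_ref)
```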

Nathaniel Smith <njs <at> pobox.com> writes:
An obvious thing is that it always computes residuals, which could be costly; if your pinv code isn't doing that, then it's not really comparable. (Though it might still be well suited to your actual problem.)
Depending on how well conditioned your problems are, and how much speed you need, there are faster ways than pinv as well. (Going via QR might or might not be faster; going via Cholesky almost certainly will be.)
-n
You are right: with calculating the residuals, the speedup goes down to a factor of 2. I had to calculate the residuals anyway, because lstsq only returns the squared sum of the residuals, while I need every residual (as input to optimize.leastsq). Josef is also right: it is shape dependent, and for his example lstsq is faster. Maybe it is possible to make lstsq choose its method automatically? Or a keyword to set the method and make other decompositions available?

On Wed, Jan 9, 2013 at 1:29 AM, Till Stensitz <mail.till@gmx.de> wrote:
Nathaniel Smith <njs <at> pobox.com> writes:
An obvious thing is that it always computes residuals, which could be costly; if your pinv code isn't doing that, then it's not really comparable. (Though it might still be well suited to your actual problem.)
Depending on how well conditioned your problems are, and how much speed you need, there are faster ways than pinv as well. (Going via QR might or might not be faster; going via Cholesky almost certainly will be.)
-n
You are right: with calculating the residuals, the speedup goes down to a factor of 2. I had to calculate the residuals anyway, because lstsq only returns the squared sum of the residuals, while I need every residual (as input to optimize.leastsq).
Same here. Unfortunately the residuals computed by the LAPACK function are in a different basis, so they aren't directly usable. I'd support adding a keyword to disable the usual computation of the sum of squares.

Josef is also right: it is shape dependent, and for his example lstsq is faster. Maybe it is possible to make lstsq choose its method automatically? Or a keyword to set the method and make other decompositions available?
QR without column pivoting is a nice option for "safe" problems, but it doesn't provide a reliable indication of rank reduction. I also don't find pinv useful once the rank goes down, since it relies on Euclidean distance having relevance in parameter space, and that is seldom a sound assumption; usually it is better to reformulate the problem or remove a column from the design matrix. So maybe an 'unsafe', or less suggestively, 'fast' keyword could also be an option. IIRC, this was discussed on the scipy mailing list a year or two ago.

Chuck
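[Editor's note: a sketch of the unpivoted-QR route Chuck describes. It is cheaper than the SVD and better conditioned than the normal equations, but, as he says, it gives no reliable rank information. solve_triangular is from SciPy.]

```python
import numpy as np
from scipy.linalg import solve_triangular

rng = np.random.default_rng(0)
a = rng.standard_normal((250, 7))
b = rng.standard_normal((250, 800))

# Reduced QR without column pivoting: q is (250, 7) with
# orthonormal columns, r is (7, 7) upper triangular.
q, r = np.linalg.qr(a)

# Back-substitute r x = q.T b (r is upper triangular, the default).
x_qr = solve_triangular(r, q.T @ b)

# For a full-rank a this matches the SVD-based answer.
x_ref = np.linalg.lstsq(a, b, rcond=None)[0]
assert np.allclose(x_qr, x_ref)
```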

QR without column pivoting is a nice option for "safe" problems, but it doesn't provide a reliable indication of rank reduction. I also don't find pinv useful once the rank goes down, since it relies on Euclidean distance having relevance in parameter space, and that is seldom a sound assumption; usually it is better to reformulate the problem or remove a column from the design matrix.
Oh, I always thought that lstsq more or less uses the same procedure as pinv. Maybe you can give me a hint which algorithm is the most stable, as my system (a sum of exponentials) is not very stable? My numerical lectures were some years ago.

Greetings, Till

On Tue, Jan 8, 2013 at 11:17 AM, Till Stensitz <mail.till@gmx.de> wrote:
Hi, I did some profiling and testing of my data-fitting code. One of its core parts does some linear least squares; until now I used np.linalg.lstsq. Most of the time the size of a is (250, 7) and of b is (250, 800).
Today I compared it to using pinv manually and, to my surprise, it is much faster. I thought both are SVD based? To check another computer I also ran my test on wakari:
https://www.wakari.io/nb/tillsten/linear_least_squares
Also, using scipy.linalg instead of np.linalg is slower in both cases. My numpy and scipy are both from C. Gohlke's website. If my result is valid in general, maybe the lstsq function should be changed, or a hint should be added to the documentation.
Do you know if both are using ATLAS (or MKL)? Numpy will compile a default unoptimized version if there is no ATLAS (or MKL). Also, lstsq is a direct call to a LAPACK least-squares function, so the underlying functions are probably different for lstsq and pinv.

Chuck
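[Editor's note: a quick way to check which BLAS/LAPACK a NumPy build is linked against, as Chuck suggests. An unoptimized reference build can explain large timing differences between machines and between lstsq and pinv. Output format varies by NumPy version.]

```python
import numpy as np

# Prints the BLAS/LAPACK libraries this numpy was built against
# (e.g. ATLAS, MKL, OpenBLAS, or the slow reference implementation).
np.show_config()
```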
participants (5)
- Charles R Harris
- josef.pktd@gmail.com
- Nathaniel Smith
- Till Stensitz
- Till Stensitzki