[Numpy-discussion] svd error checking vs. speed

Skipper Seabold jsseabold at gmail.com
Sat Feb 15 17:12:37 EST 2014


On Sat, Feb 15, 2014 at 5:08 PM, <josef.pktd at gmail.com> wrote:

> On Sat, Feb 15, 2014 at 4:56 PM, Sebastian Berg
> <sebastian at sipsolutions.net> wrote:
> > On Sat, 2014-02-15 at 16:37 -0500, alex wrote:
> >> Hello list,
> >>
> >> Here's another idea resurrection from numpy github comments that I've
> >> been advised could be posted here for re-discussion.
> >>
> >> The proposal would be to make np.linalg.svd more like scipy.linalg.svd
> >> with respect to input checking.  The argument against the change is
> >> raw speed; if you know that you will never feed non-finite input to
> >> svd, then np.linalg.svd is a bit faster than scipy.linalg.svd.  An
> >> argument for the change could be to avoid issues reported on github
> >> like crashes, hangs, spurious non-convergence exceptions, etc. from
> >> the undefined behavior of svd of non-finite input.
> >>
> >
> > +1. Unless this is a huge speed penalty, correctness (and decent error
> > messages) should come first in my opinion; this is Python after all. If
> > there is a noticeable speed difference, a kwarg may be an option (but I
> > would think about that some more).
>
> maybe -1
>
> statsmodels is using np.linalg.pinv which uses svd
> I never heard of any crash (*), and the only time I compared with
> scipy I didn't like the slowdown.
> I didn't do any serious timings, just a few examples.
>
> (*) not converged, ...
>
> pinv(x.T).dot(x) -> pinv(x.T, please_don_t_check=True).dot(y)
>
> numbers ?
>

FWIW, I see this spurious "SVD did not converge" error very frequently with
ARMA when a nan has crept in. I usually know where to find the problem, but
I think it'd be nice if the error message were a little better.
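
To make the trade-off concrete, here is a rough sketch of the kind of check
under discussion. The name `checked_svd` and its `check_finite` keyword are
hypothetical, made up for illustration; this is not an existing NumPy
function or signature, just one way an opt-out check could look.

```python
import numpy as np


def checked_svd(a, check_finite=True, **kwargs):
    """Hypothetical wrapper (not part of NumPy): validate, then call svd."""
    a = np.asarray(a)
    # One full pass over the data; this is the speed cost being debated.
    if check_finite and not np.isfinite(a).all():
        raise ValueError("array must not contain infs or NaNs")
    # Callers who know their input is finite can pass check_finite=False
    # and get the plain np.linalg.svd behaviour.
    return np.linalg.svd(a, **kwargs)


x = np.array([[1.0, 0.0], [0.0, np.nan]])
try:
    checked_svd(x)
except ValueError as exc:
    # A clear message instead of a spurious non-convergence error.
    print(exc)
```

With something like this the nan case fails up front with a readable
message, and the check can still be skipped where speed matters.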

Skipper


>
> grep: we also use scipy.linalg.pinv in some cases
>
> Josef
>
>
> >
> > - Sebastian
> >
> >> """
> >> [...] the following numpy code hangs until I `kill -9` it.
> >>
> >> ```
> >> $ python runtests.py --shell
> >> $ python
> >> Python 2.7.5+
> >> [GCC 4.8.1] on linux2
> >> >>> import numpy as np
> >> >>> np.__version__
> >> '1.9.0.dev-e3f0f53'
> >> >>> A = np.array([[1e3, 0], [0, 1]])
> >> >>> B = np.array([[1e300, 0], [0, 1]])
> >> >>> C = np.array([[1e3000, 0], [0, 1]])
> >> >>> np.linalg.svd(A)
> >> (array([[ 1.,  0.],
> >>        [ 0.,  1.]]), array([ 1000.,     1.]), array([[ 1.,  0.],
> >>        [ 0.,  1.]]))
> >> >>> np.linalg.svd(B)
> >> (array([[ 1.,  0.],
> >>        [ 0.,  1.]]), array([  1.00000000e+300,   1.00000000e+000]),
> >> array([[ 1.,  0.],
> >>        [ 0.,  1.]]))
> >> >>> np.linalg.svd(C)
> >> [hangs forever]
> >> ```
> >> """
> >>
> >> Alex
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion
> >>
> >
> >
>