[Numpy-discussion] svd error checking vs. speed

josef.pktd at gmail.com
Sat Feb 15 17:08:31 EST 2014


On Sat, Feb 15, 2014 at 4:56 PM, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
> On Sat, 2014-02-15 at 16:37 -0500, alex wrote:
>> Hello list,
>>
>> Here's another idea resurrection from numpy github comments that I've
>> been advised could be posted here for re-discussion.
>>
>> The proposal would be to make np.linalg.svd more like scipy.linalg.svd
>> with respect to input checking.  The argument against the change is
>> raw speed; if you know that you will never feed non-finite input to
>> svd, then np.linalg.svd is a bit faster than scipy.linalg.svd.  An
>> argument for the change could be to avoid issues reported on github
>> like crashes, hangs, spurious non-convergence exceptions, etc. from
>> the undefined behavior of svd of non-finite input.
>>
>
> +1. Unless this is a huge speed penalty, correctness (and decent error
> messages) should come first in my opinion; this is Python after all. If
> this is a noticeable speed difference, a kwarg may be an option (but I
> would think about that some more).

maybe -1

statsmodels uses np.linalg.pinv, which uses svd.
I never heard of any crash (*), and the only time I compared with
scipy I didn't like the slowdown.
I didn't do any serious timings, just a few examples.

(*) not converged, ...

pinv(x.T).dot(x) -> pinv(x.T, please_don_t_check=True).dot(y)

Are there any numbers?

A grep shows we also use scipy.linalg.pinv in some cases.
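
For anyone who wants to produce such numbers, here is a minimal timing
sketch, not a proper benchmark. It assumes the check would look roughly
like np.asarray_chkfinite followed by the usual np.linalg.svd call; the
names svd_checked and time_svd are hypothetical and are not part of
numpy. The matrix sizes and repeat counts are arbitrary choices.

```python
import timeit

import numpy as np


def svd_checked(a):
    """Hypothetical wrapper: reject NaN/inf input, then call np.linalg.svd."""
    # np.asarray_chkfinite raises ValueError if the array contains NaN or inf.
    a = np.asarray_chkfinite(a)
    return np.linalg.svd(a)


def time_svd(n, repeat=3, number=10):
    """Return best per-call times (unchecked, checked) for an n x n matrix."""
    rng = np.random.RandomState(0)
    x = rng.standard_normal((n, n))
    raw = min(timeit.repeat(lambda: np.linalg.svd(x),
                            repeat=repeat, number=number))
    checked = min(timeit.repeat(lambda: svd_checked(x),
                                repeat=repeat, number=number))
    return raw / number, checked / number


if __name__ == "__main__":
    for n in (10, 100, 200):
        raw, checked = time_svd(n)
        print("n=%4d  unchecked %.3e s  checked %.3e s  ratio %.2f"
              % (n, raw, checked, checked / raw))
```

The check is a single O(n*m) pass over the input, so its relative cost
should be largest for small matrices, where the call overhead dominates,
and shrink as the matrix grows. Note that 1e3000 in the quoted example
overflows to inf as a Python float, so a check like this would raise
ValueError for that matrix C before LAPACK is ever called.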

Josef


>
> - Sebastian
>
>> """
>> [...] the following numpy code hangs until I `kill -9` it.
>>
>> ```
>> $ python runtests.py --shell
>> $ python
>> Python 2.7.5+
>> [GCC 4.8.1] on linux2
>> >>> import numpy as np
>> >>> np.__version__
>> '1.9.0.dev-e3f0f53'
>> >>> A = np.array([[1e3, 0], [0, 1]])
>> >>> B = np.array([[1e300, 0], [0, 1]])
>> >>> C = np.array([[1e3000, 0], [0, 1]])
>> >>> np.linalg.svd(A)
>> (array([[ 1.,  0.],
>>        [ 0.,  1.]]), array([ 1000.,     1.]), array([[ 1.,  0.],
>>        [ 0.,  1.]]))
>> >>> np.linalg.svd(B)
>> (array([[ 1.,  0.],
>>        [ 0.,  1.]]), array([  1.00000000e+300,   1.00000000e+000]),
>> array([[ 1.,  0.],
>>        [ 0.,  1.]]))
>> >>> np.linalg.svd(C)
>> [hangs forever]
>> ```
>> """
>>
>> Alex
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>