[Numpy-discussion] svd error checking vs. speed
josef.pktd at gmail.com
Sat Feb 15 17:08:31 EST 2014
On Sat, Feb 15, 2014 at 4:56 PM, Sebastian Berg
<sebastian at sipsolutions.net> wrote:
> On Sat, 2014-02-15 at 16:37 -0500, alex wrote:
>> Hello list,
>>
>> Here's another idea resurrection from numpy github comments that I've
>> been advised could be posted here for re-discussion.
>>
>> The proposal would be to make np.linalg.svd more like scipy.linalg.svd
>> with respect to input checking. The argument against the change is
>> raw speed; if you know that you will never feed non-finite input to
>> svd, then np.linalg.svd is a bit faster than scipy.linalg.svd. An
>> argument for the change could be to avoid issues reported on github
>> like crashes, hangs, spurious non-convergence exceptions, etc. from
>> the undefined behavior of svd of non-finite input.
>>
>
> +1. Unless this is a huge speed penalty, correctness (and decent error
> messages) should come first in my opinion; this is Python after all. If
> there is a noticeable speed difference, a kwarg may be an option (but I
> would think about that some more).
maybe -1
statsmodels is using np.linalg.pinv which uses svd
I have never heard of any crash (*), and the only time I compared with
scipy I didn't like the slowdown.
I didn't do any serious timings, just a few examples.
(*) not converged, ...
pinv(x.T).dot(x) -> pinv(x.T, please_don_t_check=True).dot(y)
numbers ?
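
To get some numbers, a rough sketch along these lines might do. It wraps
np.linalg.svd with an up-front np.isfinite check behind a keyword and times
both paths. The names svd_checked, check_finite and time_check_overhead are
made up for this illustration and are not part of np.linalg (scipy.linalg.svd
does have a check_finite keyword). The matrix sizes are arbitrary, and the
results will depend on the machine and the LAPACK build.

```python
import timeit

import numpy as np


def svd_checked(a, check_finite=True):
    """Call np.linalg.svd, optionally rejecting non-finite input first.

    Hypothetical helper for this sketch only; not a numpy API.
    """
    a = np.asarray(a, dtype=float)
    if check_finite and not np.isfinite(a).all():
        raise ValueError("array must not contain infs or NaNs")
    return np.linalg.svd(a)


def time_check_overhead(n, repeat=3, number=5):
    """Return (unchecked, checked) seconds per call for an n x n matrix."""
    rng = np.random.RandomState(0)
    x = rng.randn(n, n)  # finite input, so both paths do the same SVD work

    def best(check):
        # Best of `repeat` runs, each averaging over `number` calls.
        runs = timeit.repeat(
            lambda: svd_checked(x, check_finite=check),
            repeat=repeat,
            number=number,
        )
        return min(runs) / number

    return best(False), best(True)


if __name__ == "__main__":
    for n in (10, 100):  # assumed sizes, chosen only to keep the run short
        unchecked, checked = time_check_overhead(n)
        print("n=%d unchecked=%.3e s checked=%.3e s" % (n, unchecked, checked))
```

The check is a single O(n^2) pass over the input while the SVD itself is
O(n^3), so the relative cost of the check should be largest for small
matrices.
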
grep: we also use scipy.linalg.pinv in some cases
Josef
>
> - Sebastian
>
>> """
>> [...] the following numpy code hangs until I `kill -9` it.
>>
>> ```
>> $ python runtests.py --shell
>> $ python
>> Python 2.7.5+
>> [GCC 4.8.1] on linux2
>> >>> import numpy as np
>> >>> np.__version__
>> '1.9.0.dev-e3f0f53'
>> >>> A = np.array([[1e3, 0], [0, 1]])
>> >>> B = np.array([[1e300, 0], [0, 1]])
>> >>> C = np.array([[1e3000, 0], [0, 1]])
>> >>> np.linalg.svd(A)
>> (array([[ 1., 0.],
>>        [ 0., 1.]]), array([ 1000., 1.]), array([[ 1., 0.],
>>        [ 0., 1.]]))
>> >>> np.linalg.svd(B)
>> (array([[ 1., 0.],
>>        [ 0., 1.]]), array([ 1.00000000e+300, 1.00000000e+000]),
>> array([[ 1., 0.],
>>        [ 0., 1.]]))
>> >>> np.linalg.svd(C)
>> [hangs forever]
>> ```
>> """
>>
>> Alex
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>