[Numpy-discussion] svd error checking vs. speed

Mon Feb 17 04:49:34 EST 2014

alex <argriffi <at> ncsu.edu> writes:

> 
> Hello list,
> 
> Here's another idea resurrection from numpy github comments that I've
> been advised could be posted here for re-discussion.
> 
> The proposal would be to make np.linalg.svd more like scipy.linalg.svd
> with respect to input checking.  The argument against the change is
> raw speed; if you know that you will never feed non-finite input to
> svd, then np.linalg.svd is a bit faster than scipy.linalg.svd.  An
> argument for the change could be to avoid issues reported on github
> like crashes, hangs, spurious non-convergence exceptions, etc. from
> the undefined behavior of svd of non-finite input.
> 
> """
> [...] the following numpy code hangs until I `kill -9` it.
> 
> ```
> $ python runtests.py --shell
> $ python
> Python 2.7.5+
> [GCC 4.8.1] on linux2
> >>> import numpy as np
> >>> np.__version__
> '1.9.0.dev-e3f0f53'
> >>> A = np.array([[1e3, 0], [0, 1]])
> >>> B = np.array([[1e300, 0], [0, 1]])
> >>> C = np.array([[1e3000, 0], [0, 1]])
> >>> np.linalg.svd(A)
> (array([[ 1.,  0.],
>        [ 0.,  1.]]), array([ 1000.,     1.]), array([[ 1.,  0.],
>        [ 0.,  1.]]))
> >>> np.linalg.svd(B)
> (array([[ 1.,  0.],
>        [ 0.,  1.]]), array([  1.00000000e+300,   1.00000000e+000]),
> array([[ 1.,  0.],
>        [ 0.,  1.]]))
> >>> np.linalg.svd(C)
> [hangs forever]
> ```
> """
> 
> Alex
> 

I'm -1 on checking finiteness - if there's one place you usually want 
maximum performance it's linear algebra operations.

It certainly shouldn't crash or hang, though, and for me at least it doesn't: 
it returns NaN, which immediately suggests to me that I've got bad input 
(maybe just because I've seen it before).

I'm not sure adding an extra kwarg is worth cluttering up the API when a 
simple call to np.isfinite beforehand will do the job if you think you may 
have non-finite input.
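
For concreteness, here is a minimal sketch of the kind of up-front check I mean. The wrapper name `checked_svd` is hypothetical (it is not part of numpy), and raising ValueError is just one reasonable choice for how to report bad input:

```python
import numpy as np

def checked_svd(a, **kwargs):
    """Hypothetical wrapper: reject non-finite input, then call np.linalg.svd."""
    a = np.asarray(a)
    # np.isfinite is False for inf, -inf and nan, so .all() fails if any
    # element is non-finite.
    if not np.isfinite(a).all():
        raise ValueError("array must not contain infs or NaNs")
    return np.linalg.svd(a, **kwargs)

A = np.array([[1e3, 0], [0, 1]])
C = np.array([[1e3000, 0], [0, 1]])  # 1e3000 overflows to inf as a float64

u, s, vt = checked_svd(A)            # finite input goes straight through

try:
    checked_svd(C)                   # svd is never reached for this input
except ValueError as exc:
    print("rejected:", exc)
```

The check costs one extra pass over the array, and only the callers who ask for it pay that cost; plain np.linalg.svd stays as it is.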

Python 2.7.5 |Anaconda 1.8.0 (64-bit)| (default, Jul  1 2013, 12:37:52) [MSC v.1500 64 bit (AMD64)]

In [1]: import numpy as np

In [2]: >>> A = np.array([[1e3, 0], [0, 1]])
   ...: >>> B = np.array([[1e300, 0], [0, 1]])
   ...: >>> C = np.array([[1e3000, 0], [0, 1]])
   ...: >>> np.linalg.svd(A)
   ...: 
Out[2]: 
(array([[ 1.,  0.],
       [ 0.,  1.]]),
 array([ 1000.,     1.]),
 array([[ 1.,  0.],
       [ 0.,  1.]]))

In [3]: np.linalg.svd(B)
Out[3]: 
(array([[ 1.,  0.],
       [ 0.,  1.]]),
 array([  1.0000e+300,   1.0000e+000]),
 array([[ 1.,  0.],
       [ 0.,  1.]]))

In [4]: C
Out[4]: 
array([[ inf,   0.],
       [  0.,   1.]])

In [5]: np.linalg.svd(C)
Out[5]: 
(array([[ 0.,  1.],
       [ 1.,  0.]]),
 array([ nan,  nan]),
 array([[ 0.,  1.],
       [ 1.,  0.]]))

In [6]: np.__version__
Out[6]: '1.7.1'

Regards,
Dave