[Numpy-discussion] svd error checking vs. speed

josef.pktd at gmail.com josef.pktd at gmail.com
Mon Feb 17 10:41:33 EST 2014


On Mon, Feb 17, 2014 at 10:03 AM, alex <argriffi at ncsu.edu> wrote:
> On Mon, Feb 17, 2014 at 4:49 AM, Dave Hirschfeld <novin01 at gmail.com> wrote:
>> alex <argriffi <at> ncsu.edu> writes:
>>
>>>
>>> Hello list,
>>>
>>> Here's another idea resurrection from numpy github comments that I've
>>> been advised could be posted here for re-discussion.
>>>
>>> The proposal would be to make np.linalg.svd more like scipy.linalg.svd
>>> with respect to input checking.  The argument against the change is
>>> raw speed; if you know that you will never feed non-finite input to
>>> svd, then np.linalg.svd is a bit faster than scipy.linalg.svd.  An
>>> argument for the change could be to avoid issues reported on github
>>> like crashes, hangs, spurious non-convergence exceptions, etc. from
>>> the undefined behavior of svd of non-finite input.
>>>
>>> """
>>> [...] the following numpy code hangs until I `kill -9` it.
>>>
>>> ```
>>> $ python runtests.py --shell
>>> $ python
>>> Python 2.7.5+
>>> [GCC 4.8.1] on linux2
>>> >>> import numpy as np
>>> >>> np.__version__
>>> '1.9.0.dev-e3f0f53'
>>> >>> A = np.array([[1e3, 0], [0, 1]])
>>> >>> B = np.array([[1e300, 0], [0, 1]])
>>> >>> C = np.array([[1e3000, 0], [0, 1]])
>>> >>> np.linalg.svd(A)
>>> (array([[ 1.,  0.],
>>>        [ 0.,  1.]]), array([ 1000.,     1.]), array([[ 1.,  0.],
>>>        [ 0.,  1.]]))
>>> >>> np.linalg.svd(B)
>>> (array([[ 1.,  0.],
>>>        [ 0.,  1.]]), array([  1.00000000e+300,   1.00000000e+000]),
>>> array([[ 1.,  0.],
>>>        [ 0.,  1.]]))
>>> >>> np.linalg.svd(C)
>>> [hangs forever]
>>> ```
>>> """
>>>
>>> Alex
>>>
>>
>> I'm -1 on checking finiteness - if there's one place you usually want
>> maximum performance it's linear algebra operations.
>>
>> It certainly shouldn't crash or hang though and for me at least it doesn't -
>> it returns NaN
>
> btw when I use the python/numpy/openblas packaged for ubuntu, I also
> get NaN.  The infinite loop appears when I build numpy letting it use
> its lapack lite.  I don't know which LAPACK Josef uses to get the
> weird behavior he observes "13% cpu usage for a hanging process".

I use the official numpy release for development: Windows, 32-bit Python,
i.e. MinGW 3.5 and whatever old ATLAS the release includes.

A constant 13% CPU usage is 1/8th of my 8 virtual cores, i.e. one core.
If it were in a loop doing some work, the CPU usage would fluctuate
(between 12 and 13% in a busy loop).

+/- 1

Josef

>
> This is consistent with the scipy svd docstring describing its
> check_finite flag, where it warns "Disabling may give a performance
> gain, but may result in problems (crashes, non-termination) if the
> inputs do contain infinities or NaNs."  I think this caveat also
> applies to most numpy linalg functions that connect more or less
> directly to lapack.
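
For reference, the kind of input check being discussed can be sketched in a
few lines. This is only an illustration, not the code in either library:
`svd_checked` is a hypothetical wrapper name, and its `check_finite` flag
merely imitates the scipy flag quoted above. It rejects non-finite input
before anything reaches LAPACK.

```python
import numpy as np


def svd_checked(a, check_finite=True, **kwargs):
    """Hypothetical wrapper: refuse non-finite input before calling svd."""
    a = np.asarray(a)
    # One O(n) pass over the data; this is the cost the thread argues about.
    if check_finite and not np.isfinite(a).all():
        raise ValueError("array must not contain infs or NaNs")
    return np.linalg.svd(a, **kwargs)


A = np.array([[1e3, 0], [0, 1]])
C = np.array([[1e3000, 0], [0, 1]])  # the literal 1e3000 overflows to inf

u, s, vt = svd_checked(A)
print(s)

try:
    svd_checked(C)
except ValueError as exc:
    print("rejected:", exc)
```

Passing `check_finite=False` skips the scan and hands the array straight to
`np.linalg.svd`, so whatever the underlying LAPACK does with inf or NaN
(NaN results, an exception, or a hang) applies again. numpy also ships
`np.asarray_chkfinite`, which does the conversion and the check in one call.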


