PR to add get_array_bandwidth, issymmetric, ishermitian functions

Dear all,

There is now a PR for the scipy.linalg module adding the functions mentioned in the title. The idea is to provide a central place for these occasionally needed utilities, with some basic benchmarking available. Once these are in, we can work on the context-aware eig and solve functions much more easily, delegating each structure to the right algorithm.

Feedback is most welcome.

https://github.com/scipy/scipy/pull/14701

best,
ilhan
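For readers new to the terms: the bandwidth of a matrix is the pair (lower, upper) of offsets of the furthest nonzero diagonals below and above the main diagonal. A rough pure-NumPy sketch of what such helpers compute (the names follow the PR title; the actual scipy.linalg implementations are compiled and much faster):

```python
import numpy as np

def get_array_bandwidth(a):
    """Return the (lower, upper) bandwidth of a 2-D array: the offsets
    of the furthest nonzero sub- and super-diagonals."""
    rows, cols = np.nonzero(a)
    if rows.size == 0:               # all-zero matrix
        return 0, 0
    offsets = cols - rows            # diagonal offset of each nonzero entry
    return int(max(0, -offsets.min())), int(max(0, offsets.max()))

def issymmetric(a):
    """Exact (noise-free) symmetry check for a square 2-D array."""
    return a.shape[0] == a.shape[1] and np.array_equal(a, a.T)

A = np.diag([1, 2, 3]) + np.diag([4, 5], k=-1)   # lower bidiagonal
print(get_array_bandwidth(A))   # (1, 0)
print(issymmetric(A))           # False
```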

Update mail for the aforementioned PR.

After using the hint from Eric Moore, it got slightly faster, but there is probably more room for improvement.

Evgeni raised the issue of approximate checks instead of exact checks for zero, mostly for the use case of an array contaminated with numerical noise. I am not sure about the business case for it, but I'd like to fish for some feedback here too. The main issues I can see are that int types have no room for noise, and that the tolerance adjustment would be quite difficult to pull off, since deciding what constitutes noise would depend on the magnitudes of the nonzero entries.

Finally, though I started this in SciPy, I still have some doubts about whether this should go into NumPy instead. But we can do that anytime later.

If it looks OK, I'd like to get these in well before the 1.8 deadline, since I want to attempt to make the generic linalg functions solve, eig, and svd context aware. That is to say, using these functions, eig will dispatch to eigh or triangular solvers, linalg.solve will dispatch to different solvers depending on the structure, and so on.

Hence any feedback is welcome.

Best,
ilhan
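Schematically, the structure-aware dispatch described above could look something like this (a hypothetical sketch using NumPy's solvers as stand-ins; the real dispatch would live inside scipy.linalg and cover many more structures):

```python
import numpy as np

def smart_eigvals(a):
    """Hypothetical sketch of structure-aware dispatch: route to the
    cheaper Hermitian eigensolver when a structure check detects it."""
    if np.array_equal(a, a.conj().T):    # stand-in for ishermitian(a)
        return np.linalg.eigvalsh(a)     # specialized: real, sorted output
    return np.linalg.eigvals(a)          # general nonsymmetric solver

rng = np.random.RandomState(0)
x = rng.randn(4, 4)
w = smart_eigvals(x + x.T)    # symmetric input takes the eigvalsh path
print(w.dtype)                # float64
```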

> I am not sure about the business case for it but I'd like to fish for
> some feedback here too.
I hit this sort of thing often when working with floating-point data, for example with a simple left-and-right multiply of a matrix (by some matrix and its transpose, respectively) that should preserve symmetry:
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> x = rng.randn(10, 10)
>>> y = x @ x.T  # symmetric
>>> np.array_equal(y, y.T)
True
>>> p = rng.randn(10, 10)
>>> z = p @ y @ p.T  # this should still be symmetric
>>> np.array_equal(z, z.T)
False
>>> np.allclose(z, z.T)  # it is symmetric to numerical precision
True
>>> z[0, 1]
7.912658519091485
>>> z[1, 0]
7.9126585190914875
> The main issues I can see is that int types have no room for noise and also
> the tolerance adjustment would be quite difficult to pull off since it would
> depend on the magnitudes of the nonzero entries to decide what constitutes
> as noise.
I would vote for / expect `atol=0, rtol=1e-7` keyword arguments (maybe with different defaults) to have an API similar to `np.allclose`. I chose the `assert_allclose` defaults here because in practice I have found `atol=1e-8` to be a bit dangerous / to lead to silent bugs in people's code.

> Finally, though I started this in SciPy, I still have some doubts whether
> this should go into NumPy or not. But we can do that anytime later.
To me NumPy would be a more natural fit for this sort of thing, and I would enjoy having `np.testing.*` variants of these things, too, for code development in other downstream libraries.

My 2c,
Eric
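Concretely, the suggested API shape could be sketched like this (hypothetical signature and defaults mirroring the `np.allclose` conventions discussed above, not necessarily what the PR will adopt):

```python
import numpy as np

def issymmetric(a, atol=0.0, rtol=1e-7):
    """Hypothetical tolerance-aware symmetry check with np.allclose-style
    keywords; atol=rtol=0 falls back to an exact comparison."""
    if a.ndim != 2 or a.shape[0] != a.shape[1]:
        return False
    if atol == 0.0 and rtol == 0.0:
        return np.array_equal(a, a.T)    # exact check, safe for int dtypes
    return np.allclose(a, a.T, atol=atol, rtol=rtol)

rng = np.random.RandomState(0)
x = rng.randn(10, 10)
y = x @ x.T
p = rng.randn(10, 10)
z = p @ y @ p.T                              # symmetric only up to rounding noise
print(issymmetric(z, atol=0.0, rtol=0.0))    # exact comparison: False
print(issymmetric(z, atol=1e-8, rtol=1e-5))  # np.allclose defaults: True
```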

On Wed, Sep 15, 2021 at 5:55 PM Eric Larson <larson.eric.d@gmail.com> wrote:
> I am not sure about the business case for it but I'd like to fish for
> some feedback here too.
> I hit this sort of thing often working with floating point data, for
> example with a simple left-and-right multiplies of some matrix (and its
> transpose, respectively) that should preserve symmetry: ...
Yes, I think you convinced me about this. (For the array bandwidth I am still not quite sure, though; it is very difficult to pull off with noise while staying fast.)
> The main issues I can see is that int types have no room for noise and
> also the tolerance adjustment would be quite difficult to pull off since
> it would depend on the magnitudes of the nonzero entries to decide what
> constitutes as noise.
> I would vote for / expect `atol=0, rtol=1e-7` keyword arguments (maybe
> with different defaults) to have an API similar to `np.allclose`. Here I
> chose the defaults for `assert_allclose` because I have found in practice
> the `atol=1e-8` to be a bit dangerous / lead to silent bugs in people's
> code.
This is precisely why I am hesitating. I am a big fan of "people should be allowed to shoot themselves in the foot", but I would suggest the defaults be zero for both, to delegate all responsibility to the user. Because if we define them, it's going to get quite messy in the bug reports.
> Finally, though I started this in SciPy, I still have some doubts whether
> this should go into NumPy or not. But we can do that anytime later.
> To me NumPy would be a more natural fit for this sort of thing, and I
> would enjoy having `np.testing.*` variants of these things, too, for code
> development in other downstream libraries.
Yes, that would be my preference too, but I can't crack the C code no matter how hard I try (and I have tried quite a few times). So I think they would have to chime in.
> My 2c,
> Eric
Thanks!

Hi again,

I added the inexact comparison (rather shamelessly) via `np.allclose`, for two reasons: i) I'd do exactly the same thing for every dtype, had I implemented it myself; ii) allclose is probably already as fast as it gets, since rtol is a bit tricky to implement. I added some notes regarding this in the docstrings.

A passing comment: there are lots of low-hanging fruits in this collection of functions, mostly early exits and also zero-detection for rtol. I'll make a PR for that soon.

For get_array_bandwidth I think it is not a good idea (loosely held!), since it is even trickier and with very little return. Instead, cleaning up with a[abs(a) < tol] = 0. and then doing a bandwidth sweep is way faster (on my machine). For sym/her detection there is no one-liner like that, other than intensive ops of the (A + A.T) / 2 type, hence I left it out.

As usual, feedback welcome.

ilhan
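The cleanup-then-sweep idea mentioned above (zero out sub-tolerance entries, then look for the furthest nonzero diagonals) can be sketched as follows (a pure-NumPy illustration, not the PR's compiled implementation):

```python
import numpy as np

def noisy_bandwidth(a, tol=1e-12):
    """Zero out sub-tolerance noise, then sweep the nonzeros for the
    (lower, upper) bandwidth -- the cleanup-then-sweep approach."""
    b = a.copy()                     # don't clobber the caller's array
    b[np.abs(b) < tol] = 0.0
    rows, cols = np.nonzero(b)
    if rows.size == 0:
        return 0, 0
    offsets = cols - rows            # diagonal offset of each nonzero
    return int(max(0, -offsets.min())), int(max(0, offsets.max()))

A = np.triu(np.ones((5, 5))) - np.triu(np.ones((5, 5)), k=2)  # upper bidiagonal
A += 1e-15 * np.ones((5, 5))         # contaminate with numerical noise
print(noisy_bandwidth(A, tol=0.0))   # noise makes it look dense: (4, 4)
print(noisy_bandwidth(A))            # (0, 1) after cleanup
```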
participants (2)
- Eric Larson
- Ilhan Polat