[Numpy-discussion] is_triangular, is_diagonal, is_symmetric et al. in NumPy or SciPy linalg

Ilhan Polat ilhanpolat at gmail.com
Tue Jun 29 18:21:55 EDT 2021

Dear all,

I'm writing some helper Cython functions for scipy.linalg which are fairly
performant and usable. And there is still quite some wiggle room for more.

In many linalg routines there is a lot of performance benefit if the
structure can be discovered in a cheap and reliable way at the outset. For
example, if the array is symmetric then eig can delegate to eigh, or if it is
triangular then triangular solvers can be used in linalg.solve, lstsq, and
so forth.
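
As a rough sketch of the kind of dispatch meant here (this is not the actual
scipy.linalg code; the helper name is hypothetical, and the full elementwise
comparison stands in for whatever cheap structure check would really be used):

```python
import numpy as np

def eigvals_with_structure_check(A):
    """Hypothetical helper: use the symmetric/Hermitian solver when A allows it."""
    A = np.asarray(A)
    # Exact elementwise test for A == A^H; a cheaper compiled check
    # would take the place of this comparison.
    if np.array_equal(A, A.conj().T):
        return np.linalg.eigvalsh(A), "eigvalsh"
    return np.linalg.eigvals(A), "eigvals"

w, path = eigvals_with_structure_check([[2.0, 1.0], [1.0, 2.0]])
print(path, w)
```

The second return value only reports which path was taken, so the effect of
the check is visible when trying it out.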

Here is the Cythonized version, ready to paste into a Jupyter notebook, that
discovers the lower/upper bandwidth of a square array A. It competes well
with A != 0, which I use here just as a low-level baseline (note that the
latter returns an array, hence more cost is involved). There is a
higher-level supervisor function that checks C-contiguousness and otherwise
dispatches to different specializations of it.
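
The supervisor is not shown in this message. One way such a wrapper could
look is sketched below; everything in it is an assumption for illustration.
_bandwidth_c is a NumPy stand-in for the compiled kernel, and the
Fortran-order branch relies on the fact that transposing an array swaps its
lower and upper bandwidths.

```python
import numpy as np

def _bandwidth_c(A):
    """Stand-in for the compiled kernel; expects a C-contiguous square array."""
    rows, cols = np.nonzero(A)
    if rows.size == 0:
        return 0, 0
    d = rows - cols  # positive below the diagonal, negative above it
    return int(max(d.max(), 0)), int(max((-d).max(), 0))

def bandwidth(A):
    """Hypothetical supervisor: pick a code path based on memory layout."""
    A = np.asarray(A)
    if A.ndim != 2 or A.shape[0] != A.shape[1]:
        raise ValueError("expected a square 2-D array")
    if A.flags.c_contiguous:
        return _bandwidth_c(A)
    if A.flags.f_contiguous:
        # A.T is a C-contiguous view; its lower band is A's upper band.
        upper, lower = _bandwidth_c(A.T)
        return lower, upper
    # Neither layout: fall back to a contiguous copy.
    return _bandwidth_c(np.ascontiguousarray(A))
```

The Fortran-order branch avoids a copy, which matters because the whole point
of the check is to stay cheap relative to the solver that follows.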

Initial cell

%load_ext Cython
%load_ext line_profiler
import cython
import line_profiler

Then another cell

%%cython
# cython: language_level=3
# cython: linetrace=True
# cython: binding = True
# distutils: define_macros=CYTHON_TRACE=1
# distutils: define_macros=CYTHON_TRACE_NOGIL=1

cimport cython
cimport numpy as cnp
import numpy as np
import line_profiler
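# NOTE: the member list of this fused type is missing from the archived
# message; the types below are an assumed, minimal set for illustration.
ctypedef fused np_numeric_t:
    cnp.int32_t
    cnp.int64_t
    cnp.float32_t
    cnp.float64_t
    cnp.complex64_t
    cnp.complex128_t

cpdef inline (int, int) band_check_internal(np_numeric_t[:, ::1] A):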
    cdef Py_ssize_t n = A.shape[0], lower_band = 0, upper_band = 0, r, c
    cdef np_numeric_t zero = 0

    for r in range(n):
        # Only bother if outside the existing band. Scanning from the
        # outermost column inwards, the first nonzero sets the new band.
        for c in range(r - lower_band):
            if A[r, c] != zero:
                lower_band = r - c
                break

        for c in range(n - 1, r + upper_band, -1):
            if A[r, c] != zero:
                upper_band = c - r
                break

    return lower_band, upper_band

Final cell for use-case ---------------

# Make arbitrary lower-banded array
n = 50 # array size
k = 3 # k'th subdiagonal
R = np.zeros([n, n], dtype=np.float32)
R[[x for x in range(n)], [x for x in range(n)]] = 1
R[[x for x in range(n-1)], [x for x in range(1,n)]] = 1
R[[x for x in range(1,n)], [x for x in range(n-1)]] = 1
R[[x for x in range(k,n)], [x for x in range(n-k)]] = 2
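
For cross-checking the compiled function, a brute-force pure-Python reference
can be handy. The sketch below is only a checking aid under my own naming
(bandwidth_reference is hypothetical); it scans every entry and makes no
attempt to be fast. It builds a small nested-list matrix with the same band
pattern as R above.

```python
def bandwidth_reference(A):
    """Return (lower, upper) bandwidth of a square 2-D array-like A."""
    n = len(A)
    lower = upper = 0
    for r in range(n):
        for c in range(n):
            if A[r][c] != 0:
                # r - c > 0 below the diagonal, c - r > 0 above it
                lower = max(lower, r - c)
                upper = max(upper, c - r)
    return lower, upper

# Same pattern as R: main, first super-, first sub- and k'th subdiagonal
n, k = 8, 3
M = [[0.0] * n for _ in range(n)]
for i in range(n):
    M[i][i] = 1.0
    if i + 1 < n:
        M[i][i + 1] = 1.0
        M[i + 1][i] = 1.0
    if i >= k:
        M[i][i - k] = 2.0

print(bandwidth_reference(M))  # (3, 1)
```

Since it only uses A[r][c] indexing, it accepts a NumPy array such as R as
well as nested lists.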

Some very haphazardly put together metrics

%timeit band_check_internal(R)
2.59 µs ± 84.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.linalg.solve(R, zzz)
824 µs ± 6.24 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit R != 0.
1.65 µs ± 43.1 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

So the worst-case cost is negligible in general compared to the solve
itself. (Note that the given code is slower because it uses the fused type;
if I go with Tempita, the standalone version is faster.)

Two questions:

1) This is missing np.half/float16 functionality since any arithmetic with
float16 might not be reliable, including the nonzero check. Is it safe to
view it as np.uint16 and use that specialization? I'm not sure about the
sign bit, hence the question. I can leave this out since almost all of the
linalg suite rejects this datatype due to the well-known lack of support.
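
To make the sign-bit concern concrete, here is a small sketch using only the
standard-library struct module, whose "e" format is IEEE 754 binary16 (the
same layout as np.float16). The helper names and the masking idea are
assumptions for illustration, not something NumPy or SciPy provides.

```python
import struct

def half_bits(x):
    """Return the IEEE 754 binary16 bit pattern of x as an unsigned 16-bit int."""
    return struct.unpack("<H", struct.pack("<e", x))[0]

SIGN_MASK = 0x7FFF  # clears the sign bit, keeps exponent and fraction bits

def is_nonzero_half_bits(bits):
    """Nonzero test on the raw bits that treats +0.0 and -0.0 alike."""
    return (bits & SIGN_MASK) != 0

print(hex(half_bits(0.0)))   # 0x0
print(hex(half_bits(-0.0)))  # 0x8000
print(is_nonzero_half_bits(half_bits(-0.0)))  # False
```

Negative zero has only the sign bit set, so a plain "bits != 0" test on the
integer view would report it as nonzero, whereas masking the sign bit first
agrees with the floating-point comparison -0.0 != 0 being False.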

2) Should this be in NumPy or SciPy linalg? It is quite relevant to SciPy,
but then again this stuff is purely about array structures. If the opinion
is for NumPy then I would need a volunteer because the NumPy codebase flies
way above my head.

All feedback welcome
