[Numpy-discussion] linalg.det for fractions

Oscar Benjamin oscar.j.benjamin at gmail.com
Sun May 16 15:16:11 EDT 2021


On Sun, 16 May 2021 at 10:46, Eric Wieser <wieser.eric+numpy at gmail.com> wrote:
>
> Numpy implements linalg.det by going through LAPACK, which only knows about f4, f8, c8, and c16 data types.
>
> Your request amounts to wanting an `O` dtype implementation. I think this is a totally reasonable request as we already have such an implementation for `np.matmul`; but it won't be particularly easy to implement or fast, especially as it won't be optimized for fractions specifically.
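
For what it's worth, the existing object-dtype matmul does stay exact
for fractions, since it dispatches to Python-level + and * (a quick
illustration):

In [1]: import numpy as np

In [2]: from fractions import Fraction

In [3]: A = np.array([[Fraction(1, 2), Fraction(1, 3)],
   ...:               [Fraction(1, 5), Fraction(1, 7)]], dtype=object)

In [4]: A @ A
Out[4]:
array([[Fraction(19, 60), Fraction(3, 14)],
       [Fraction(9, 70), Fraction(64, 735)]], dtype=object)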

Computing determinants is somewhat different from matrix
multiplication in that the best algorithm depends on the nature of
the "dtype": is the arithmetic exact or approximate? Is division
possible, and is it desirable to use? There are division-free and
fraction-free algorithms, and different pivoting strategies depending
on whether you are trying to control bit growth or rounding error.

What assumptions would an `O` dtype implementation of det make? Should
it be allowed to use division? If so, what kind of properties can it
assume about divisibility (e.g. true or floor division)?
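
Note that Python's true division is exact for Fraction objects but
silently coerces plain ints to float, so an object loop that used /
would lose exactness for integer entries (a quick sketch):

In [1]: from fractions import Fraction

In [2]: Fraction(1, 3) / Fraction(1, 7)  # exact
Out[2]: Fraction(7, 3)

In [3]: 1 / 3  # int -> float: exactness silently lost
Out[3]: 0.3333333333333333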

> Some other options for you would be to:
>
> * use sympy's matrix operations; fractions are really just "symbolics lite"
> * Extract a common denominator from your matrix, convert the numerators to float64, and hope you don't exceed 2**52 in the result.

Is the float64 algorithm exact for integer-valued float64 input? It
doesn't look like it:

In [33]: a = np.array([[1, 2], [3, 4]], np.float64)

In [34]: np.linalg.det(a)
Out[34]: -2.0000000000000004

Except for very small matrices and simple fractions, the chances of
overflow are actually quite high: just extracting the common
denominator can easily make the remaining numerators overflow. It is
also possible for the inputs and outputs to be in range while
intermediate calculations have much larger magnitudes.
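
As a rough illustration (the entries here are arbitrary small
fractions with coprime denominators), even a 3x3 matrix can have a
common denominator near 2**56, so the largest cleared numerator
already exceeds the 2**53 below which float64 represents integers
exactly:

In [1]: from fractions import Fraction

In [2]: from math import lcm  # Python >= 3.9

In [3]: F = [[Fraction(1, 97), Fraction(2, 89), Fraction(3, 83)],
   ...:      [Fraction(4, 79), Fraction(5, 73), Fraction(6, 71)],
   ...:      [Fraction(7, 67), Fraction(8, 61), Fraction(9, 59)]]

In [4]: d = lcm(*[f.denominator for row in F for f in row])

In [5]: max(int(f * d) for row in F for f in row).bit_length()
Out[5]: 54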

> You could improve the second option a little by implementing (and PRing) an integer loop for `det`, which would be somewhat easier than implementing the object loop.

For integers and other non-field PIDs, the fraction-free Bareiss
algorithm is typically used to control bit growth:
https://en.wikipedia.org/wiki/Bareiss_algorithm
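
For a flavour of how it works, here is a minimal pure-Python sketch
over plain ints (the function name and the row-swap pivoting are just
illustrative choices):

def bareiss_det(M):
    # Fraction-free determinant of a square integer matrix. Every
    # floor division below is in fact an exact division, which is
    # what keeps the intermediate entries from exploding in size.
    A = [list(row) for row in M]
    n = len(A)
    sign, prev = 1, 1
    for k in range(n - 1):
        if A[k][k] == 0:
            # Find a nonzero pivot below and swap rows (flips the sign).
            pivot = next((i for i in range(k + 1, n) if A[i][k] != 0), None)
            if pivot is None:
                return 0
            A[k], A[pivot] = A[pivot], A[k]
            sign = -sign
        for i in range(k + 1, n):
            for j in range(k + 1, n):
                A[i][j] = (A[i][j] * A[k][k] - A[i][k] * A[k][j]) // prev
        prev = A[k][k]
    return sign * A[-1][-1]

For a matrix of Fractions you would first multiply through by the
common denominator d, run this on the integer matrix, and divide the
result by d**n at the end.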

If you're looking for an off-the-shelf library that already does this
from Python then I suggest sympy. Example:

In [15]: import sympy as sym, random

In [16]: rand_fraction = lambda: sym.Rational(random.randint(-10, 10),
    ...:                                      random.randint(1, 10))

In [17]: M = sym.Matrix([[rand_fraction() for _ in range(5)] for _ in range(5)])

In [18]: M
Out[18]:
⎡8/3     4    -1/3  -4/7  -6/5⎤
⎢                             ⎥
⎢ 0      0    7/5    3    -6/5⎥
⎢                             ⎥
⎢-6/5  -9/4   -1/5   -1   9/4 ⎥
⎢                             ⎥
⎢ 5    -9/10  -9/2  -3/2   -2 ⎥
⎢                             ⎥
⎣ 1    -5/6   5/7   9/8   -2/3⎦

In [19]: M.det()
Out[19]:
-201061211
───────────
  2205000

Another option is python-flint, which is not so easy to set up and
use but wraps the Flint library, the fastest I know of for this sort
of thing.

Here is flint computing the exact determinant of a 1000x1000 matrix of
small bit-count rationals:

In [37]: import flint

In [38]: rand_fraction = lambda: flint.fmpq(random.randint(-10, 10),
    ...:                                    random.randint(1, 10))

In [39]: M = flint.fmpq_mat([[rand_fraction() for _ in range(1000)]
    ...:                     for _ in range(1000)])

In [40]: %time M.det()
CPU times: user 25 s, sys: 224 ms, total: 25.2 s
Wall time: 26 s

Although the inputs are all small fractions like 3/7, 1/2, etc., the
repr of the output is too long to show in this email:

In [41]: len(str(_))
Out[41]: 8440


--
Oscar

