[Numpy-discussion] deterministic, reproducible matmul / __matmul__

Jason Newton nevion at gmail.com
Mon Jul 11 13:01:49 EDT 2016


Hello

I'm a long-time user of numpy, but one issue I've had with it is
reproducing the results of a floating-point matrix multiplication in
other languages/modules (like C, or on a GPU), or across
installations.  I take great pains in doing this type of work because
it allows me both to prototype with python/numpy and to use it, in a
fairly strong/useful capacity, as a strict reference implementation.
For me, small differences accumulate until allclose-style comparisons
start failing after a few iterations of an algorithm; things diverge.
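
To make the divergence concrete, here is a quick sketch (nothing
beyond plain np.sum is assumed) showing that merely reordering the
terms of a single dot product can change the last bits of the result:

    import numpy as np

    rng = np.random.RandomState(0)
    x = rng.randn(100000)
    y = rng.randn(100000)

    forward = np.sum(x * y)           # one reduction order
    reverse = np.sum((x * y)[::-1])   # same terms, reversed

    print(forward == reverse)   # often False: fp addition isn't associative
    print(forward - reverse)    # typically a few ULPs of difference

A few ULPs per product is exactly the kind of seed that grows past
allclose tolerances after enough iterations.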

I've had success with einsum before in some cases (by chance), where
no difference was ever observed between it and Eigen (C++), but I'm
not sure I should rely on that any longer.  The new @ operator is very
tempting to use in prototyping, too.
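
On a single machine one can at least check whether the different
spellings agree bitwise.  This is only a sketch; none of it is
guaranteed to hold across installations or BLAS backends:

    import numpy as np

    rng = np.random.RandomState(0)
    A = rng.randn(64, 64)
    B = rng.randn(64, 64)

    via_dot    = np.dot(A, B)
    via_at     = A @ B                         # __matmul__
    via_einsum = np.einsum('ij,jk->ik', A, B)

    print(np.array_equal(via_dot, via_at))      # usually True (same BLAS gemm)
    print(np.array_equal(via_dot, via_einsum))  # may be False: einsum has
                                                # its own reduction loop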

Does the ML have any ideas on how one could get a matmul that will not
allow any funny business in the evaluation of the products?  Funny
business here means things like reordering the additions of terms.  I
want strict IEEE 754 compliance: no 80-bit registers, no unsafe math
optimizations, and perhaps control of the rounding mode.
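
What I mean by "no funny business" is essentially the naive triple
loop with a fixed left-to-right accumulation, as in this deliberately
slow reference sketch:

    import numpy as np

    def reference_matmul(A, B):
        # Strict left-to-right accumulation per output element:
        # no blocking, no vectorization, no reassociation.
        n, k = A.shape
        k2, m = B.shape
        assert k == k2
        C = np.zeros((n, m), dtype=np.float64)
        for i in range(n):
            for j in range(m):
                acc = 0.0
                for p in range(k):
                    acc += A[i, p] * B[p, j]
                C[i, j] = acc
        return C

That is the evaluation-order contract I'd like, just at a usable
speed.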

I'm definitely willing to sacrifice performance (especially
multithreading-based enhancements, which already cause problems with
reduction ordering) in order to get these guarantees.  Looking around,
I found a few BLAS implementations that might be worth a mention;
comments on these would also be welcome (a toy sketch in the same
spirit follows the links):

http://bebop.cs.berkeley.edu/reproblas/
https://exblas.lip6.fr/
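
In the same spirit as those libraries (correctly rounded, and
therefore order-independent, summation), here is a toy sketch using
math.fsum from the standard library.  It is far too slow for real
work, but each element's sum is exactly rounded, so the result does
not depend on term order:

    import math
    import numpy as np

    def fsum_matmul(A, B):
        # math.fsum computes the exact sum of its inputs and rounds
        # once, so permuting the k products leaves C[i, j] unchanged.
        n, k = A.shape
        _, m = B.shape
        C = np.empty((n, m), dtype=np.float64)
        for i in range(n):
            for j in range(m):
                C[i, j] = math.fsum(A[i, p] * B[p, j] for p in range(k))
        return C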


-Jason


