
On Wed, May 31, 2023 at 12:28 PM Chris Sidebottom <chris.sidebottom@arm.com> wrote:
Matthew Brett wrote:
Hi,

On Wed, May 31, 2023 at 8:40 AM Matti Picus <matti.picus@gmail.com> wrote:
On 31/5/23 09:33, Jerome Kieffer wrote:

Hi Sebastian,

I had a quick look at the PR and it looks like you re-implemented the sin-cos function using SIMD. I wonder how it compares with SLEEF (header only library, CPU-architecture agnostic SIMD implementation of transcendental functions with precision validation). SLEEF is close to the Intel SVML library in spirit but extended to multi-architecture (tested on PowerPC and ARM for example). This is just curiosity ...

Like Juan, I am afraid of this change since my code, which depends on numpy for sin/cos used for rotation is likely to see large change of behavior.

Cheers,
Jerome

I think we should revert the changes. They have proved to be disruptive, and I am not sure the improvement is worth the cost. The reversion should add a test that cements the current user expectations. The path forward is a different discussion, but for the 1.25 release I think we should revert.

Is there a way to make the changes opt-in for now, while we go back to see if we can improve the precision?
This would be similar to the approach libmvec is taking (https://sourceware.org/glibc/wiki/libmvec), adding the `--disable-mathvec` option, although they favour the 4ULP variants rather than the higher accuracy ones by default. If someone can advise as to the most appropriate place for such a toggle, I can look into adding it; I would prefer the default to be 4ULP to match libc, though.
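For readers unfamiliar with the "4ULP" figure, here is a minimal sketch of how an error in units in the last place could be counted for float32 values, using only the Python standard library. The helper names `float32_bits` and `ulp_distance` are hypothetical and chosen for illustration; this is not how NumPy or libmvec validate their accuracy, only one common way to count the representable float32 values between two numbers.

```python
import math
import struct


def float32_bits(x: float) -> int:
    """Round x to float32 and return an integer ordered like the float32 value.

    Hypothetical helper for illustration; assumes x is finite.
    """
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    # Negative floats have the sign bit set; mirror them so that integer
    # order matches float order across zero.
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a: float, b: float) -> int:
    """Number of representable float32 steps between a and b."""
    return abs(float32_bits(a) - float32_bits(b))


# Build a value that is exactly four float32 steps above sin(1.0).
reference = math.sin(1.0)
(ref_bits,) = struct.unpack("<i", struct.pack("<f", reference))
(off_by_four,) = struct.unpack("<f", struct.pack("<i", ref_bits + 4))

# A result like off_by_four would sit right at a 4 ULP error bound.
within_4ulp = ulp_distance(reference, off_by_four) <= 4
```

Under this measure, a "4ULP" variant is one whose results may land up to four representable values away from the correctly rounded answer.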
We have a build-time toggle for SVML (`disable-svml` in `meson_options.txt` and an `NPY_DISABLE_SVML` environment variable for the distutils build). This one should look similar I think - and definitely not a separate Python API with `np.fastmath` or similar. The flag can then default to the old (higher-precision, slower) behavior for <2.0, and the fast version for >=2.0 somewhere halfway through the 2.0 development cycle - assuming the tweak in precision that Sebastian suggests is possible will remove the worst accuracy impacts that have now been identified.
The `libmvec` link above is not conclusive, it seems to me, Chris, given that the examples specify that one only gets the faster version with `-ffast-math`, hence it's off by default.

Cheers,
Ralf