On 11/10/21 11:05 pm, Jerome Kieffer wrote:
On Mon, 11 Oct 2021 18:04:58 +0300 Matti Picus <matti.picus@gmail.com> wrote:
As SciPy already found out, some downstream libraries may need to tweak their tolerances for some functions as a result of this PR. We wanted to put it in early enough in the release cycle so that we can back it out fully or partially if the accuracy degradation is too large, so please speak up if you notice anything strange.

Thanks for warning in advance... now we need to find some computers to test those versions. Do you know if it works "the same" with AVX2? Most computers have AVX2, and for now you need the latest servers to test AVX512.
Cheers,
Jerome
Short answer: the code path should be exactly the same on machines without AVX512 before and after this PR.

Long answer: The use of intrinsics for ufunc loops is mostly described in the docs [0]. When calling a ufunc loop, a dispatch mechanism chooses the appropriate compiled loop for the available intrinsics on the system. You can see which intrinsics are supported on your installation of NumPy (new for 1.22) by using numpy.show_config(). The last few rows show which intrinsics are built into numpy and can possibly be used, and which subset is detected and will be used. This means we ship multiple variants of loops, and only one set will be used on each machine. So a machine without AVX512 will continue to use whatever loop it used before this PR.

Matti

[0] https://numpy.org/devdocs/reference/simd/simd-optimizations.html
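
To illustrate the dispatch behaviour described above, here is a minimal sketch in plain Python. It is not NumPy's implementation (the real dispatcher is written in C), and every name in it (pick_loop, the variant labels, the feature sets chosen for each machine) is hypothetical. It only shows why adding an AVX512 variant leaves the choice on an AVX2-only machine unchanged.

```python
# Conceptual sketch only; all names and feature lists are hypothetical and
# are not taken from NumPy's source.

# Loop variants compiled into the build, ordered from most to least
# specialized. Each entry is (label, CPU features the variant requires).
VARIANTS_BEFORE_PR = [
    ("avx2_loop", {"AVX2", "FMA3"}),
    ("baseline_loop", set()),
]
# Assumption: the PR adds one AVX512 variant ahead of the existing ones.
VARIANTS_AFTER_PR = [("avx512_loop", {"AVX512F"})] + VARIANTS_BEFORE_PR


def pick_loop(variants, detected):
    """Return the first variant whose required features are all detected."""
    for label, required in variants:
        if required <= detected:  # subset test: every requirement is present
            return label
    raise RuntimeError("no usable loop variant")


avx2_machine = {"SSE2", "AVX2", "FMA3"}
avx512_machine = avx2_machine | {"AVX512F"}

# On the AVX2-only machine the selected loop is the same before and after.
print(pick_loop(VARIANTS_BEFORE_PR, avx2_machine))
print(pick_loop(VARIANTS_AFTER_PR, avx2_machine))
# Only a machine that reports AVX512 picks up the new variant.
print(pick_loop(VARIANTS_AFTER_PR, avx512_machine))
```

To see what your own installation would select from, numpy.show_config() (1.22 and later) reports the built-in and detected intrinsics, as mentioned above.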