[Numpy-discussion] NEP 38 - Universal SIMD intrinsics
Matti Picus
matti.picus at gmail.com
Wed Feb 12 02:19:14 EST 2020
On 11/2/20 8:02 pm, Devulapalli, Raghuveer wrote:
>
> On top of that the performance implications aren’t clear. Software
> implementations of hardware instructions might perform worse and might
> not even produce the same result.
>
The proposal for universal intrinsics does not enable replacing an
intrinsic on one platform with a software emulation on another: the
intrinsics are meant to be compile-time defines that overlay the
universal intrinsic with a platform-specific one. In order to use a new
intrinsic, it must have parallel intrinsics on the other platforms, or
it cannot be used there: "NPY_CPU_HAVE(FEATURE_NAME)" will always return
false on those platforms, so the compiler will not even build a loop for
them. I will try to clarify that intention in the NEP.
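
To make the idea concrete, here is a minimal sketch of what "compile-time
defines that overlay the universal intrinsic with a platform-specific one"
could look like. Every uv_/UV_ name below is hypothetical and invented for
this illustration; it is not the API proposed in the NEP. UV_SIMD plays a
role loosely analogous to a feature check: where no parallel intrinsic
exists, the vector loop is simply not compiled and only the plain scalar
loop remains, with no software emulation of the missing instruction.

```c
#include <stddef.h>

/* Hypothetical "universal" layer: all uv_ and UV_ names are assumptions
   made up for this sketch, not NumPy's actual macros. */
#if defined(__SSE2__)
  #include <emmintrin.h>
  #define UV_SIMD 1
  #define UV_NLANES 4                      /* 4 x float32 per 128-bit register */
  typedef __m128 uv_f32;
  #define uv_load_f32(p)     _mm_loadu_ps(p)
  #define uv_store_f32(p, v) _mm_storeu_ps((p), (v))
  #define uv_add_f32(a, b)   _mm_add_ps((a), (b))
#elif defined(__ARM_NEON)
  #include <arm_neon.h>
  #define UV_SIMD 1
  #define UV_NLANES 4
  typedef float32x4_t uv_f32;
  #define uv_load_f32(p)     vld1q_f32(p)
  #define uv_store_f32(p, v) vst1q_f32((p), (v))
  #define uv_add_f32(a, b)   vaddq_f32((a), (b))
#else
  /* No parallel intrinsic on this platform: the vector loop is not built. */
  #define UV_SIMD 0
#endif

/* Element-wise out[i] = a[i] + b[i] for n float32 values. */
void add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if UV_SIMD
    /* Vector loop, written once against the universal names. */
    for (; i + UV_NLANES <= n; i += UV_NLANES) {
        uv_f32 va = uv_load_f32(a + i);
        uv_f32 vb = uv_load_f32(b + i);
        uv_store_f32(out + i, uv_add_f32(va, vb));
    }
#endif
    /* Scalar loop: handles the tail, or everything when UV_SIMD is 0. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

The loop body is written once; the preprocessor decides which hardware
instruction each uv_ name expands to, so the same arithmetic is performed
on every platform that compiles the vector path.
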
I hope there will not be a demand to use many non-universal intrinsics
in ufuncs; we will need to work this out on a case-by-case basis in each
ufunc. Does that sound reasonable? Are there intrinsics you have already
used that have no parallel on other platforms?
Matti