[Numpy-discussion] NEP 38 - Universal SIMD intrinsics

Matti Picus matti.picus at gmail.com
Wed Feb 12 02:19:14 EST 2020


On 11/2/20 8:02 pm, Devulapalli, Raghuveer wrote:
>
> On top of that the performance implications aren’t clear. Software 
> implementations of hardware instructions might perform worse and might 
> not even produce the same result.
>

The proposal for universal intrinsics does not allow replacing an 
intrinsic on one platform with a software emulation on another: the 
universal intrinsics are meant to be compile-time defines, each of which 
maps to a platform-specific intrinsic. For a new intrinsic to be used 
universally, it must have a parallel intrinsic on each of the other 
platforms. On a platform where it has none, it cannot be used: 
"NPY_CPU_HAVE(FEATURE_NAME)" will always evaluate to false there, so the 
compiler will not even build a loop for that platform. I will try to 
clarify that intention in the NEP.
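
To make the compile-time overlay concrete, here is a minimal sketch of 
the idea. It is not the NEP's actual API: the names UNI_HAVE_SIMD, 
uni_f32, uni_load_f32, uni_add_f32, uni_store_f32 and add_f32 are all 
hypothetical and chosen only for illustration. Each universal name is a 
define that maps to one real platform intrinsic. Where no mapping 
exists, the feature check is 0, the vector loop is never compiled, and 
only the ordinary scalar loop remains.

```c
#include <stddef.h>

/* All UNI_* / uni_* names are hypothetical, for illustration only. */
#if defined(__SSE2__)
  #include <emmintrin.h>
  #define UNI_HAVE_SIMD 1
  typedef __m128 uni_f32;                      /* 4 x float */
  #define uni_load_f32(p)     _mm_loadu_ps(p)
  #define uni_add_f32(a, b)   _mm_add_ps((a), (b))
  #define uni_store_f32(p, v) _mm_storeu_ps((p), (v))
#elif defined(__ARM_NEON)
  #include <arm_neon.h>
  #define UNI_HAVE_SIMD 1
  typedef float32x4_t uni_f32;                 /* 4 x float */
  #define uni_load_f32(p)     vld1q_f32(p)
  #define uni_add_f32(a, b)   vaddq_f32((a), (b))
  #define uni_store_f32(p, v) vst1q_f32((p), (v))
#else
  /* No parallel intrinsics here: the check is false at compile time. */
  #define UNI_HAVE_SIMD 0
#endif

/* Element-wise out[i] = a[i] + b[i] for n floats. */
static void add_f32(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
#if UNI_HAVE_SIMD
    /* Vector loop: only built when the platform maps every uni_* name. */
    for (; i + 4 <= n; i += 4) {
        uni_f32 va = uni_load_f32(a + i);
        uni_f32 vb = uni_load_f32(b + i);
        uni_store_f32(out + i, uni_add_f32(va, vb));
    }
#endif
    /* Plain scalar loop: handles the tail, or everything without SIMD. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

Note that the scalar loop here is ordinary C, not an emulation of a 
vector instruction. On an unsupported platform the vector branch simply 
does not exist in the compiled code.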


I hope there will not be demand for many non-universal intrinsics in 
ufuncs. Where there is, we will need to work it out on a case-by-case 
basis in each ufunc. Does that sound reasonable? Are there intrinsics 
you have already used that have no parallel on other platforms?


Matti


