Enhancement for AArch64 SVE instruction set
Hello, I am working on speeding up NumPy with the AArch64 SVE instruction set. I could not find a NumPy implementation for SVE. Is there already a test implementation, or any discussion about SVE support?
On 19/5/22 16:50, kawakami.k@fujitsu.com wrote:

> Hello,
>
> I am working on speeding up NumPy with the AArch64 SVE instruction set. I could not find a numpy implementation for SVE. Is there already a test implementation or discussion about SVE support?
OpenBLAS apparently has SVE support as of 0.3.20 [0], which was merged into NumPy yesterday. So far NumPy itself only supports NEON and ASIMD; see [1] for a description of the way we use intrinsics. Contributions to improve the implementations and the documentation would be welcome.

We currently use Travis CI to run our aarch64 tests. If there were a way to get access to more advanced machines, that would also be good. Do you know of commercially available machines with SVE or SVE2 support?

Matti

[0] https://github.com/xianyi/OpenBLAS/blob/faf58d2b3ffb20fd334cab080700be564ef7...
[1] https://numpy.org/devdocs/reference/simd/build-options.html
Thanks for the reply.

To tell the truth, I am working on a trial implementation for SVE: https://github.com/kawakami-k/numpy/commits/sve
At this time some test patterns still fail, so I am fixing them. When I finish this work, I would like to contribute it to NumPy.

As for my test environment, I use a Fujitsu FX700 [0], which is equipped with a Fujitsu A64FX CPU. The A64FX is based on Armv8.2-A and supports the SVE instruction set. I am not aware of any free SVE environment available in the cloud.

If we can run Python and NumPy on qemu-aarch64 running on x64, can we do validation with qemu-aarch64? For example, oneDNN uses this method [1][2] to test its implementation for SVE.

[0] https://www.hpcwire.com/2019/11/12/cray-fujitsu-both-bringing-fujitsu-a64fx-...
[1] https://github.com/oneapi-src/oneDNN/blob/master/.github/automation/.drone.y...
[2] https://cloud.drone.io/oneapi-src/oneDNN/1401
On Fri, May 20, 2022 at 10:05 AM <kawakami.k@fujitsu.com> wrote:

> Thanks for the reply.
>
> To tell the truth, I am working on a trial implementation for SVE. https://github.com/kawakami-k/numpy/commits/sve At this time, some test patterns still fail, so I'm fixing them. When I finish this work, I would like to contribute it to NumPy.
>
> As for my test environment, I use Fujitsu FX700 [0] that equips Fujitsu A64FX as its CPU. A64FX is based on Armv8.2-a and supports the SVE instruction set. I am not aware of any free SVE environment available in the cloud.
>
> If we can run Python and NumPy on qemu-aarch64 running on x64, can we do validation with qemu-aarch64? For example, oneDNN uses such method [1][2] to test its implementation for SVE.
I'd say yes, we can do that. We already support other SIMD instruction sets that we don't test in CI, so if it works on qemu during PR review then I think that's enough. Clearly there's a potential for issues here as we gain more SIMD support that is not exercised in CI, so longer term we may need a test strategy where we run things under qemu in, say, a weekly CI job (qemu in regular CI isn't feasible, it's too slow).

Cheers,
Ralf
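Editorial sketch, not part of the original messages: one way a test helper might decide whether the machine it runs on advertises SVE is to look at the `Features` line that Linux prints in `/proc/cpuinfo` on AArch64. The function names `parse_cpuinfo_features`, `has_sve`, and `read_cpuinfo` below are hypothetical and are not NumPy APIs. Whether an emulator such as qemu-aarch64 exposes the same line depends on its version and mode, so treat that as an assumption to verify.

```python
def parse_cpuinfo_features(cpuinfo_text):
    """Collect the flags listed on every 'Features' line of /proc/cpuinfo text.

    Hypothetical helper: assumes the Linux AArch64 layout, where each CPU
    entry has a line such as 'Features : fp asimd sve'.
    """
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_sve(cpuinfo_text):
    """Return True if the 'sve' flag is listed (an 'sve2' flag alone does not count)."""
    return "sve" in parse_cpuinfo_features(cpuinfo_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the cpuinfo file, returning an empty string where it does not exist."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    print("SVE reported:", has_sve(read_cpuinfo()))
```

A test suite could use such a check to skip SVE-specific tests on machines that do not report the flag, for example in the kind of periodic emulated job discussed above.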
> [0] https://www.hpcwire.com/2019/11/12/cray-fujitsu-both-bringing-fujitsu-a64fx-...
> [1] https://github.com/oneapi-src/oneDNN/blob/master/.github/automation/.drone.y...
> [2] https://cloud.drone.io/oneapi-src/oneDNN/1401

_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-leave@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: ralf.gommers@gmail.com
Thank you, Gommers.

I'd like to discuss this again when I finish the SVE implementation. (It may be about a month from now.)

Cheers,
Kentaro
On Thu, May 26, 2022 at 3:19 PM <kawakami.k@fujitsu.com> wrote:

> Thank you Gommers
>
> I'd like to discuss this again when I finish SVE implementation. (It may be one month later.)

Sounds great, thanks Kentaro.

Cheers,
Ralf
participants (3)
- kawakami.k@fujitsu.com
- Matti Picus
- Ralf Gommers