[Distutils] Status update on the NumPy & SciPy vs SSE problem?
Antoine Pitrou
solipsis at pitrou.net
Thu Feb 4 06:42:19 EST 2016
On Thu, 4 Feb 2016 21:22:32 +1000
Nick Coghlan <ncoghlan at gmail.com> wrote:
>
> I figured that was independent of the manylinux PEP (since it affects
> Windows as well), but I'm also curious as to the current status (I
> found a couple of apparently relevant threads on the NumPy list, but
> figured it made more sense to just ask for an update rather than
> trusting my Google-fu)
While I'm not a Numpy maintainer, I don't think you can go much further
than SSE2 (which is standard under the x86-64 definition).
One factor is support by the kernel. The CentOS 5 kernel doesn't
seem to support AVX, so you can't use AVX there even if your processor
supports it (as the registers aren't preserved across context
switches). And one design point of manylinux is to support old Linux
setups... (*)
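As a rough illustration of the kernel-support point, here is a minimal Python sketch that reads the feature flags the kernel itself reports. It assumes an x86 Linux system where /proc/cpuinfo has a "flags" line, and it assumes the kernel lists "avx" only when it knows the feature and can preserve the registers; that assumption should be checked for the kernel at hand. The function names are made up for this example.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of feature flags from the first 'flags' line.

    Assumes the x86 Linux /proc/cpuinfo layout ("flags : fpu sse ...").
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def read_cpu_flags(path="/proc/cpuinfo"):
    """Read the flags of the running system; empty set if unavailable."""
    try:
        with open(path) as f:
            return parse_cpu_flags(f.read())
    except OSError:
        return set()


# Hand-written sample resembling a machine (or kernel) without AVX.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1\n"
flags = parse_cpu_flags(sample)
print("sse2" in flags, "avx" in flags)
```

A stricter check would query CPUID and XGETBV directly, which is not practical from pure Python.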
There are intermediate ISA additions between SSE2 and AVX (additions
that don't require OS support), but I'm not sure they help much on
compiler-vectorized code as opposed to hand-written assembly. Numpy's
pre-compiled loops are typically quite straightforward as far as I've
seen.
One mitigation is to delegate some operations to an optimized library
implementing the appropriate runtime switches: for example linear
algebra is delegated by Numpy and Scipy to optimized BLAS and LAPACK
libraries (which exist in various implementations such as OpenBLAS or
Intel's MKL).
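To make "runtime switches" concrete, here is a toy Python sketch of the dispatch idea: several implementations of one operation are shipped, and the most specialized one the current machine can run is chosen at startup. This is not how OpenBLAS or MKL are implemented internally; the kernel functions and the flag names are placeholders invented for the example.

```python
def sum_generic(values):
    """Baseline implementation, safe on any x86-64 CPU (SSE2 only)."""
    return sum(values)


def sum_avx(values):
    """Stand-in for an AVX build; a real library would call into
    separately compiled AVX code here."""
    return sum(values)


# Candidates ordered from most to least specialized (hypothetical table).
_KERNELS = [("avx", sum_avx)]


def select_kernel(cpu_flags):
    """Pick the first kernel whose required flag is available."""
    for flag, kernel in _KERNELS:
        if flag in cpu_flags:
            return kernel
    return sum_generic


# The flag set would normally come from a CPU/OS feature probe.
fast_sum = select_kernel({"sse", "sse2"})
print(fast_sum.__name__, fast_sum([1, 2, 3]))
```

The selection happens once, so the per-call cost is a plain function call, and the distributed binary itself only has to assume the SSE2 baseline.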
(*) (this is an issue a JIT compiler helps circumvent: it generates
optimal code for the current CPU ;-))
Regards
Antoine.