[Numpy-discussion] Efficiency of Numpy wheels and simple way to benchmark Numpy installation?

Matthew Brett matthew.brett at gmail.com
Sun May 27 17:20:57 EDT 2018


Hi,

On Sun, May 27, 2018 at 9:12 PM, Nathaniel Smith <njs at pobox.com> wrote:
> Performance is an incredibly multi-dimensional thing. Modern computers are
> incredibly complex, with layers of interacting caches, different
> microarchitectural features (do you have AVX2? does your cpu's branch
> predictor interact in a funny way with your workload?), compiler
> optimizations that vary from version to version, ... and different parts of
> numpy are affected differently by all these things.
>
> So, the only really reliable answer to a question like this is, always, that
> you need to benchmark the application you actually care about in the
> contexts where it will actually run (or as close as you can get to that).
>
> That said, as a general rule of thumb, the main difference between different
> numpy builds is which BLAS library they use, which primarily affects the
> speed of numpy's linear algebra routines. The wheels on pypi use either
> OpenBLAS (on Windows and Linux) or Accelerate (on macOS). The conda packages
> provided as part of the Anaconda distribution normally use Intel's MKL.
>
> All three of these libraries are generally pretty good. They're all serious
> attempts to make a blazing fast linear algebra library, and much much faster
> than naive implementations. Generally MKL has a reputation for being
> somewhat faster than the others, when there's a difference. But again,
> whether this happens, or is significant, for *your* app is impossible to say
> without trying it.
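
A quick way to see which BLAS/LAPACK library a given NumPy build is
actually linked against is NumPy's own config report (a sketch; the
exact output format varies between NumPy versions and builds):

```python
# Print the BLAS/LAPACK configuration this NumPy build was compiled
# against -- look for "openblas", "mkl", or "accelerate" entries.
import numpy as np

np.show_config()
```
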

Yes - I'd be surprised if you find a significant difference in
performance for real usage between pip / OpenBLAS and conda / MKL -
but if you do, please let us know, and we'll investigate.
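
If you want to compare two installations yourself, a minimal sketch of
timing one representative linear-algebra call is below; the matrix size
and repeat counts here are arbitrary choices, and real conclusions
should come from benchmarking your actual workload:

```python
# Rough matmul benchmark -- run the same script under both the
# pip/OpenBLAS and conda/MKL installs and compare the printed times.
import timeit

import numpy as np

rng = np.random.RandomState(0)  # fixed seed so both runs use the same data
a = rng.rand(500, 500)
b = rng.rand(500, 500)

# Best of 3 repeats of 10 multiplications each, to reduce timing noise.
best = min(timeit.repeat(lambda: a.dot(b), number=10, repeat=3))
print("500x500 matmul, best of 3 x 10 runs: %.4f s" % best)
```
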

Cheers,

Matthew
