Hi all,
Starting with 1.9.1, the official numpy OS X wheels (the ones you get by doing "pip install numpy") have been built to use Apple's Accelerate library for linear algebra. This is fast, but it breaks multiprocessing in obscure ways (e.g. see this user report: https://github.com/numpy/numpy/issues/5752).
Unfortunately, there is no obviously best choice of linear algebra package, so we have to decide which set of compromises we prefer.
Options:
Accelerate: fast, but breaks multiprocessing as above.
OpenBLAS: fast, but Julian raised concerns about its trustworthiness last year ( http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069659.html). Possibly things have improved since then (I get the impression that they've gotten some additional developer attention from the Julia community), but I don't know.
ATLAS: slower (faster than reference BLAS, but definitely slower than the fancier options above), but solid.
My feeling is that for wheels in particular it's more important that everything "just work" than that we get the absolute fastest speeds. And this is especially true for the multiprocessing issue, given that it's a widely used part of the stdlib, the failures are really obscure/confusing, and there is no workaround for Python 2, which is where the majority of our users still are. So I'd vote for using either ATLAS or OpenBLAS. (And I would defer to Julian and Matthew about which of these to choose.)
Any opinions, objections?
-n
On 07/04/15 01:49, Nathaniel Smith wrote:
> Any opinions, objections?
Accelerate does not break multiprocessing, quite the opposite. The bug is in multiprocessing and has been fixed in Python 3.4.
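(For context, a hedged sketch: the Python 3.4 fix referred to here is presumably the new process start methods, which avoid inheriting BLAS state across fork(). Matrix sizes and the worker function below are purely illustrative.)

```python
# Sketch: sidestep the Accelerate-after-fork hang by using the "spawn"
# start method (added in Python 3.4) instead of the POSIX default "fork".
# Each worker starts as a fresh interpreter, so no BLAS thread state
# is inherited from the parent.
import multiprocessing as mp
import numpy as np

def matmul(seed):
    # Illustrative worker: a small matrix product in the child process.
    rng = np.random.RandomState(seed)
    a = rng.rand(100, 100)
    return float(np.dot(a, a.T).trace())

if __name__ == "__main__":
    # A BLAS call in the parent before forking is what triggers the hang
    # with Accelerate; with "spawn" it is harmless.
    np.dot(np.ones((50, 50)), np.ones((50, 50)))
    ctx = mp.get_context("spawn")  # would risk hanging with "fork" + Accelerate
    with ctx.Pool(2) as pool:
        print(pool.map(matmul, [0, 1]))
```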
My vote would nevertheless be for OpenBLAS if we can use it without producing test failures in NumPy and SciPy.
Most of the test failures with OpenBLAS and Carl Kleffner's toolchain on Windows are due to differences between Microsoft and MinGW runtime libraries and not due to OpenBLAS itself. These test failures are not relevant on Mac.
ATLAS can easily reduce the speed of a matrix product or a linear algebra call by a factor of 20 compared to Accelerate, MKL or OpenBLAS. It would give us bad karma.
Sturla
Hi,
On Mon, Apr 6, 2015 at 5:13 PM, Sturla Molden <sturla.molden@gmail.com> wrote:
> On 07/04/15 01:49, Nathaniel Smith wrote:
>> Any opinions, objections?
>
> Accelerate does not break multiprocessing, quite the opposite. The bug is in multiprocessing and has been fixed in Python 3.4.
>
> My vote would nevertheless be for OpenBLAS if we can use it without producing test failures in NumPy and SciPy.
>
> Most of the test failures with OpenBLAS and Carl Kleffner's toolchain on Windows are due to differences between Microsoft and MinGW runtime libraries and not due to OpenBLAS itself. These test failures are not relevant on Mac.
>
> ATLAS can easily reduce the speed of a matrix product or a linear algebra call by a factor of 20 compared to Accelerate, MKL or OpenBLAS. It would give us bad karma.
ATLAS compiled with gcc also gives us some more license complication:
http://numpy-discussion.10968.n7.nabble.com/Copyright-status-of-NumPy-binari...
I agree that big slowdowns would be dangerous for numpy's reputation.
Sturla - do you have a citable source for your factor of 20 figure?
Cheers,
Matthew
On 07/04/15 02:19, Matthew Brett wrote:
> ATLAS compiled with gcc also gives us some more license complication:
>
> http://numpy-discussion.10968.n7.nabble.com/Copyright-status-of-NumPy-binari...
Ok, then I have a question regarding OpenBLAS:
Do we use the f2c'd lapack_lite, or do we build LAPACK with gfortran and link it into OpenBLAS? In the latter case we might get libquadmath linked into the OpenBLAS binary as well.
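(One way to answer this empirically: inspect what a built wheel actually links against. A hedged sketch, assuming a Mac with the developer tools' otool available; on Linux, ldd serves the same purpose. The choice of lapack_lite as the module to inspect is illustrative.)

```python
# Sketch: list the shared libraries numpy's lapack_lite extension links
# against, to see whether libquadmath, libgfortran, or a BLAS is pulled in.
import subprocess
import numpy.linalg.lapack_lite as lapack_lite

path = lapack_lite.__file__
print("module:", path)

out = ""
# Try Mac's otool first, then fall back to Linux's ldd.
for cmd in (["otool", "-L", path], ["ldd", path]):
    try:
        out = subprocess.check_output(cmd, text=True)
        break
    except (OSError, subprocess.CalledProcessError):
        continue
print(out or "neither otool nor ldd available")
# Look for lines mentioning libquadmath, libgfortran, or libopenblas.
```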
> I agree that big slowdowns would be dangerous for numpy's reputation.
>
> Sturla - do you have a citable source for your factor of 20 figure?
I will look it up. The best thing would be to do a new benchmark though.
Another thing is that it depends on the hardware. ATLAS does not scale well on multiple processors, so it will be worse on a Mac Pro than on a MacBook. It will also be worse with AVX than without.
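(A minimal benchmark sketch along those lines; matrix sizes and repeat counts are arbitrary choices, and a ~20x spread between ATLAS and Accelerate/MKL/OpenBLAS would show up clearly at the larger sizes.)

```python
# Sketch: rough matrix-product throughput of whatever BLAS numpy links to.
import time
import numpy as np

def gemm_gflops(n, repeats=3):
    """Best-of-`repeats` throughput of an n x n matrix product, in GFLOP/s."""
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)
    np.dot(a, b)  # warm-up, so thread pools etc. are already spun up
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        np.dot(a, b)
        best = min(best, time.perf_counter() - t0)
    return 2.0 * n ** 3 / best / 1e9  # a matrix product costs ~2*n^3 flops

for n in (200, 500, 1000):
    print(n, round(gemm_gflops(n), 1))
```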
Sturla
On Apr 6, 2015 5:13 PM, "Sturla Molden" <sturla.molden@gmail.com> wrote:
> On 07/04/15 01:49, Nathaniel Smith wrote:
>> Any opinions, objections?
>
> Accelerate does not break multiprocessing, quite the opposite. The bug is in multiprocessing and has been fixed in Python 3.4.
I disagree, but it hardly matters: you can call it a bug in accelerate, or call it a bug in python, but either way it's an issue that affects our users and we need to either work around it or not.
> ATLAS can easily reduce the speed of a matrix product or a linear algebra call by a factor of 20 compared to Accelerate, MKL or OpenBLAS. It would give us bad karma.
Sure, but in some cases accelerate reduces speed by a factor of infinity by hanging, and OpenBLAS may or may not give wrong answers (but quickly!) since apparently they don't do regression tests, so we have to pick our poison.
-n
On 07/04/15 02:41, Nathaniel Smith wrote:
> Sure, but in some cases accelerate reduces speed by a factor of infinity by hanging, and OpenBLAS may or may not give wrong answers (but quickly!) since apparently they don't do regression tests, so we have to pick our poison.
OpenBLAS is safer on Mac than on Windows (no MinGW-related errors on Mac), so we should try it and see what happens.
GotoBLAS2 used to be great so it can't be that bad :-)