[Numpy-discussion] Linking Numpy with parallel OpenBLAS
jtaylor.debian at googlemail.com
Thu Oct 29 17:07:57 EDT 2015
On 29.10.2015 21:50, Daπid wrote:
> On 29 October 2015 at 20:25, Julian Taylor
> <jtaylor.debian at googlemail.com> wrote:
> should be possible by putting this into: ~/.numpy-site.cfg
> libraries = openblasp
> LD_PRELOAD the file should also work.
> I did some timings on a dot product of a square matrix of size 10000
> with LD_PRELOADing the different versions. I checked that all the cores
> were crunching when a version other than plain libopenblas/64 was selected.
> Here are the timings in seconds:
> Intel i5-3317U:
> 97.5418870449
> Intel i7-4770:
> Both computers have the same software and OS. So, it seems that openblas
> doesn't get a significant advantage from going parallel in the older i5;
> the i7 using all its cores (4 + 4 hyperthread) gains a 3x speed up, and
> there is no big difference between OpenMP and pthreads.
> I am particularly puzzled by the i5 results; shouldn't threads get a
> noticeable speedup?
Try with only 2 cores instead of the 2+2 by setting OMP_NUM_THREADS=2; it's
possible the hyperthreading is just leading to cache thrashing.
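
A rough sketch of that experiment, in case it helps. It is not the script
behind the timings quoted above. The helper names (thread_env, time_dot) and
the default matrix size of 2000 are made up for illustration, and it assumes
numpy is importable in the child interpreter. The thread count is set in the
environment of a fresh process because the BLAS library normally reads it at
load time. OPENBLAS_NUM_THREADS is set as well, since that is the variable
OpenBLAS documents for its own thread pool.

```python
import os
import subprocess
import sys

# Benchmark body run in a child process, so the BLAS library is loaded
# fresh and sees the thread-count variables from the start.
BENCH = """
import time
import numpy as np
n = {n}
a = np.random.rand(n, n)
t0 = time.perf_counter()
a.dot(a)
print(time.perf_counter() - t0)
"""


def thread_env(num_threads, base=None):
    """Return a copy of the environment with the BLAS thread count pinned."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)       # OpenMP builds
    env["OPENBLAS_NUM_THREADS"] = str(num_threads)  # OpenBLAS's own variable
    return env


def time_dot(num_threads, n=2000):
    """Time one n x n dot product in a child process; returns seconds."""
    result = subprocess.run(
        [sys.executable, "-c", BENCH.format(n=n)],
        env=thread_env(num_threads),
        check=True,
        capture_output=True,
        text=True,
    )
    return float(result.stdout)
```

Comparing time_dot(1), time_dot(2) and time_dot(4) on the i5 should show
whether the two extra hyperthreads help or hurt. Use a larger n for steadier
numbers.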
Also, when only one core is active the CPUs will overclock themselves a
bit, which will decrease relative parallelization speedups (Intel Turbo
Boost).