Re: [Numpy-discussion] Python ctypes and OpenMP mystery

Feb. 17, 2011


On Thursday 17 February 2011 02:24:33 Eric Carlson wrote:
...
Hello Francesc,
The problem appears to be related to my lack of optimization in the
compilation. If I use
gcc -O3 -c my_lib.c -fPIC -fopenmp -ffast-math
the C executable and ctypes/python versions behave almost
identically.
Ahh, good to know.
...
Getting decent behavior takes some thought, though, far
from the incredible almost-automatic behavior of numexpr.
numexpr uses a very simple method for distributing load among the
threads, so I suppose this is why it is fast.  The drawback is that
numexpr can only be used for operations that use the same index in
every operand (i.e. like a+b**3, but not for things like
a[i+1]+b[i]**3).  For other operations openmp is probably the best
option (I should say the *easiest* option) right now.
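
As a rough illustration of the second kind of expression, here is a
minimal C sketch of a[i+1]+b[i]**3 parallelized with openmp.  The
function name shifted_add_cube and its signature are made up for this
example and do not come from the original code in this thread.  If the
file is compiled without -fopenmp, the pragma is ignored and the loop
simply runs serially.

```c
#include <stddef.h>

/* Hypothetical kernel: out[i] = a[i+1] + b[i]^3 for i in [0, n-1).
   Each iteration reads a neighbouring element of a, so the operands
   do not all share the same index.  out must hold n-1 elements. */
void shifted_add_cube(const double *a, const double *b, double *out, size_t n)
{
    long last, i;

    if (n < 2) {
        return;              /* nothing to compute: a[i+1] would not exist */
    }
    last = (long)n - 1;

    /* Iterations are independent, so they can be split among threads. */
    #pragma omp parallel for
    for (i = 0; i < last; i++) {
        out[i] = a[i + 1] + b[i] * b[i] * b[i];
    }
}
```

Something like this could go into my_lib.c and be built with the gcc
line quoted earlier in the thread.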
...
Now I've got to figure out how to scale up a bunch of vector
adds/multiplies. Neither numexpr nor openmp gets you very far with a
bunch of "z=a*x+b*y"-type calcs.
For this sort of computation you are most probably hitting the memory
bandwidth wall, so you are out of luck (at least until processors are
fast enough to allow compression to actually reduce the time spent in
computations).
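
To put a rough number on that, here is a small C sketch.  It assumes
that a and b are scalars and that x, y and z are arrays of doubles; the
names axpby and axpby_flops_per_byte are invented for this example.  The
second function only does back-of-the-envelope arithmetic and ignores
caches and any extra traffic the hardware generates on writes.

```c
#include <stddef.h>

/* z[i] = a*x[i] + b*y[i] for i in [0, n); a and b are assumed scalars. */
void axpby(double a, const double *x, double b, const double *y,
           double *z, long n)
{
    long i;

    /* Ignored (serial loop) when compiled without -fopenmp. */
    #pragma omp parallel for
    for (i = 0; i < n; i++) {
        z[i] = a * x[i] + b * y[i];
    }
}

/* Floating-point operations per byte of array traffic in the loop above:
   3 flops (two multiplies, one add) against three doubles moved
   (read x[i], read y[i], write z[i]). */
double axpby_flops_per_byte(void)
{
    return 3.0 / (3.0 * (double)sizeof(double));
}
```

With 8-byte doubles that is 3 flops per 24 bytes, so each element needs
far more data movement than arithmetic, and extra threads mostly end up
waiting on the same memory bus.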

Cheers,

-- 
Francesc Alted