On 11/07/2012 03:30 PM, Neal Becker wrote:
David Cournapeau wrote:
On Wed, Nov 7, 2012 at 1:56 PM, Neal Becker wrote:
David Cournapeau wrote:
On Wed, Nov 7, 2012 at 12:35 PM, Neal Becker wrote:
I'm trying to do a bit of benchmarking to see if AMD libm/ACML will help me.
I had an idea: instead of building all of numpy/scipy and all of my custom modules against these libraries, I could simply use:
LD_PRELOAD=/opt/amdlibm-3.0.2/lib/dynamic/libamdlibm.so:/opt/acml5.2.0/gfortran64/lib/libacml.so
<my program here>
I'm hoping that both numpy and my own DLLs will then take advantage of these libraries.
Do you think this will work?
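[Editorial aside: one way to check whether an LD_PRELOAD actually wins symbol resolution is to ask the dynamic linker which shared object the running process resolved `exp` from. This is a Linux/glibc-specific sketch (not from the thread) using `dladdr` via ctypes; the `libdl.so.2` name is a glibc assumption.]

```python
import ctypes

# Dl_info layout from glibc's <dlfcn.h>
class Dl_info(ctypes.Structure):
    _fields_ = [
        ("dli_fname", ctypes.c_char_p),   # path of the providing shared object
        ("dli_fbase", ctypes.c_void_p),
        ("dli_sname", ctypes.c_char_p),   # nearest symbol name
        ("dli_saddr", ctypes.c_void_p),
    ]

proc = ctypes.CDLL(None)            # symbols already visible in this process
libdl = ctypes.CDLL("libdl.so.2")   # dladdr() lives here on glibc
libdl.dladdr.restype = ctypes.c_int
libdl.dladdr.argtypes = [ctypes.c_void_p, ctypes.POINTER(Dl_info)]

# Address that the symbol "exp" resolved to in this process; with a
# working LD_PRELOAD this should point into the preloaded library.
exp_addr = ctypes.cast(proc.exp, ctypes.c_void_p)
info = Dl_info()
res = libdl.dladdr(exp_addr, ctypes.byref(info))
if res:
    print("exp resolved from:", info.dli_fname.decode())
```

Running this under `LD_PRELOAD=.../libamdlibm.so` should print the amdlibm path instead of the system libm if the preload took effect. (`LD_DEBUG=bindings` gives the same information without any code.)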
Quite unlikely, depending on your configuration, because those libraries are rarely if ever ABI-compatible (which is why they are such a pain to support).
David
When you say quite unlikely (to work), you mean
a) unlikely that libm/acml will be used to resolve symbols in numpy/dlls at runtime (e.g., exp)?
or
b) the program may produce wrong results and/or crash?
Both, actually. That's not something I would use myself. Did you try OpenBLAS? It is open source, simple to build, and pretty fast.
David
In my current work, the largest bottlenecks are probably the 'max*' operations, which compute log(sum_i exp(x_i)).
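[Editorial aside: the quantity above is the classic log-sum-exp, and a naive implementation overflows as soon as any x_i is large. A minimal numpy-only sketch (not from the thread) of the standard max-shift trick:]

```python
import numpy as np

def logsumexp(x):
    """Numerically stable log(sum_i exp(x_i)).

    Subtracting max(x) before exponentiating keeps np.exp() from
    overflowing; the shift is added back outside the log.
    """
    x = np.asarray(x, dtype=float)
    m = x.max()
    return m + np.log(np.exp(x - m).sum())

# Naive np.log(np.exp(x).sum()) overflows here; the shifted form gives
# 1000 + log(2).
print(logsumexp([1000.0, 1000.0]))
```

scipy.special.logsumexp implements the same idea with more features (axes, weights).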
numexpr with Intel VML is the solution I know of that doesn't require you to dig into compiling C code yourself. Did you look into that, or is using Intel VML/MKL not an option?

Fast exps depend on the CPU evaluating many exps at the same time (both explicitly through vector registers and implicitly through pipelining). So even if you get what you are trying to do to work (which I think is unlikely), the approach is inherently slow: passing a single number at a time through the "exp" function can't be efficient.

Dag Sverre
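[Editorial aside: Dag's point about batching can be seen without VML at all. This numpy-only sketch (not from the thread; timings are machine-dependent) compares one vectorized np.exp call against a Python loop that pushes one number at a time through math.exp:]

```python
import math
import timeit

import numpy as np

x = np.random.default_rng(0).random(1_000_000)

# One vectorized call: the element loop runs inside numpy's C code, so
# the CPU can keep many exp evaluations in flight at once.
t_vec = timeit.timeit(lambda: np.exp(x), number=5)

# One scalar call per element: each exp must finish before the next starts,
# plus Python interpreter overhead on every iteration.
xl = x.tolist()
t_scalar = timeit.timeit(lambda: [math.exp(v) for v in xl], number=5)

print(f"vectorized: {t_vec:.3f}s   scalar loop: {t_scalar:.3f}s")
```

On typical hardware the scalar loop is an order of magnitude slower, which is the "single number at a time" cost Dag describes; numexpr with VML pushes the vectorized path further by fusing expressions and using SIMD exp kernels.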