[Numpy-discussion] Profiling (was GSoC : Performance parity between numpy arrays and Python scalars)

Francesc Alted francesc at continuum.io
Thu May 2 10:51:46 EDT 2013


On 5/2/13 3:58 PM, Nathaniel Smith wrote:
> callgrind has the *fabulous* kcachegrind front-end, but it only
> measures memory access performance on a simulated machine, which is
> very useful sometimes (if you're trying to optimize cache locality),
> but there's no guarantee that the bottlenecks on its simulated machine
> are the same as the bottlenecks on your real machine.

Agreed, there is no guarantee, but my experience is that kcachegrind
normally gives you a pretty decent view of cache misses, and hence it
can make pretty good predictions about how they affect your
computations.  I have used this feature extensively for optimizing
parts of the Blosc compressor, and I could not be happier (to the point
that, without Valgrind, I would not have figured out many interesting
memory access optimizations).
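
For anyone who wants to try this, here is a minimal sketch of how such a
run could be set up.  It only builds the Valgrind command line; the
helper name `callgrind_command`, the output file name and the `./bench`
target are placeholders of mine, not anything from Blosc.  The
`--cache-sim=yes` option is what asks callgrind to simulate the cache
hierarchy, which is off by default.

```python
import shlex


def callgrind_command(target_argv, out_file="callgrind.out.example"):
    """Build a Valgrind command that profiles target_argv with cache simulation.

    target_argv is the program to profile plus its arguments (hypothetical
    here); out_file is an arbitrary name for the profile data.
    """
    return [
        "valgrind",
        "--tool=callgrind",
        "--cache-sim=yes",  # also collect simulated cache hit/miss counts
        f"--callgrind-out-file={out_file}",
        *target_argv,
    ]


if __name__ == "__main__":
    # "./bench" stands in for whatever binary or script you want to profile.
    cmd = callgrind_command(["./bench", "--size", "1000000"])
    print(shlex.join(cmd))
```

Running the printed command and then opening the output file with
`kcachegrind callgrind.out.example` should show the per-function cache
miss counts.  Keep in mind that these come from the simulated machine,
as discussed above.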

-- 
Francesc Alted



