[Numpy-discussion] High-quality memory profiling for numpy in python 3.5 / volunteers needed
njs at pobox.com
Thu Apr 17 10:18:18 EDT 2014
On 17 Apr 2014 15:09, "Aron Ahmadia" <aron at ahmadia.net> wrote:
> > On the one hand it would be nice to actually know whether
> > posix_memalign is important, before making api decisions on this basis.
>
> FWIW: On the lightweight IBM cores that the extremely popular BlueGene
> machines were based on, accessing unaligned memory raised system faults.
> The default behavior of these machines was to terminate the program if
> more than 1000 such errors occurred on a given process, and an environment
> variable allowed you to terminate the program if *any* unaligned memory
> access occurred. This is because unaligned memory accesses were 15x (or
> more) slower than aligned memory access.
>
> The newer /Q chips seem to be a little more forgiving of this, but I
> think one can in general expect allocated memory alignment to be an
> important performance technique for future high performance computing
Right, this much is true on lots of architectures, and so malloc is careful
to always return values with sufficient alignment (e.g. 8 bytes) to make
sure that any standard operation can succeed.
The question here is whether it will be important to have *even more*
alignment than malloc gives us by default. A 16 or 32 byte wide SIMD
instruction might prefer that data have 16 or 32 byte alignment, even if
normal memory access for the types being operated on only requires 4 or 8
byte alignment.
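
To make the distinction concrete, here is a minimal sketch (not anything numpy does today) of how one could get a block with more alignment than the allocator promises: over-allocate by alignment - 1 bytes and start the usable region at the first suitably aligned address. It uses only the standard ctypes module; the helper name aligned_buffer and the 1024-byte / 32-byte figures are made up for illustration.

```python
import ctypes

def aligned_buffer(nbytes, alignment):
    """Hypothetical helper: return (raw, view), where view is an
    nbytes-long ctypes char array whose address is a multiple of
    `alignment`, carved out of the larger allocation `raw`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Over-allocate so that an aligned start address is guaranteed to
    # exist somewhere in the first `alignment` bytes.
    raw = ctypes.create_string_buffer(nbytes + alignment - 1)
    base = ctypes.addressof(raw)
    # Number of bytes to skip to reach the next multiple of `alignment`.
    offset = (-base) % alignment
    # The view shares memory with `raw`, starting at the aligned offset.
    view = (ctypes.c_char * nbytes).from_buffer(raw, offset)
    return raw, view

# Ask for 1024 bytes aligned to 32 bytes (e.g. for a 32 byte wide SIMD load).
raw, view = aligned_buffer(1024, 32)
address = ctypes.addressof(view)
print("raw address  mod 32:", ctypes.addressof(raw) % 32)
print("view address mod 32:", address % 32)
```

The raw address is whatever the allocator happened to return, so its remainder may or may not be zero; the view address is aligned by construction. In C, posix_memalign gives the same guarantee directly, without the wasted padding.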