[Numpy-discussion] [SciPy-user] numpy aligned memory

Thu Mar 12 14:15:32 EDT 2009

On 3/12/2009 4:05 PM, Andrew Straw wrote:

> So, what's your take on having each row aligned? Is this also useful for
> FFTW, for example? If so, we should perhaps come up with a better
> routine for the cookbook.

Ok, so here is how it could be done. It fails for a reason  I'll 
attribute to a bug in NumPy.

import numpy as np

def _nextpow(b,isize):
     i = 1
     while b**i < isize:
         i += 1
     return b**i

def aligned_zeros(shape, boundary=16, dtype=float, order='C',
                   imagealign=True):

     if (not imagealign) or (not hasattr(shape,'__len__')):
         N = np.prod(shape)
         d = np.dtype(dtype)
         tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8)
         address = tmp.__array_interface__['data'][0]
         offset = (boundary - address % boundary) % boundary
         return tmp[offset:offset+N*d.itemsize]\
             .view(dtype=d)\
             .reshape(shape, order=order)
     else:
         if order == 'C':
             ndim0 = shape[-1]
             dim0 = -1
         else:
             ndim0 = shape[0]
             dim0 = 0
         d = np.dtype(dtype)
         bshape = [i for i in shape]
         padding = boundary + _nextpow(boundary, d.itemsize) - d.itemsize
         bshape[dim0] = ndim0*d.itemsize + padding
         print bshape
         tmp = np.zeros(bshape, dtype=np.uint8, order=order)
         address = tmp.__array_interface__['data'][0]
         offset = (boundary - address % boundary) % boundary
         aligned_slice = slice(offset, offset + ndim0*d.itemsize)
     if tmp.flags['C_CONTIGUOUS']:
         tmp = tmp[..., aligned_slice]
         print tmp.shape
     else:
         tmp = tmp[aligned_slice, ...]
         print tmp.shape

     return tmp.view(dtype=dtype) # this will often fail,
                                  # probably a bug in numpy

So lets reproduce the NumPy issue:

 >>> a = zeros((10,52), dtype=uint8)
 >>> b = a[:, 3:8*2+3]
 >>> b.shape
(10, 16)
 >>> b.view(dtype=float)

Traceback (most recent call last):
   File "<pyshell#86>", line 1, in <module>
     b.view(dtype=float)
ValueError: new type not compatible with array.

However:

 >>> a = zeros((10,16), dtype=uint8)
 >>> a.view(dtype=float)
array([[ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.],
        [ 0.,  0.]])

Until we find a way to overcome this, it will be difficult to align rows 
to particular byte boundaries. It fails even if we make sure the padding 
is a multiple of the item size:

padding = (boundary + _nextpow(boundary, d.itemsize) \
                   - d.itemsize) * d.itemsize

Very annoying..

Using allocators in libraries (e.g. FFTW) would not help either, as 
NumPy would fail in the same way.

Maybe we can force NumPy to do the right thing by hard-coding an array 
descriptor?

We can do this in Cython though, as it supports pointers and double 
indirection. But it would be like using C.

Sturla Molden