[Numpy-discussion] numpy arrays, data allocation and SIMD alignement

Anne Archibald peridot.faceted at gmail.com
Wed Aug 8 14:23:44 EDT 2007

On 08/08/2007, Charles R Harris <charlesr.harris at gmail.com> wrote:
> On 8/8/07, Anne Archibald <peridot.faceted at gmail.com> wrote:
> > Oh. Well, it's not *terrible*; it gets you an aligned array. But you
> > have to allocate the original array as a 1D byte array (to allow for
> > arbitrary realignments) and then align it, reshape it, and reinterpret
> > it as a new type. Plus you're allocating an extra ndarray structure,
> > which will live as long as the new array does; this not only wastes
> > even more memory than the portable alignment solutions, it clogs up
> > python's garbage collector.
> The ndarray structure doesn't take up much memory, it is the data that is
> large and the data is shared between the original array and the slice. Nor
> does the data type of the slice need changing, one simply uses the desired
> type to begin with, or at least a type of the right size so that a view will
> do the job without copies. Nor do I see how the garbage collector will get
> clogged up, slices are a common feature of using numpy. The slice method
> also has the advantage of being compiler and operating system independent,
> there is a reason Intel used that approach.
> Aligning multidimensional arrays might indeed be complicated, but I suspect
> those complications will be easier to handle in Python than in C.

Can we assume that numpy arrays allocated to contain (say) complex64s
are aligned to a 16-byte boundary? I don't think they will
necessarily, so the shift we need may not be an integer number of
complex64s. float96s pose even more problems. So to ensure alignment,
we do need to do type conversion; if we're doing it anyway, byte
arrays require the least trust in malloc().
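
The byte-array approach can be sketched roughly as follows. This is only an illustration, not an existing numpy function: the name aligned_empty and its signature are assumptions borrowed from this thread. The shift is computed in bytes, so it does not have to be a whole number of elements of the target type.

```python
import numpy as np

def aligned_empty(shape, dtype=float, alignment=16):
    """Hypothetical helper: uninitialised array whose first element is aligned."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    # Over-allocate raw bytes so that any shift in [0, alignment) still fits.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    # Byte shift up to the next multiple of `alignment`.
    offset = -buf.ctypes.data % alignment
    # Slice the bytes, reinterpret them as `dtype`, then reshape.
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)

a = aligned_empty((3, 4), dtype=np.complex64, alignment=16)
print(a.ctypes.data % 16, a.shape, a.dtype)
```

The returned array keeps the byte buffer alive through its base chain, which is the extra ndarray object discussed in this thread.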

The ndarray object isn't too big, probably some twenty or thirty
bytes, so I'm not talking about a huge waste. But it is a Python
object, and the garbage collector needs to walk the whole tree of
accessible Python objects every time it runs, so this is one more
object on the list.

As an aside: numpy's handling of ndarray objects is actually not
ideal; if you want to exhaust memory on your system, do:

from numpy import arange

a = arange(5)
while True:
    a = a[::-1]  # each new view holds a reference to the view it was sliced from

Each ndarray object keeps alive the ndarray object it is a slice of,
so this operation creates an ever-growing linked list of ndarray
objects. Seems to me it would be better to keep a pointer only to the
original object that holds the address of the buffer (so it can be

Aligning multidimensional arrays is an interesting question. To first
order, aligning the first element should be enough. If the dimensions
of the array are not divisible by the alignment, though, this means
that lower-dimensional complete slices may not be aligned:

A = aligned_empty((7,5),dtype=float,alignment=16)

Then A is aligned, as is A[0,:], but A[1,:] is not.
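
The arithmetic behind that statement can be checked without allocating anything. This is a small sketch that assumes 8-byte floats, row-major (C) order, and a first element sitting exactly on a 16-byte boundary; row_offsets is a name made up for the illustration.

```python
def row_offsets(nrows, ncols, itemsize):
    """Byte offset of the start of each row in a C-ordered 2-D array."""
    return [r * ncols * itemsize for r in range(nrows)]

alignment = 16
offsets = row_offsets(7, 5, 8)   # 7-by-5 array of 8-byte floats
aligned = [off % alignment == 0 for off in offsets]

for r, (off, ok) in enumerate(zip(offsets, aligned)):
    print(f"A[{r},:] starts at byte {off:3d}: {'aligned' if ok else 'not aligned'}")
```

With these numbers each row is 40 bytes long, so only the even-numbered rows land on a 16-byte boundary.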

So in this case we might want to actually allocate a 7-by-6 array and
take a slice of its first five columns, so that every row starts on a
16-byte boundary. This does mean it won't be contiguous in memory, so
that flattening it requires a copy (which may not wind up aligned). This
is something we might want to do - that is, make available as an option -
in Python.
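
A rough sketch of the padding idea, assuming numpy's default row-major layout, where the padding that aligns each row goes on the last axis. The helper padded_cols is hypothetical. The sketch only fixes the row stride; the first element would still need to be aligned by some other means, such as the byte-array trick.

```python
import numpy as np

def padded_cols(ncols, itemsize, alignment):
    """Smallest column count >= ncols whose row size is a multiple of alignment."""
    n = ncols
    while (n * itemsize) % alignment:
        n += 1
    return n

rows, cols, alignment = 7, 5, 16
itemsize = np.dtype(float).itemsize

# Allocate the wider array, then expose only the columns that were asked for.
wide = np.empty((rows, padded_cols(cols, itemsize, alignment)), dtype=float)
A = wide[:, :cols]

print(wide.shape, A.shape, A.strides)
print(A.flags['C_CONTIGUOUS'])            # the slice skips the padding column
print(np.shares_memory(A.ravel(), wide))  # flattening has to copy
```

The row stride of A is now a multiple of the alignment, at the cost of one unused column and a non-contiguous array.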

