Anne,

On 8/8/07, Anne Archibald <peridot.faceted@gmail.com> wrote:
On 08/08/2007, Charles R Harris <charlesr.harris@gmail.com> wrote:
>
>
> On 8/8/07, Anne Archibald <peridot.faceted@gmail.com > wrote:
> > Oh. Well, it's not *terrible*; it gets you an aligned array. But you
> > have to allocate the original array as a 1D byte array (to allow for
> > arbitrary realignments) and then align it, reshape it, and reinterpret
> > it as a new type. Plus you're allocating an extra ndarray structure,
> > which will live as long as the new array does; this not only wastes
> > even more memory than the portable alignment solutions, it clogs up
> > python's garbage collector.
>
> The ndarray structure doesn't take up much memory, it is the data that is
> large and the data is shared between the original array and the slice. Nor
> does the data type of the slice need changing, one simply uses the desired
> type to begin with, or at least a type of the right size so that a view will
> do the job without copies. Nor do I see how the garbage collector will get
> clogged up, slices are a common feature of using numpy. The slice method
> also has the advantage of being compiler and operating system independent,
> there is a reason Intel used that approach.
>
> Aligning multidimensional arrays might indeed be complicated, but I suspect
> those complications will be easier to handle in Python than in C.

Can we assume that numpy arrays allocated to contain (say) complex64s
are aligned to a 16-byte boundary? I don't think they will
necessarily, so the shift we need may not be an integer number of
complex64s. float96s pose even more problems. So to ensure alignment,
we do need to do type conversion; if we're doing it anyway, byte
arrays require the least trust in malloc().

I think that is a safe assumption; it is probably almost as safe as assuming binary and two's complement, and likely safer than assuming IEEE 754. I expect almost all 32-bit OSes to align on 4-byte boundaries at worst, and 64-bit machines on 8-byte boundaries. Even C structures are typically padded to preserve some sort of alignment. That is because of addressing efficiency, or even the impossibility of odd addressing, depending on the architecture. Sometimes even byte addressing is most easily done by putting a larger integer on the bus and extracting the relevant part. In addition, I expect the heap implementation to make some alignment decisions for efficiency.

My 64-bit Linux on Intel aligns arrays, whatever the data type, on 16-byte boundaries. It might be interesting to see what happens with the Intel and MSVC compilers, but I expect similar results. PPCs, Sun and SGI need to be checked, but I don't expect problems. I think that will cover almost all architectures numpy is likely to run on.
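
For anyone who wants to check their own platform, something like the following should do. It is only a sketch: address_mod is a name made up here, and the result depends on the allocator, so no particular output is promised.

```python
import numpy as np

def address_mod(arr, boundary=16):
    """Return the data address of `arr` modulo `boundary` (0 means aligned).

    Hypothetical helper for this thread, not part of numpy.
    """
    return arr.ctypes.data % boundary

# Print the offset from a 16-byte boundary for a few data types.
for dtype in (np.uint8, np.float32, np.float64, np.complex64, np.complex128):
    a = np.empty(1000, dtype=dtype)
    print(np.dtype(dtype).name, address_mod(a))
```

A single run proves nothing about what malloc() guarantees; it only shows what the allocator happened to do this time.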
 
The ndarray object isn't too big, probably some twenty or thirty
bytes, so I'm not talking about a huge waste. But it is a python
object, and the garbage collector needs to walk the whole tree of
accessible python objects every time it runs, so this is one more
object on the list.

As an aside: numpy's handling of ndarray objects is actually not
ideal; if you want to exhaust memory on your system, do:

a = arange(5)
while True:
    a = a[::-1]

Well, that's a pathological case present in numpy. Fixing it doesn't seem to be a high priority, although there is a ticket somewhere.

Each ndarray object keeps alive the ndarray object it is a slice of,
so this operation creates an ever-growing linked list of ndarray
objects. Seems to me it would be better to keep a pointer only to the
original object that holds the address of the buffer (so it can be
freed).
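
One way to see how long that chain gets is to follow the .base attribute until it runs out. This is a sketch only: base_chain_length is a made-up name, and the number it reports depends on the numpy version in use, since an implementation that keeps a pointer only to the buffer owner would report a short chain.

```python
import numpy as np

def base_chain_length(arr):
    """Count the ndarray objects reachable from `arr` by following .base.

    Hypothetical helper for illustration, not part of numpy.
    """
    n = 0
    # Stop at the array that owns its data (base is None) or at a
    # non-ndarray buffer owner, which has no .base to follow.
    while isinstance(arr, np.ndarray) and arr.base is not None:
        arr = arr.base
        n += 1
    return n

a = np.arange(5)
for _ in range(1000):   # bounded version of the loop above
    a = a[::-1]
print(base_chain_length(a))
```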

Aligning multidimensional arrays is an interesting question. To first
order, aligning the first element should be enough. If the dimensions
of the array are not divisible by the alignment, though, this means
that lower-dimensional complete slices may not be aligned:

A = aligned_empty((7,5),dtype=float,alignment=16)

Then A is aligned, as is A[0,:], but A[1,:] is not.
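
No implementation of aligned_empty has been posted in this thread, so here is a minimal sketch under the assumption that it uses the byte-array approach described earlier: over-allocate a 1D byte array, skip to the next boundary, then reinterpret and reshape. The signature is taken from the call above; everything else is a guess.

```python
import numpy as np

def aligned_empty(shape, dtype=float, alignment=16):
    """Uninitialised array whose first element lies on an
    `alignment`-byte boundary (hypothetical helper, not a numpy function)."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    # Over-allocate so that any shift smaller than `alignment` still fits.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    # Number of bytes to skip to reach the next boundary (0 if already there).
    offset = -buf.ctypes.data % alignment
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)

A = aligned_empty((7, 5), dtype=float, alignment=16)
for i in range(A.shape[0]):
    # Each row of five 8-byte floats is 40 bytes, so odd rows land 8 bytes off.
    print(i, A[i, :].ctypes.data % 16)
```

The returned array keeps the byte buffer alive through its base, which is the extra ndarray object discussed above.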

So in this case we might want to actually allocate an 8-by-5 array and
take a slice. This does mean it won't be contiguous in memory, so that
flattening it requires a copy (which may not wind up aligned). This is
something we might want to do - that is, make available as an option -
in python.

I think that is better viewed as need-based. I suspect that if you really need such alignment it is better to start with array dimensions that will naturally align the rows. It will be impossible to naturally align all the columns unless the data type is the correct size.
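
As a rough sketch of what I mean by dimensions that naturally align the rows: pad the row length up to a multiple of the alignment and hand back a slice. This assumes C order and an alignment that is a multiple of the item size; aligned_rows_empty is a made-up name, not an existing numpy function.

```python
import numpy as np

def aligned_rows_empty(shape, dtype=float, alignment=16):
    """2-D uninitialised array in which every row starts on an
    `alignment`-byte boundary (hypothetical helper, C order assumed)."""
    rows, cols = shape
    dtype = np.dtype(dtype)
    if alignment % dtype.itemsize:
        raise ValueError("alignment must be a multiple of the item size")
    per_unit = alignment // dtype.itemsize          # items per alignment unit
    padded_cols = -(-cols // per_unit) * per_unit   # round cols up
    nbytes = rows * padded_cols * dtype.itemsize
    # Over-allocate bytes, then skip to the next boundary.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    offset = -buf.ctypes.data % alignment
    full = buf[offset:offset + nbytes].view(dtype).reshape(rows, padded_cols)
    # Slice off the padding; the result is no longer contiguous.
    return full[:, :cols]

A = aligned_rows_empty((7, 5))
print(A.shape, A.strides, A.flags['C_CONTIGUOUS'])
```

The price is the one Anne mentions: the slice is not contiguous, so flattening it makes a copy, and that copy may not be aligned.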

Chuck