On 8/9/07, Charles R Harris <charlesr.harris@gmail.com> wrote:


On 8/8/07, David Cournapeau <david@ar.media.kyoto-u.ac.jp> wrote:
Charles R Harris wrote:
> Anne,
>
> On 8/8/07, Anne Archibald <peridot.faceted@gmail.com> wrote:
>
>     On 08/08/2007, Charles R Harris <charlesr.harris@gmail.com> wrote:
>     >
>     >
>     > On 8/8/07, Anne Archibald <peridot.faceted@gmail.com> wrote:
>     > > Oh. Well, it's not *terrible*; it gets you an aligned array. But
>     > > you have to allocate the original array as a 1D byte array (to
>     > > allow for arbitrary realignments) and then align it, reshape it,
>     > > and reinterpret it as a new type. Plus you're allocating an extra
>     > > ndarray structure, which will live as long as the new array does;
>     > > this not only wastes even more memory than the portable alignment
>     > > solutions, it clogs up python's garbage collector.
>     >
>     > The ndarray structure doesn't take up much memory, it is the data
>     > that is large and the data is shared between the original array
>     > and the slice. Nor does the data type of the slice need changing,
>     > one simply uses the desired type to begin with, or at least a type
>     > of the right size so that a view will do the job without copies.
>     > Nor do I see how the garbage collector will get clogged up, slices
>     > are a common feature of using numpy. The slice method also has the
>     > advantage of being compiler and operating system independent,
>     > there is a reason Intel used that approach.
>
I am not sure I understand which approach to which problem you are
talking about here.

IMHO, the discussion is getting a bit carried away. What I was
suggesting is:
    - being able to check whether a given data buffer is aligned to a
given alignment (easy)
    - being able to request an aligned data buffer: this requires aligned
memory allocators, and some additions to the API for creating arrays.
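
The first item can be sketched in a few lines of Python. This is only an illustration under assumptions: `is_aligned` is a hypothetical helper name, not an existing numpy function, and it relies on `ndarray.ctypes.data` returning the address of the array's first element.

```python
import numpy as np

def is_aligned(arr, alignment=16):
    """Return True if the first element of arr sits on an `alignment`-byte boundary.

    Hypothetical helper for illustration only; not part of the numpy API.
    """
    return arr.ctypes.data % alignment == 0

a = np.empty(64, dtype=np.float64)
# The result depends on the allocator, so no particular output is promised here.
print(is_aligned(a, 8), is_aligned(a, 16))
```

Note that this only inspects the start address; it says nothing about strides, so a non-contiguous view would need a separate check.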

This all boils down to the following case: I have a C function which
requires data aligned to N bytes, and I want the numpy API to provide
this capability. I don't understand the discussion on doing it in python:

Well, what you want might be very easy to do in python; we just need to check the default alignments for doubles and floats for some of the other compilers, architectures, and OSes out there. On the other hand, you might not be able to request a C malloc that is aligned in a portable way without resorting to the same tricks as you do in python. So why not use python and get the reference counting and garbage collection along with it?

What we need is for doubles to be 8-byte aligned and floats 4-byte aligned, so that the next 16-byte boundary is always a whole number of elements away. That seems to be the case with gcc, linux, and the Intel architecture. The idea is to create a slightly oversize array, then use a slice of the proper size that is 16-byte aligned.

Chuck

For instance, in the case of linux-x86 and linux-x86_64, the following should work:

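In [67]: import numpy as np

In [68]: def align16(n, dtype=np.float64):
   ....:     size = np.dtype(dtype).itemsize          # bytes per element
   ....:     over = 16 // size                        # spare elements so the start can shift
   ....:     data = np.empty(n + over, dtype=dtype)   # slightly oversize parent array
   ....:     skip = (-data.ctypes.data % 16) // size  # elements up to the next 16-byte boundary
   ....:     return data[skip:skip + n]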

Of course, now you need to fill in the data.
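
One possible way to fill the array while keeping the alignment is to assign into the slice in place rather than rebinding the name. The sketch below repeats the function so it runs on its own; like the original, it assumes the allocator already aligns each element to its own size, and the fill values are arbitrary.

```python
import numpy as np

def align16(n, dtype=np.float64):
    # Same oversize-then-slice idea as in the message above.
    size = np.dtype(dtype).itemsize
    over = 16 // size
    data = np.empty(n + over, dtype=dtype)
    skip = (-data.ctypes.data % 16) // size
    return data[skip:skip + n]

x = align16(1000)
x[:] = np.arange(1000)          # in-place fill; "x = np.arange(1000)" would drop the aligned buffer
assert x.ctypes.data % 16 == 0  # first element sits on a 16-byte boundary
assert x.base is not None       # x is a view; the oversize parent stays alive through x.base
```

Any operation that returns a new array (for example `x * 2`) allocates fresh memory, so the result is not guaranteed to keep the alignment.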

Chuck