[Numpy-discussion] Byte aligned arrays

Nathaniel Smith njs at pobox.com
Wed Dec 19 10:10:24 EST 2012

On Wed, Dec 19, 2012 at 2:57 PM, Charles R Harris
<charlesr.harris at gmail.com> wrote:
> On Wed, Dec 19, 2012 at 7:43 AM, Nathaniel Smith <njs at pobox.com> wrote:
>> On Wed, Dec 19, 2012 at 8:40 AM, Henry Gomersall <heng at cantab.net> wrote:
>> > I've written a few simple cython routines for assisting in creating
>> > byte-aligned numpy arrays. The point being for the arrays to work with
>> > SSE/AVX code.
>> >
>> > https://github.com/hgomersall/pyFFTW/blob/master/pyfftw/utils.pxi
>> >
>> > The change recently has been to add a check on the CPU as to what flags
>> > are supported (though it's not complete, I should make the default
>> > return 0 or something).
>> >
>> > It occurred to me that this is something that (a) other people almost
>> > certainly need and are solving themselves and (b) I lack the necessary
>> > platforms to test all the possible CPU/OS combinations to make sure
>> > something sensible happens in all cases.
>> >
>> > Is this something that can be rolled into Numpy (the feature, not my
>> > particular implementation or interface - though I'd be happy for it to
>> > be so)?
>> >
>> > Regarding (b), I've written a test case that works for Linux on x86-64
>> > with GCC (my platform!). I can test it on 32-bit windows, but that's it.
>> > Is ARM supported by Numpy? Neon would be great to include as well. What
>> > other platforms might need this?
>> Your code looks simple and portable to me (at least the alignment
>> part). I can see a good argument for adding this sort of functionality
>> directly to numpy with a nice interface, though, since these kinds of
>> requirements seem quite common these days. Maybe an interface like
>>   # should base_alignment be in bits or in bytes?
>>   a = np.asarray([1, 2, 3], base_alignment=32)
>>   b = np.empty((10, 10), order="C", base_alignment=32)
>>   # etc.
>>   assert a.base_alignment == 32
>> which underneath tries to use posix_memalign/_aligned_malloc when
>> possible, or falls back on the overallocation trick otherwise?
> There is a thread about this from several years back. IIRC, David Cournapeau
> was interested in the same problem. At first glance, the alignment keyword
> looks interesting. One possible concern is keeping alignment for rows,
> views, etc., which is probably not possible in any sensible way. But people
> who need this most likely know what they are doing and just need memory
> allocated on the proper boundary.

Right, my intuition is that it's like order="C" -- if you make a new
array by, say, indexing, then it may or may not have order="C", no
guarantees. So when you care, you call asarray(a, order="C") and that
either makes a copy or not as needed. Similarly for base alignment.
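
To make the "copy only if needed" idea concrete, here is a rough sketch in
pure Python of what it could look like today, using the overallocation
trick mentioned above. The names aligned_empty and as_aligned are made up
for illustration; nothing like them exists in numpy, and alignment here is
counted in bytes (an assumption, since that question is still open).

```python
import numpy as np

def aligned_empty(shape, dtype=float, alignment=32):
    """Hypothetical helper: uninitialized C-order array whose data
    pointer is a multiple of `alignment` bytes."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize
    # Over-allocate by `alignment` bytes so that an aligned start
    # address is guaranteed to exist somewhere inside the buffer.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    offset = -buf.ctypes.data % alignment
    # The returned array is a view; it keeps `buf` alive via .base.
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)

def as_aligned(a, alignment=32):
    """Hypothetical analogue of asarray(a, order="C"): return `a`
    unchanged if its base address is already aligned, else copy."""
    a = np.asarray(a)
    if a.ctypes.data % alignment == 0:
        return a
    out = aligned_empty(a.shape, a.dtype, alignment)
    out[...] = a
    return out
```

As with order="C", a view taken from the result (say, a[1:]) carries no
alignment guarantee, so code that cares would call as_aligned again at
the point of use.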

I guess to push this analogy even further we could define a set of
array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2
alignment matters, I think, so the number of flags would remain
manageable?) That would make the C API easier to deal with too, no
need to add PyArray_FromAnyAligned.
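
For what it's worth, such flags could be derived from the data pointer
alone. The sketch below only illustrates the idea: base_alignment and
alignment_flags are invented names, the ALIGNED_n keys are the
hypothetical flags from the paragraph above rather than anything in
numpy, and the set of levels is an arbitrary choice.

```python
import numpy as np

def base_alignment(a):
    """Largest power of two (in bytes) dividing the array's data address."""
    addr = a.ctypes.data
    # The lowest set bit of the address is the largest power of two
    # that divides it.
    return addr & -addr

def alignment_flags(a, levels=(8, 16, 32, 64)):
    """Hypothetical ALIGNED_n flags, one per power-of-2 level."""
    align = base_alignment(a)
    return {f"ALIGNED_{n}": align % n == 0 for n in levels}
```

Because the levels are powers of two, ALIGNED_32 implies ALIGNED_16 and
ALIGNED_8, so the whole set collapses to one small integer if flag bits
turn out to be scarce.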

