[Numpy-discussion] Byte aligned arrays
heng at cantab.net
Thu Dec 20 03:12:28 EST 2012
On Wed, 2012-12-19 at 15:10 +0000, Nathaniel Smith wrote:
> >> > Is this something that can be rolled into Numpy (the feature, not
> >> > particular implementation or interface - though I'd be happy for it to
> >> > be so)?
> >> >
> >> > Regarding (b), I've written a test case that works for Linux on
> >> > with GCC (my platform!). I can test it on 32-bit Windows, but that's it.
> >> > Is ARM supported by Numpy? NEON would be great to include as well. What
> >> > other platforms might need this?
> >> Your code looks simple and portable to me (at least the alignment
> >> part). I can see a good argument for adding this sort of
> >> directly to numpy with a nice interface, though, since these kinds of
> >> requirements seem quite common these days. Maybe an interface like
> >> a = np.asarray([1, 2, 3], base_alignment=32)  # should this be in bits or in bytes?
> >> b = np.empty((10, 10), order="C", base_alignment=32)
> >> # etc.
> >> assert a.base_alignment == 32
> >> which underneath tries to use posix_memalign/_aligned_malloc when
> >> possible, or falls back on the overallocation trick otherwise?
> > There is a thread about this from several years back. IIRC, David
> > was interested in the same problem. At first glance, the alignment
> > looks interesting. One possible concern is keeping alignment for
> > views, etc., which is probably not possible in any sensible way. But
> > those who need this most likely know what they are doing and just need
> > memory allocated on the proper boundary.
> Right, my intuition is that it's like order="C" -- if you make a new
> array by, say, indexing, then it may or may not have order="C", no
> guarantees. So when you care, you call asarray(a, order="C") and that
> either makes a copy or not as needed. Similarly for base alignment.
> I guess to push this analogy even further we could define a set of
> array flags, ALIGNED_8, ALIGNED_16, etc. (In practice only power-of-2
> alignment matters, I think, so the number of flags would remain
> manageable?) That would make the C API easier to deal with too, no
> need to add PyArray_FromAnyAligned.
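
To make the flag idea concrete, here is a rough sketch of how the alignment of an existing array could be queried from Python today. The helper names base_alignment and is_aligned and the 4096-byte cap are assumptions made for illustration; they are not part of NumPy, and only ndarray.ctypes.data (the address of the first element) is existing API. Alignment is counted in bytes here.

```python
import numpy as np


def is_aligned(arr, alignment):
    """Return True if the array's data pointer is a multiple of `alignment` bytes.

    Hypothetical helper, roughly what an ALIGNED_<n> flag would report.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return arr.ctypes.data % alignment == 0


def base_alignment(arr, max_alignment=4096):
    """Return the largest power of two (capped at max_alignment) dividing the data address.

    Hypothetical helper; the cap is an arbitrary choice for this sketch.
    """
    address = arr.ctypes.data
    alignment = 1
    while alignment < max_alignment and address % (alignment * 2) == 0:
        alignment *= 2
    return alignment
```

Since only powers of two are considered, the result of base_alignment implies every smaller ALIGNED_<n> flag as well, which is why the set of flags would stay small.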
So, if I were to implement this, I presume the proper way would be
through modifications to multiarray?
Would this basic description be a reasonable initial target?
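
For reference, the overallocation fallback and the "copy only when needed" behaviour discussed in the quoted text can be sketched in pure Python roughly as follows. This is only an illustration under stated assumptions: aligned_empty and as_aligned are hypothetical names, alignment is taken in bytes, and a real implementation inside multiarray would presumably use posix_memalign/_aligned_malloc where available.

```python
import numpy as np


def aligned_empty(shape, dtype=float, order="C", alignment=32):
    """Allocate an uninitialised array whose first byte sits on an `alignment`-byte boundary.

    Hypothetical helper: overallocates a byte buffer and returns a view into it.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape, dtype=np.intp)) * dtype.itemsize
    # Overallocate so that an aligned address must exist inside the buffer.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    # Number of bytes to skip to reach the next aligned address.
    offset = -buf.ctypes.data % alignment
    # The returned view keeps `buf` alive through its .base chain.
    return buf[offset:offset + nbytes].view(dtype).reshape(shape, order=order)


def as_aligned(a, alignment=32):
    """Return `a` unchanged if already aligned, otherwise an aligned C-order copy."""
    a = np.asarray(a)
    if a.ctypes.data % alignment == 0:
        return a
    out = aligned_empty(a.shape, a.dtype, alignment=alignment)
    out[...] = a
    return out
```

As with asarray(a, order="C"), as_aligned makes no promise about views taken afterwards: slicing the result can land on an unaligned address again, so callers that care would re-apply it at the point of use.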