I've finally caught up with the discussion on aligned allocators for NumPy. In general I'm favorable to the idea, although it is not as easy to implement in 1.0.X because of the possible need to change the C-API.

The Python solution is workable and would just require a function call on the Python side (which one could call from the C side as well with little difficulty; I believe Chuck Harris already suggested such a function). So I think at least the Python functions are an easy addition for 1.0.4, along with simple tests for alignment (although a.ctypes.data % 16 is pretty simple and probably doesn't warrant a new function).

I'm a bit more resistant to the more involved C code in the patch provided with #568, because of the requested new additions to the C-API, but I understand the need.

I'm currently also thinking heavily about using SIMD intrinsics in ufunc inner loops, but will not likely get those in before 1.0.4. Unfortunately, all ufuncs that take advantage of SIMD instructions will have to handle the unaligned portions, which may occur even if the start of the array is aligned, so the problem of thinking about alignment does not go away there with a simplified function call.

A simple addition is an NPY_ALIGNED_16 and NPY_ALIGNED_32 flag for PyArray_FromAny that could adjust the data pointer as needed to get at least those kinds of alignment. We can't change the C-API for PyArray_FromAny to accept an alignment flag, and I'm pretty loath to do that even for 1.1.

Is there a consensus? What do others think of the patch in ticket #568? Is there a need to add general-purpose aligned memory allocators to NumPy without a corresponding array_allocator? I would think that PyArray_FromAny and PyArray_NewFromDescr with aligned memory are more important, which I think we could do with flag bits.

-Travis O.
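
For reference, here is a rough sketch of what the Python-side approach could look like: over-allocate a byte buffer, then slice it so the data pointer lands on the requested boundary. The names aligned_empty and is_aligned are hypothetical, chosen only for illustration; they are not part of NumPy and not taken from the patch in #568.

```python
import numpy as np


def aligned_empty(shape, dtype=np.float64, alignment=16):
    """Hypothetical helper: uninitialized array whose data pointer is a
    multiple of `alignment` bytes."""
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape, dtype=np.intp)) * dtype.itemsize
    # Over-allocate by `alignment` bytes so an aligned start always fits.
    buf = np.empty(nbytes + alignment, dtype=np.uint8)
    # Bytes to skip to reach the next multiple of `alignment`.
    offset = -buf.ctypes.data % alignment
    # The result is a view; it keeps `buf` alive through its base.
    return buf[offset:offset + nbytes].view(dtype).reshape(shape)


def is_aligned(a, alignment=16):
    """The simple test from the post: a.ctypes.data % alignment == 0."""
    return a.ctypes.data % alignment == 0


a = aligned_empty((100, 3), dtype=np.float64, alignment=16)
tail = a.ravel()[1:]  # view starting one float64 (8 bytes) later
print(is_aligned(a), is_aligned(tail))
```

The second check illustrates the SIMD concern: a view that starts one element into an aligned float64 array is no longer 16-byte aligned, so inner loops would still have to handle unaligned data.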