[Numpy-discussion] Problems with Numexpr and discontiguous arrays
Tim Hochberg
tim.hochberg at ieee.org
Thu Oct 5 09:24:42 EDT 2006
Tim Hochberg wrote:
> David M. Cooke wrote:
>> On Wed, 04 Oct 2006 10:19:08 -0700
>> Tim Hochberg <tim.hochberg at ieee.org> wrote:
>>
>>
>>> Ivan Vilata i Balaguer wrote:
>>>
>>>> It seemed that discontiguous arrays worked OK in Numexpr since
>>>> r1977 or
>>>> so, but I have come across some alignment or striding problems
>>>> which can
>>>> be seen with the following code::
>>>>
>>> I looked at this just a little bit and clearly this bit from
>>> interp_body cannot work in the presence of recor arrays:
>>>
>>> //....
>>> intp sf1 = sb1 / sizeof(double); \
>>> //...
>>> #define f1 ((double *)x1)[j*sf1]
>>>
>>>
>>> There are clearly some assumptions that sb1 is evenly divisible by
>>> sizeof(double). Blech!. This is likely my fault, and I expect it
>>> won't be too horrible to fix, but I don't know that I'll have time
>>> immediately.
>>>
>>
>> My thinking is that this should be handled by a copy, so that the
>> opcodes
>> always work on contiguous data. The copy can be another opcode. One
>> advantage
>> of operating on contiguous data is that it's easier to use the
>> processor's
>> vector instructions, if applicable.
>>
>
> That would be easy to do. Right now the opcodes should work correctly
> on data that is spaced in multiples of the itemsize on the last axis.
> Other arrays are copied (no opcode required, it's embedded at the top
> of interp_body lines 64-80). The record array case apparently slips
> through the cracks when we're checking whether an array is suitable to
> be used correctly (interpreter.c 1086-1103). It would certainly not be
> any harder to only allow contiguous arrays than to correctly deal with
> record arrays. Only question I have is whether the extra copy will
> overwhelm the savings of that operating on contiguous data gives. The
> thing to do is probably try it and see what happens.
OK, I've checked in a fix for this that makes a copy when the array is
not strided in an even multiple of the itemsize. I first tried copying
for all discontiguous array, but this resulted in a large speed hit for
vanilla strided arrays (a=arange(10)[::2], etc.), so I was more frugal
with my copying. I'm not entirely certain that I caught all of the
problematic cases, so let me know if you run into any more issues like this.
-tim
More information about the NumPy-Discussion
mailing list