[Numpy-discussion] Need help for implementing a fast clip in numpy (was slow clip)

David Cournapeau david at ar.media.kyoto-u.ac.jp
Fri Jan 12 00:08:29 EST 2007

Christopher Barker wrote:
>> autogen works well enough for me;
> I didn't know about autogen -- that may be all we need.
numpy has code which already does something similar to autogen: you 
declare a function, and some template with a generic name, and the code 
generator replaces the generic name and type with some values. All the 
.src files in numpy/core follow this pattern.
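
To make the idea concrete, here is a minimal sketch of the same pattern using only the C preprocessor. It is an analogy, not numpy's .src generator: the macro DEFINE_CLIP and the function names clip_float, clip_double and clip_int are hypothetical, chosen for this illustration only.

```c
#include <stddef.h>

/* Hypothetical macro standing in for a template: NAME and TYPE play the
 * role of the generic name and type that a code generator would fill in. */
#define DEFINE_CLIP(NAME, TYPE)                                   \
    static void NAME(TYPE *data, size_t n, TYPE lo, TYPE hi)      \
    {                                                             \
        size_t i;                                                 \
        for (i = 0; i < n; ++i) {                                 \
            if (data[i] < lo) {                                   \
                data[i] = lo;                                     \
            } else if (data[i] > hi) {                            \
                data[i] = hi;                                     \
            }                                                     \
        }                                                         \
    }

/* One expansion per concrete type. */
DEFINE_CLIP(clip_float, float)
DEFINE_CLIP(clip_double, double)
DEFINE_CLIP(clip_int, int)
```

Each expansion yields an ordinary typed function, so clip_double(buf, n, 0.0, 1.0) clips a buffer of doubles in place.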
>> Now, I didn't know that clip was supposed to handle arrays as min/max 
>> values.
> one more nifty feature...And if you want to support broadcasting, even 
> more so!
>> At first, I didn't understand the need to care about 
>> contiguous/non contiguous; having non scalar for min/max makes it 
>> necessary to have special case for non contiguous.
> I'm confused. The issue is that you can't just increment the pointer to 
> get the next element if the array is non-contiguous: you need to do all 
> the stride math.
Ok, so we don't mean the same thing by contiguous, and I should check 
that my definition is the actual one... For me, contiguous means that 
the array has C order, and a non contiguous array has a 'random' order, 
but you can still go to the next element in the buffer by using standard 
C array addressing. In my mind, contiguous is about the relationship 
between the indexing of the array in C and the math indexing.
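
For reference, the stride math Christopher describes can be sketched in plain C as follows. This is only an illustration, with no numpy headers involved: the function clip_strided is hypothetical, and it assumes double elements and a single byte stride between consecutive elements, in the spirit of a strides entry.

```c
#include <stddef.h>

/* Clip n doubles in place. `stride` is the distance in bytes between
 * consecutive elements, so the next element is reached by adding the
 * stride to a byte pointer rather than by incrementing a double pointer. */
static void clip_strided(char *data, size_t n, ptrdiff_t stride,
                         double lo, double hi)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        double *p = (double *)(data + (ptrdiff_t)i * stride);
        if (*p < lo) {
            *p = lo;
        } else if (*p > hi) {
            *p = hi;
        }
    }
}
```

For a 2x3 row-major buffer buf of doubles, column 1 would be clipped with clip_strided((char *)(buf + 1), 2, 3 * sizeof(double), lo, hi). Every element visited is aligned, yet a plain data[i] would not walk that column.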

According to the numpy ebook, the data buffer may:
- not be aligned on word boundaries -> NPY_ALIGNED
- not be native endianness -> NPY_NOTSWAPPED
- not be C contiguous (last index does not vary fastest) -> NPY_CONTIGUOUS.

I thought that as long as NPY_ALIGNED is set, you are sure that 
array->data[i] is the ith element of the buffer with the datatype of the 
array?

If the data are not aligned or not native endian, I just use the 
existing implementation; if you are not using the CPU endianness or 
alignment, you cannot expect to do things at a decent speed anyway.

In my code, I check alignment, endianness and whether min/max are 
scalars. If any of these conditions is not met, I just fall back on the 
old implementation for now, which should make it easy to extend if 
necessary.
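
A rough sketch of that dispatch, under stated assumptions: the SK_* flag bits and the functions clip_fast, clip_generic and clip_dispatch are hypothetical names for this illustration. They are not numpy's flags or functions, and this is not the actual patch.

```c
#include <stddef.h>

/* Hypothetical flag bits for this sketch; they are not numpy's values. */
enum {
    SK_ALIGNED       = 1 << 0,  /* data aligned for its type */
    SK_NATIVE        = 1 << 1,  /* data in the CPU's byte order */
    SK_SCALAR_BOUNDS = 1 << 2   /* min and max are both scalars */
};
#define SK_FAST_PATH (SK_ALIGNED | SK_NATIVE | SK_SCALAR_BOUNDS)

/* Stand-in for the specialized loop. */
static void clip_fast(double *data, size_t n, double lo, double hi)
{
    size_t i;
    for (i = 0; i < n; ++i) {
        if (data[i] < lo) {
            data[i] = lo;
        } else if (data[i] > hi) {
            data[i] = hi;
        }
    }
}

/* Stand-in for the old general implementation. In this sketch it does the
 * same work; a real one would also handle swapped or misaligned data. */
static void clip_generic(double *data, size_t n, double lo, double hi)
{
    clip_fast(data, n, lo, hi);
}

/* Returns 1 if the fast path was taken, 0 if the fallback was used. */
static int clip_dispatch(double *data, size_t n, unsigned flags,
                         double lo, double hi)
{
    if ((flags & SK_FAST_PATH) == SK_FAST_PATH) {
        clip_fast(data, n, lo, hi);
        return 1;
    }
    clip_generic(data, n, lo, hi);
    return 0;
}
```

Keeping the test in one place means a new special case only needs one more flag bit and one more branch, and everything else keeps going through the old code.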


