[Numpy-discussion] Need help for implementing a fast clip in numpy (was slow clip)

Fri Jan 12 19:09:47 EST 2007

I think it may have all been cleared up now, but just in case:

 >>> a
array([[[ 0,  1,  2,  3],
         [ 4,  5,  6,  7],
         [ 8,  9, 10, 11]],

        [[12, 13, 14, 15],
         [16, 17, 18, 19],
         [20, 21, 22, 23]]])
 >>> a.flags
   C_CONTIGUOUS : True
   F_CONTIGUOUS : False
   OWNDATA : True
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False

So here is the most common case: Contiguous, aligned, etc. This means 
the data buffer is a "normal C array", and you can iterate through it in 
the normal C way.

 >>> b = a[:,2,:]
 >>> b
array([[ 8,  9, 10, 11],
        [20, 21, 22, 23]])

So b is a slice pulled out of a, sharing the same data black, but not 
all of it!

 >>> b.flags
   C_CONTIGUOUS : False
   F_CONTIGUOUS : False
   OWNDATA : False
   WRITEABLE : True
   ALIGNED : True
   UPDATEIFCOPY : False
 >>>

So it is not longer a Contiguous array. This means that you are looking 
at the same block of memory as above, but only some of it is used for 
this array, and you need to use the strides to get at the elements that 
are part of this array.

This is all aside from alignment issues, of which I know nothing.

> So the data buffer in numpy is not really a C array, but more like a 
> structure ? (I am  talking about the actual binary data)

Is is a C array of bytes, and probably of the given type, but not all of 
it is used for this particular array object, or, in the case of Fortran 
ordering, it's used in a different order that a conventional C n-d array.

> "for a numpy array a of eg float32, am I guaranteed that 
> a->data[sizeof(float32) * i] for 0 <= i < a.size gives me all the items 
> of a, even for non contiguous arrays ?"

I think you got the answer to this, but in case you didn't:

No, only if it's contiguous (I'm not sure about alignment)

>> First, it's not that important if the array is contiguous for this 
>> > sort of thing. What you really care about is whether it's 
>> > simply-strided (or maybe single-strided would be a better term)

but how do you get single strided? this is what always made it hard for 
me to know how to write this kind of code.

>> > Anyway, the last dimension of the array can be strided without making 
>> > things more difficult. All you need to be able to do is to address the 
>> > elements of the array as thedata[offset + stride*index].

But how do you make it that simple for n-d arrays? that only works for 
1-d. What I have come up with are some macros for the common cases -- 
1-d, 2-d, 3-d:

#define ARRAYVAL2(aType,a,i,j) ( *(aType *)(a->data + 
(i)*a->strides[0] + (j)*a->strides[1]))	

#define ARRAYVAL3(aType,a,i,j,k) ( *(aType *)(a->data + 
(i)*a->strides[0] + (j)*a->strides[1] + (k)*a->strides[2]))	

(this was for Numeric -- maybe it's slightly different no)

This involves a fair bit of math for each index operation -- it could 
really slow down a simple function like clip.

To do this for the general case, the only thing I could come up with was 
recursion, which seemed a bit heavy-weight, so I went with looping 
through all the indices to compute the offset. Ugly and slow.

> Now, it is exposed in the concept of an array iterator.  Anybody 
> can take advantage of it as it there is a C-API call to get an array 
> iterator from the array

Is this iterator as efficient as incrementing the index for contiguous 
arrays? i.e. is there any point of special casing contiguous arrays if 
you use it?

-Chris

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov