subsampling array without loops

I'm looking for a way to acccomplish the following task without lots of loops involved, which are really slowing down my code. I have a 128x512 array which I want to break down into 2x2 squares. Then, for each 2x2 square I want to do some simple calculations such as finding the maximum value, which I then store in a 64x256 array. Here is the actual code involved. It's only slightly more complex than what I described above, since it also involves doing a bit of masking on the 2x2 sub-arrays. Any hints are much appreciated! An inordinate amount of time is being spent in this function and another one like it. Catherine def calc_sdcm_at_rlra(self,iblock): npixl = self.nlineht/self.nlinerl npixs = self.nsmpht/self.nsmprl sdcm_out = numpy.array([Constants.CLOUDMASK_NR] *self.nlinerl*self.nsmprl,'int8') \ .reshape(self.nlinerl,self.nsmprl) for ilinerl in range(0,self.nlinerl): for ismprl in range(0,self.nsmprl): height = self.data[iblock].height[ilinerl*2:ilinerl*2 +2, ismprl*2:ismprl*2+2] sdcm = self.data[iblock].sdcm[ilinerl*2:ilinerl*2 +2, ismprl*2:ismprl*2+2] source = self.data[iblock].heightsrc [ilinerl*2:ilinerl*2+2, ismprl*2:ismprl*2+2] mask1 = (source == Constants.HEIGHT_STEREO) mask2 = ( (source == Constants.HEIGHT_SURFACE) | \ (source == Constants.HEIGHT_DEFAULT) ) if (mask1.any()): loc = height[mask1].argmax() sdcm_out[ilinerl,ismprl] = sdcm[mask1].ravel()[loc] elif (mask2.any()): loc = height[mask2].argmax() sdcm_out[ilinerl,ismprl] = sdcm[mask2].ravel()[loc] return sdcm_out

I'm looking for a way to acccomplish the following task without lots of loops involved, which are really slowing down my code.
I have a 128x512 array which I want to break down into 2x2 squares. Then, for each 2x2 square I want to do some simple calculations such as finding the maximum value, which I then store in a 64x256 array.
Perhaps you could take the relevant slices of the arrays with appropriate striding: a = numpy.arange(128*512).reshape((128, 512)) top_left = a[::2, ::2] top_right = a[::2, 1::2] bottom_left = a[1::2, ::2] bottom_right = a[1::2, 1::2] tiles = numpy.array([top_left, top_right, bottom_left, bottom_right]) maxes = tiles.max(axis=0) Similarly if you want overlapping tiles, you could leave out the final :2 in the slice specifications above. Zach
Here is the actual code involved. It's only slightly more complex than what I described above, since it also involves doing a bit of masking on the 2x2 sub-arrays.
Any hints are much appreciated! An inordinate amount of time is being spent in this function and another one like it.
Catherine
def calc_sdcm_at_rlra(self,iblock):
npixl = self.nlineht/self.nlinerl npixs = self.nsmpht/self.nsmprl
sdcm_out = numpy.array([Constants.CLOUDMASK_NR] *self.nlinerl*self.nsmprl,'int8') \ .reshape(self.nlinerl,self.nsmprl)
for ilinerl in range(0,self.nlinerl): for ismprl in range(0,self.nsmprl):
height = self.data[iblock].height[ilinerl*2:ilinerl*2 +2, ismprl*2:ismprl*2+2] sdcm = self.data[iblock].sdcm[ilinerl*2:ilinerl*2 +2, ismprl*2:ismprl*2+2] source = self.data[iblock].heightsrc [ilinerl*2:ilinerl*2+2, ismprl*2:ismprl*2+2]
mask1 = (source == Constants.HEIGHT_STEREO) mask2 = ( (source == Constants.HEIGHT_SURFACE) | \ (source == Constants.HEIGHT_DEFAULT) )
if (mask1.any()): loc = height[mask1].argmax() sdcm_out[ilinerl,ismprl] = sdcm[mask1].ravel() [loc] elif (mask2.any()): loc = height[mask2].argmax() sdcm_out[ilinerl,ismprl] = sdcm[mask2].ravel() [loc]
return sdcm_out
_______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion

2008/8/22 Catherine Moroney <Catherine.M.Moroney@jpl.nasa.gov>:
I'm looking for a way to acccomplish the following task without lots of loops involved, which are really slowing down my code.
I have a 128x512 array which I want to break down into 2x2 squares. Then, for each 2x2 square I want to do some simple calculations such as finding the maximum value, which I then store in a 64x256 array.
You should be able to do some of this with reshape and transpose: In [1]: import numpy as np In [3]: A = np.zeros((128,512)) In [4]: B = np.reshape(A,(64,2,256,2)) In [5]: C = np.transpose(B,(0,2,1,3)) In [6]: C.shape Out[6]: (64, 256, 2, 2) Now C[i,j] is the 2 by 2 array [ [A[2*i,2*j], A[2*i,2*j+1]], [A[2*i+1,2*j], A[2*i+1, 2*j+1]] ] (or maybe I got those indices the wrong way around.) Now most operations you want to do on the two-by-two matrices can be done, using ufuncs, on the whole array at once. For example if you wanted the max of each two-by-two matrix, you'd write: In [7]: D = np.amax(np.amax(C,axis=-1),axis=-1) In [8]: D.shape Out[8]: (64, 256) Anne

2008/8/26 Anne Archibald <peridot.faceted@gmail.com>:
2008/8/22 Catherine Moroney <Catherine.M.Moroney@jpl.nasa.gov>:
I'm looking for a way to acccomplish the following task without lots of loops involved, which are really slowing down my code.
I have a 128x512 array which I want to break down into 2x2 squares. Then, for each 2x2 square I want to do some simple calculations such as finding the maximum value, which I then store in a 64x256 array.
You should be able to do some of this with reshape and transpose: In [1]: import numpy as np
In [3]: A = np.zeros((128,512))
In [4]: B = np.reshape(A,(64,2,256,2))
In [5]: C = np.transpose(B,(0,2,1,3))
In [6]: C.shape Out[6]: (64, 256, 2, 2)
Or you can obtain a similar effect using my new favorite hammer: from numpy.lib import stride_tricks rows, cols = x.shape el = x.dtype.itemsize y = stride_tricks.as_strided(x, shape=(rows/2, cols/2, 2, 2), strides=(el*2*cols, el*2, el*cols, el)) Cheers Stéfan
participants (4)
-
Anne Archibald
-
Catherine Moroney
-
Stéfan van der Walt
-
Zachary Pincus