[Numpy-discussion] subsampling arrays without loops

Moroney, Catherine M (398D) Catherine.M.Moroney at jpl.nasa.gov
Thu Oct 3 19:05:53 EDT 2013


I know I have a lot yet to learn about array striding tricks, so please
pardon the triviality of this question.

Here is the problem both in words and "dumb" Python:

I have a large NxM array that I want to break down into smaller nxn chunks,
where n divides evenly into both N and M.  Then I want to calculate the
fraction of pixels in each nxn chunk that meet a certain criterion: say (x >= 1) & (x <= 2).

Here is the "dumb" Python code:

import numpy

npix = 4
# data is the existing 2-D array; both dimensions are multiples of npix
num_true = numpy.zeros((data.shape[0]//npix, data.shape[1]//npix))

for iline in range(data.shape[0]//npix):
    for ismp in range(data.shape[1]//npix):
        excerpt = data[iline*npix:(iline+1)*npix, ismp*npix:(ismp+1)*npix]
        # count the pixels in this chunk with 1 <= x <= 2
        num_true[iline, ismp] = ((excerpt >= 1) & (excerpt <= 2)).sum()

So I'm looping through the 4x4 subsets in both dimensions, cutting out a chunk
of the data ("excerpt"), counting the number of pixels in that excerpt that
meet the criterion, and storing that count for each excerpt.

I want to avoid all the loops over iline and ismp.  What's the best way of doing this
in pure Python?  I could always write a Fortran/C routine for this task, but I want to
learn how best to do it with numpy.
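
For reference, a minimal sketch of one possible loop-free approach, not
necessarily the best one: build the boolean mask once, reshape it into a 4-D
array of npix x npix blocks, and sum over the two within-block axes.  The
function name count_in_blocks and its lo/hi parameters are hypothetical and
used only for illustration.  It assumes a 2-D array whose dimensions are both
multiples of npix, and a numpy version (1.7 or later) that accepts a tuple
for the axis argument of sum.

```python
import numpy

def count_in_blocks(data, npix, lo=1, hi=2):
    """Count, per npix x npix block, the pixels with lo <= x <= hi.

    Hypothetical helper for illustration; assumes data is 2-D and that
    npix divides both dimensions (otherwise reshape raises ValueError).
    """
    nlines, nsmps = data.shape
    # Boolean mask of pixels that meet the criterion
    mask = (data >= lo) & (data <= hi)
    # blocks[i, a, j, b] == mask[i*npix + a, j*npix + b]
    blocks = mask.reshape(nlines // npix, npix, nsmps // npix, npix)
    # Sum over the two within-block axes, leaving one count per block
    return blocks.sum(axis=(1, 3))
```

With this sketch, count_in_blocks(data, npix) would give the same counts as
num_true in the loop above, and dividing by float(npix * npix) would turn the
counts into fractions.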

Thank you for any advice,

Catherine



