[Numpy-discussion] Fast histogram

Zachary Pincus zachary.pincus at yale.edu
Thu Apr 17 12:02:55 EDT 2008


Hi folks,

I'm working on a live-video display for some microscope control tools
I'm building. For this, I need a fast histogram function to work on
large-ish images (1000x2000 or so) at video rate, with cycles left
over for more interesting calculations (like autofocus).

Now, numpy.histogram is a bit slower than I'd like, probably because
it's pretty general (and of course cf. the recent discussion about its
speed). I just need evenly spaced bins within a set range. This is easy
enough to do with a C extension, or perhaps even Cython, but before I go
there, I was wondering if there's a numpy function that can help.

Here's what I have in mind:

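import numpy

def histogram(arr, bins, range):
   # Evenly spaced bins over `range`; out-of-range values land in the end bins.
   lo, hi = range
   indices = numpy.clip(((arr.astype(float) - lo) * bins / (hi - lo)).astype(int), 0, bins-1)
   counts = numpy.zeros(bins, numpy.uint32)
   # Iterate over individual pixels (.flat), not over the rows of a 2-D image.
   for i in indices.flat:
     counts[i] += 1
   return counts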

Now, clearly, the last loop is what needs speeding up. Are there any  
numpy functions that can do this kind of operation? Also perhaps  
unnecessarily slow is the conversion of 'arr' to a float -- I do this  
to avoid overflow issues with integer arrays.
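
One candidate for replacing that loop is numpy.bincount, which counts
occurrences of non-negative integers in a flat array. The following is
only a minimal sketch under that assumption, not a benchmarked solution;
the names fast_histogram and value_range are hypothetical, and the
minlength argument assumes a numpy version that provides it.

```python
import numpy as np

def fast_histogram(arr, bins, value_range):
    """Histogram with evenly spaced bins; out-of-range values go to the end bins."""
    lo, hi = value_range
    scale = bins / float(hi - lo)
    # Flatten first so a 2-D image is treated as a sequence of pixels.
    # The float conversion avoids overflow with small integer dtypes.
    idx = ((arr.ravel().astype(np.float64) - lo) * scale).astype(np.intp)
    # Clamp in place so every index is a valid bin number.
    np.clip(idx, 0, bins - 1, out=idx)
    # bincount does the counting loop in compiled code.
    return np.bincount(idx, minlength=bins)
```

Calling fast_histogram(image, 256, (0, 256)) on an 8-bit image would
return an array of 256 counts whose sum equals image.size. Whether this
is fast enough at video rate would still need to be measured.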

Any advice? Should I go ahead and write this up in C (easy enough), or
can I do this in numpy? If I use a pure-numpy approach, I'll probably
speed up the index-computation line with numexpr.

Thanks,
Zach


