Hi,

So, I'm working on a radiosity renderer, and it's basically finished.  I'm now trying to optimize it.  Currently, by far the most computationally expensive operation is visibility testing, where pixels are counted by the type of patch that was drawn on them.  Here's my current code, which I'm fairly sure can be optimized:

line 1) pixel_data = np.asarray(255.0*glReadPixels(0,0,patch_view.width,patch_view.height,GL_RGB,GL_FLOAT), dtype=np.int32).reshape((-1,3))  # int32: the summed indices overflow int16
line 2) pixel_data[:,1] *= 255
line 3) pixel_data[:,0] *= (255*255)
line 4) pixel_data = np.sum(pixel_data,1)
line 5) for pixel in pixel_data:
line 6)    if pixel != 0:
line 7)        try:    self.visibility[poly1num][pixel-1]
line 8)        except: self.visibility[poly1num][pixel-1] = 0
line 9)        self.visibility[poly1num][pixel-1] += 1

The basic idea:
Convert an array of RGB triplets into an array of unique numbers.  The triplet [0,0,1] maps to 1 and the triplet [255,255,255] maps to 16646655 (255*255*255 + 255*255 + 255).  The triplet [0,0,0], mapping to 0, is always discarded.  The resulting integers, in the range [1,16646655], are the (index+1)'s of a list, self.visibility[poly1num].  For each integer "i" in the array, the value in self.visibility[poly1num][i-1] must be incremented by 1.
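
To make the mapping concrete, here is a minimal sketch of the conversion on its own, assuming the triplets are already integers. The function name triplets_to_indices is hypothetical and not part of the renderer.

```python
import numpy as np

def triplets_to_indices(rgb):
    """Map (N, 3) integer RGB triplets to single integers.

    Weights follow the description above: red counts 255*255,
    green counts 255, blue counts 1.
    """
    # int32 is used because the largest index does not fit in int16.
    rgb = np.asarray(rgb, dtype=np.int32)
    weights = np.array([255 * 255, 255, 1], dtype=np.int32)
    # The matrix-vector product is the weighted sum over the three channels.
    return rgb @ weights

example = triplets_to_indices([[0, 0, 0], [0, 0, 1], [0, 1, 0], [1, 0, 0]])
print(example)  # [    0     1   255 65025]
```

Note that with weights of 255, the triplets [0,0,255] and [0,1,0] both give 255, so the numbers are only unique if each channel stays below 255.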

By line:
line 1: convert the read pixels from the range [0.0,1.0] (as floats) to the range [0,255] (as integers).  Change the shape from (width,height,3) to (width*height,3).
lines 2, 3, and 4: each pixel's color encodes a patch index.  The green channel counts 255 times as much as the blue.  The red, 255 times as much as the green.  This way, roughly 255^3 possible pixel types can be had.  Sum the weighted red, green, and blue channels to arrive at a single number, which is the patch index (0 through 16646655 if every channel can reach 255).
line 5: for each index in the data
line 6: I'm reserving the index [0,0,0] for no patch, so if the index is that color, no patch was drawn. 
lines 7, 8, and 9: increment self.visibility[poly1num][pixel-1] by 1. 
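
For comparison, the per-pixel loop in lines 5 through 9 could possibly be replaced by one counting pass. This is only a sketch, not a tested drop-in: it assumes self.visibility[poly1num] behaves like a dict (which the try/except suggests), and count_patch_pixels is a hypothetical helper name.

```python
import numpy as np

def count_patch_pixels(pixel_indices, counts=None):
    """Add per-patch pixel counts to `counts`, keyed by (pixel - 1).

    Index 0 means "no patch" and is skipped, as in the original loop.
    """
    if counts is None:
        counts = {}
    pixel_indices = np.asarray(pixel_indices)
    # Drop the reserved "no patch" value before counting.
    nonzero = pixel_indices[pixel_indices != 0]
    # np.unique returns each distinct index once, with how often it occurs.
    values, freqs = np.unique(nonzero, return_counts=True)
    # This loop runs once per distinct patch, not once per pixel.
    for value, freq in zip(values.tolist(), freqs.tolist()):
        counts[value - 1] = counts.get(value - 1, 0) + freq
    return counts

counts = count_patch_pixels([0, 3, 3, 1, 0, 3])
print(counts)  # {0: 1, 2: 3}
```

In the renderer this would presumably be called as count_patch_pixels(pixel_data, self.visibility[poly1num]) after line 4, so the Python-level work scales with the number of visible patches rather than with width*height.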

I suspect this code is the bottleneck.  It's taking about an hour to execute this process 8720 times when width and height are both 512, which works out to roughly 0.4 seconds per pass over 262144 pixels.  I expect it can be greatly improved in many ways; if not all over, certainly in lines 5 through 9.

Help?

Thanks,
Ian