[Numpy-discussion] calculating weighted majority using two 3D arrays

Thu Mar 6 01:34:24 EST 2008

Gregory, Matthew wrote:
> Hi list,
> 
> I'm a definite newbie to numpy, but finding the library to be incredibly
> useful.
> 
> I'm trying to calculate a weighted majority using numpy functions.  I
> have two sets of image stacks (one is values, the other weights) that I
> read into 3D numpy arrays.  Assuming I read in a 100 row x 100 col image
> subset consisting of ten images each, I have two arrays called values
> and weights with the following shape:
> 
> values.shape = (10, 100, 100)
> weights.shape = (10, 100, 100)

You may need to be a bit more specific about what you mean by weighted 
majority. What is the range of values for values and weights, 
specifically? This sounds a lot like pixel classification where each 
pixel is classified with a majority vote over its weights and values. Is 
that what you're trying to do?

Many numpy functions (e.g. mean, max, min, sum) have an axis parameter, 
which specifies the axis along which the statistic is computed. Omitting 
the axis parameter causes the statistic to be computed over all values 
in the multidimensional array.
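
As a small illustration of the axis parameter, here is a sketch on a 
made-up stack of 2 images of 2 rows x 3 columns (the array name 'stack' 
and its contents are just assumptions for the example, not your data):

```python
import numpy as np

# Hypothetical stack: 2 images, each 2 rows x 3 columns.
stack = np.arange(12, dtype=float).reshape(2, 2, 3)

# No axis argument: the sum runs over every element and gives a scalar.
total = stack.sum()

# axis=0: the sum runs across the images, giving one value per pixel.
per_pixel = stack.sum(axis=0)

print(total)            # 66.0
print(per_pixel.shape)  # (2, 3)
```

With your arrays, axis=0 is the image axis, so reducing along it should 
leave a (100, 100) result.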

Suppose the 'values' array contains floating point numbers in the range 
-1 to 1 and a larger absolute value gives a larger confidence. Also 
suppose the weights are floating point numbers between 0 and 1. The 
weighted majority vote for pixel i,j over 10 real-valued (confidence-rated) 
votes, each vote having a separate weight, is computed by

   w_vote = numpy.sign((values[:,i,j]*weights[:,i,j]).sum())

This can be vectorized to give a weighted majority vote for each pixel 
by doing

   w_vote = numpy.sign((values*weights).sum(axis=0))

The values*weights expression gives a weighted prediction. This also 
works if the 'values' are just predictions from the set {-1, 1}, i.e. 
there are ten classifiers, each of which predicts either -1 or 1 on each pixel.
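
To check that the vectorized form agrees with a per-pixel double loop, 
here is a rough sketch on a tiny made-up example (3 classifiers voting 
-1 or 1 on a 2 x 2 image; the numbers are invented for illustration and 
assume the sign-of-weighted-sum definition above is what you want):

```python
import numpy as np

# Hypothetical data: 3 classifiers, 2 x 2 image, votes in {-1, 1}.
values = np.array([[[ 1., -1.], [ 1.,  1.]],
                   [[-1., -1.], [ 1., -1.]],
                   [[-1.,  1.], [-1., -1.]]])
# Hypothetical weights in [0, 1], one per vote.
weights = np.array([[[0.75, 0.25], [0.50, 0.50]],
                    [[0.25, 0.25], [0.25, 0.25]],
                    [[0.25, 1.00], [0.25, 0.25]]])

# Vectorized: weight every vote, sum over the classifier axis, take the sign.
w_vote = np.sign((values * weights).sum(axis=0))

# Equivalent double loop over rows and columns, for comparison.
rows, cols = values.shape[1:]
w_loop = np.empty((rows, cols))
for i in range(rows):
    for j in range(cols):
        w_loop[i, j] = np.sign((values[:, i, j] * weights[:, i, j]).sum())

print(w_vote)
print(np.array_equal(w_vote, w_loop))
```

Note that numpy.sign returns 0 when the weighted votes cancel exactly 
(pixel 1,1 here), so you may want to decide how ties should be handled. 
Pixel 0,0 shows the effect of the weights: two of the three votes are 
-1, but the single heavily weighted +1 vote wins.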

I hope this helps.

Damian

> At this point I need to call my user-defined function to calculate the
> weighted majority which should return a value for each 'pixel' in my 100
> x 100 subset.  The way I'm doing it now (which I assume is NOT optimal)
> is to pass values[:,i,j] and weights[:,i,j] to my function in a double
> loop for i rows and j columns.  I then build up the return values into a
> subsequent 2D array.
> 
> It seems like I should be able to use vectorize() or apply_along_axis()
> to do this, but I'm not clever enough to figure this out.
> Alternatively, should I be structuring my initial data differently so
> that it's easier to use one of these functions.  The only way I can
> think about doing that would be to store the two 10-item arrays into a
> tuple and then make an array of these tuples, but that seemed overly
> complicated.  Or potentially, is there a way to calculate a weighted
> majority just using standard numpy functions??
> 
> Thanks for any suggestions,
> matt