[Numpy-discussion] non-linear array manipulation

Wed Aug 13 21:55:19 EDT 2008

On Tue, 12 Aug 2008 10:37:51 -0400, Gong, Shawn (Contractor) wrote:
> The following array manipulation takes long time because I can't find
> ways to do in row/column, and have to do cell by cell.  Would you check
> to see if there is a nicer/faster way for this non-linear operation?
> 
> for i in range(rows):
>   for j in range(columns):
> 	a[i][j] = math.sqrt( max(0.0, a[i][j]*a[i][j] - b[j]*c[j]) )

In order to figure out how to do things like this efficiently, I like to 
write out the mathematical formula in index (subscript) notation first:

a_ij = sqrt( max(0.0, a_ij*a_ij - b_j*c_j) )

Now this could be done on an element-wise basis if you redefine b and c 
as matrices in which every row repeats the original vector:
  B_ij = b_j     and     C_ij = c_j
  => a_ij = sqrt( max(0.0, a_ij*a_ij - B_ij*C_ij) )

Fortunately, with NumPy this is easy.  Indexing with newaxis returns a 
view, so building B and C doesn't require any data copying or extra 
memory use:

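from numpy import newaxis, sqrt, maximum

B = b[newaxis, :]    # shape (1, columns), a view of b
C = c[newaxis, :]    # shape (1, columns), a view of c
a = sqrt(maximum(0.0, a*a - B*C))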

That's your solution.  It's a standard application of the broadcasting 
technique, which is crucial for time- and memory-efficient array-based 
algorithms.  It is explained in detail in the NumPy tutorial.
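
If you want to convince yourself that the broadcast version does the same 
thing as the original loop, here is a minimal sketch.  The array sizes and 
the random test data are made up purely for illustration, and I'm assuming 
a, b and c are NumPy arrays of shapes (rows, columns), (columns,) and 
(columns,):

```python
import math
import numpy as np

# Hypothetical sizes and random test data, for illustration only.
rng = np.random.default_rng(0)
rows, columns = 4, 3
a = rng.random((rows, columns))
b = rng.random(columns)
c = rng.random(columns)

# Reference result: the original cell-by-cell loop.
expected = np.empty_like(a)
for i in range(rows):
    for j in range(columns):
        expected[i, j] = math.sqrt(max(0.0, a[i, j] * a[i, j] - b[j] * c[j]))

# Broadcast version: B and C have shape (1, columns) and are views of b and c.
B = b[np.newaxis, :]
C = c[np.newaxis, :]
result = np.sqrt(np.maximum(0.0, a * a - B * C))

# B shares its memory with b, so no data was copied to build it.
shares_memory = np.shares_memory(B, b)
matches = np.allclose(result, expected)
```

Note that the expression a*a - B*C still allocates temporary arrays of the 
full (rows, columns) shape; it is only the construction of B and C that is 
free.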

Hope that helps,
 Dan