
I am trying to efficiently sum over a subset of the elements of a matrix. In Matlab, this could be done like:

    a=[1,2,3,4,5,6,7,8,9,10]
    b = [1,0,0,0,0,0,0,0,0,1]
    res=sum(a(b)) %this sums the elements of a which have corresponding elements in b that are true

Is there anything similar in numarray (or numeric)? I thought masked arrays looked promising, but I find that masking 90% of the elements results in marginal speedups (~5%, instead of 90%) over the unmasked array.

Thanks!
Darren

On Wed, 2004-09-01 at 14:51, Darren Dale wrote:
> I am trying to efficiently sum over a subset of the elements of a matrix. In Matlab, this could be done like:
>     a=[1,2,3,4,5,6,7,8,9,10]
>     b = [1,0,0,0,0,0,0,0,0,1]
>     res=sum(a(b))
This needs to be sum(a(find(b))).
> Is there anything similar in numarray (or numeric)? I thought masked arrays looked promising, but I find that masking 90% of the elements results in marginal speedups (~5%, instead of 90%) over the unmasked array.
I don't think that's bad, and in fact it is substantially better than MATLAB. Consider the following clip from MATLAB Version 7:
a=randn(10000000,1); t=cputime;sum(a);e=cputime()-t
e = 0.1300
f=rand(10000000,1)<0.1; t=cputime;sum(a(find(f)));e=cputime()-t
e = 0.2200

In other words, masking off all but 10% of the elements of a 1e7 element array actually increased the CPU time required for the sum by about 70% (from 0.13 s to 0.22 s). In addition, I doubt you can measure CPU time for only a 10 element array. I had to use 1e7 elements in MATLAB on a 2.26 GHz P4 just to get the CPU time large enough to measure reasonably accurately. Also recall that it is a known characteristic of numarray that it is slow on small arrays in general.

--
Stephen Walton <stephen.walton@csun.edu>
Dept. of Physics & Astronomy, Cal State Northridge
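A rough way to repeat this comparison today is sketched below. It assumes NumPy (the successor to numarray and Numeric) rather than the libraries discussed in this thread, and the function name `time_sums` and its parameters are made up for illustration. Boolean indexing in NumPy copies the selected elements into a new array before summing them, so it would not be surprising for the masked sum to be no faster than the full one; actual timings depend on the machine.

```python
import timeit

import numpy as np


def time_sums(n=1_000_000, keep_fraction=0.1, repeats=5, seed=0):
    """Return (t_full, t_masked): best-of-`repeats` wall-clock seconds.

    Hypothetical helper for illustration; it is not part of NumPy.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(n)             # analogous to randn(n, 1)
    mask = rng.random(n) < keep_fraction   # boolean mask, roughly 10% True

    # Sum of every element.
    t_full = min(timeit.repeat(lambda: a.sum(), number=1, repeat=repeats))
    # Sum of the selected elements only; a[mask] builds a temporary copy.
    t_masked = min(timeit.repeat(lambda: a[mask].sum(), number=1, repeat=repeats))
    return t_full, t_masked


if __name__ == "__main__":
    t_full, t_masked = time_sums()
    print("full sum:   %.6f s" % t_full)
    print("masked sum: %.6f s" % t_masked)
```

Taking the minimum over several repeats is one common way to reduce noise from other processes; it measures wall-clock time, not CPU time as `cputime` does in MATLAB.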

Stephen Walton wrote:
> In addition, I doubt you can measure CPU time for only a 10 element array. I had to use 1e7 elements in MATLAB on a 2.26 GHz P4 just to get the CPU time large enough to measure reasonably accurately. Also recall that it is a known characteristic of numarray that it is slow on small arrays in general.
Sorry, I was giving the 10 element example for clarity. I am actually using arrays with over 6e6 elements. I just discovered compress; it works wonders in my situation. The following script runs in 1 second on my 2 GHz P4, WinXP. The same calculation using a masked array took 18 seconds:

    from numarray import *
    from time import clock

    clock()  # first call starts the timer
    Rx = ones((2500,2500))*12.5
    N = zeros((2500,2500),typecode=Bool)
    N[:250,:]=1  # select the first 250 rows (10% of the elements)
    trans = compress(N,Rx)  # keep only the elements of Rx where N is true
    temp = exp(2j*pi*(trans+trans))*exp(2j*pi*(trans))
    s = sum(temp.real)
    print s, clock()
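For comparison, here is a minimal sketch of the same calculation, assuming NumPy instead of numarray. The function name `selected_phase_sum` and its parameters are hypothetical, introduced only so the array size can be varied. Boolean indexing `Rx[N]` stands in for `compress(N, Rx)`; in NumPy, `np.compress` expects a 1-D condition, so the equivalent call would be on the raveled arrays.

```python
import numpy as np


def selected_phase_sum(shape=(2500, 2500), rows_kept=250, value=12.5):
    """Sum the real part of the phase factor over the selected elements.

    Hypothetical helper mirroring the numarray script in this thread.
    """
    Rx = np.full(shape, value)         # same as ones(shape) * 12.5
    N = np.zeros(shape, dtype=bool)    # boolean selection mask
    N[:rows_kept, :] = True            # keep the first `rows_kept` rows

    # 1-D copy of the selected elements, in row-major order.
    # Equivalent here: np.compress(N.ravel(), Rx.ravel())
    trans = Rx[N]

    temp = np.exp(2j * np.pi * (trans + trans)) * np.exp(2j * np.pi * trans)
    return temp.real.sum()


if __name__ == "__main__":
    print(selected_phase_sum())
```

The expensive `exp` calls run only on the selected elements, which is where the saving over a masked array of the full size would come from in this kind of calculation.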

On Wed, 2004-09-01 at 18:51, Darren Dale wrote:
> I am trying to efficiently sum over a subset of the elements of a matrix. In Matlab, this could be done like:
>     a=[1,2,3,4,5,6,7,8,9,10]
>     b = [1,0,0,0,0,0,0,0,0,1]
>     res=sum(a(b)) %this sums the elements of a which have corresponding elements in b that are true
If the mask is of boolean type (not integer) you can use it just like in MATLAB:
    >>> from numarray import *
    >>> import numarray.random_array as ra
    >>> a = ra.random(1000000)
    >>> sum(a)
    500184.16988508566
    >>> b = ra.random(1000000) < 0.1
    >>> sum(a[b])
    50331.373006955822
This should work for numarray only.

Paulo

--
Paulo José da Silva e Silva
Assistant Professor, Computer Science Dept.
Universidade de São Paulo - Brazil
e-mail: rsilva@ime.usp.br
Web: http://www.ime.usp.br/~rsilva
Theory is something we don't understand well enough to call practice.
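The distinction between a boolean mask and an integer array of 0s and 1s can be illustrated with the 10 element example from the original question. This is a sketch assuming NumPy rather than numarray; the variable names `b_int`, `b_bool`, `sum_bool`, `sum_int` and `sum_find` are chosen here for illustration.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
b_int = np.array([1, 0, 0, 0, 0, 0, 0, 0, 0, 1])  # integer 0/1 flags
b_bool = b_int.astype(bool)                       # true boolean mask

# Boolean mask: selects a[0] and a[9].
sum_bool = a[b_bool].sum()

# Integer array: treated as a list of indices, so it picks
# a[1], then a[0] eight times, then a[1] again. Not a mask.
sum_int = a[b_int].sum()

# Counterpart of MATLAB's find(b): indices of the nonzero entries.
sum_find = a[np.flatnonzero(b_int)].sum()

print(sum_bool, sum_int, sum_find)
```

With an integer array, NumPy raises no error and silently returns a different sum, so converting the flags to a boolean type first is the safer habit.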
participants (3): Darren Dale, Paulo J. S. Silva, Stephen Walton