counting non-zero entries in an ndarray
What is the most efficient way to do the Matlab equivalent of nnz(M) (nnz = number-of-non-zeros function)?
I've tried Google, but no luck.
My assumption is that something like
a != 0
will be used, but I'm not sure then how to "count" the number of "True" entries.
TIA.
Ian
On Wednesday, December 22, 2010 07:16:17 am Ian Stokes-Rees wrote:
What is the most efficient way to do the Matlab equivalent of nnz(M) (nnz = number-of-non-zeros function)?
I've tried Google, but no luck.
My assumption is that something like
a != 0
will be used, but I'm not sure then how to "count" the number of "True" entries.
TIA.
Ian
one possibility:
len(where(a != 0)[0])
--
Thomas K. Gamble
Research Technologist, System/Network Administrator
Chemical Diagnostics and Engineering (C-CDE)
Los Alamos National Laboratory
MS-E543, p:505-665-4323 f:505-665-4267
There cannot be a crisis next week. My schedule is already full. Henry Kissinger
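A minimal sketch of how the "True" entries of a != 0 can be counted, assuming NumPy is imported as np (the thread itself uses bare names such as where and array, as in a pylab-style session). Both forms below should give the same count; the sample array is the one used in the timings later in the thread.

```python
import numpy as np

a = np.array([0, 1, 4, 76, 3, 0, 4, 67, 9, 5, 3, 9, 0, 5, 23, 3, 0, 5, 3, 3, 0, 5, 0])

# Boolean array: True where the entry is non-zero, False elsewhere.
mask = a != 0

# Summing a boolean array counts the True entries (True is treated as 1).
n_sum = int(mask.sum())

# where() on a condition returns a tuple of index arrays, one per dimension;
# for a 1-D array the length of the first one is the number of True entries.
n_where = len(np.where(mask)[0])

print(n_sum, n_where)
```

For a multi-dimensional array, mask.sum() still counts over all entries, while where returns one index array per dimension, each of the same length.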
To answer the part about the most efficient way to do that, here are some timings, starting with a small array:
In [1]: a = array([0,1,4,76,3,0,4,67,9,5,3,9,0,5,23,3,0,5,3,3,0,5,0])
In [8]: %timeit len(where(a!=0)[0])
100000 loops, best of 3: 6.54 us per loop
In [9]: %timeit (a!=0).sum()
100000 loops, best of 3: 9.81 us per loop
For this small array, the where option seems to be faster.
Now I create a larger array:
In [13]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a])
In [14]: %timeit len(where(a!=0)[0])
100000 loops, best of 3: 12.3 us per loop
In [15]: %timeit (a!=0).sum()
100000 loops, best of 3: 11 us per loop
Now the fastest way is using the sum. The where function has to build and
return an array of indices, one integer per non-zero entry, which is thrown
away as soon as its length is taken, whereas the sum only reduces the boolean
array to a single number. For a big array that is a lot of extra memory to
allocate and fill, and the difference increases fast...
In [20]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a])
In [21]: %timeit len(where(a!=0)[0])
10000 loops, best of 3: 79.1 us per loop
In [22]: %timeit (a!=0).sum()
10000 loops, best of 3: 24.5 us per loop
Regards,
Jonathan
On Wed, Dec 22, 2010 at 11:43 AM, Thomas K Gamble wrote:
one possibility:
len(where(a != 0)[0])
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
-- Jonathan Rocher, Enthought, Inc. jrocher@enthought.com 1-512-536-1057 http://www.enthought.com
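The comparison above can be repeated outside IPython with the standard timeit module. The following is only a sketch: the helper names are made up for illustration, np.tile stands in for the repeated hstack calls, and np.count_nonzero is an addition that is not mentioned in the thread (it is available in current NumPy releases). Absolute timings will differ by machine and NumPy version, so no numbers are claimed here.

```python
import timeit

import numpy as np

base = np.array([0, 1, 4, 76, 3, 0, 4, 67, 9, 5, 3, 9, 0, 5, 23, 3, 0, 5, 3, 3, 0, 5, 0])


def count_with_where(a):
    # Builds an index array of the non-zero positions, then takes its length.
    return len(np.where(a != 0)[0])


def count_with_sum(a):
    # Reduces the boolean mask directly to a single count.
    return int((a != 0).sum())


def count_with_count_nonzero(a):
    # Not from the thread: counts non-zero entries without an explicit mask.
    return int(np.count_nonzero(a))


candidates = (count_with_where, count_with_sum, count_with_count_nonzero)

# 1, 12 and 144 copies of the base array, mirroring the hstack steps above.
for copies in (1, 12, 144):
    a = np.tile(base, copies)
    for func in candidates:
        # Best of 3 runs of 1000 calls each, reported per call in microseconds.
        best = min(timeit.repeat(lambda: func(a), number=1000, repeat=3))
        print(f"{a.size:5d} entries  {func.__name__:26s} {best / 1000 * 1e6:8.2f} us")
```

All three helpers should return the same count for a given array, so the choice between them is purely a matter of speed and readability.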
On 12/22/10 9:16 AM, Ian Stokes-Rees wrote:
What is the most efficient way to do the Matlab equivalent of nnz(M) (nnz = number-of-non-zeros function)?
Thanks to all the various responses. I should have mentioned that I'm using scipy.sparse, and lil_matrix objects have a method "getnnz()" which gives me the number I want.
Ian
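For the scipy.sparse case, a small sketch of getnnz() on a lil_matrix, assuming SciPy is installed; the 3x3 example matrix is made up for illustration. When the sparse matrix is built from a dense array like this, the stored-entry count should agree with the dense non-zero count.

```python
import numpy as np
from scipy.sparse import lil_matrix

# Hypothetical example data: a small dense array with three non-zero entries.
dense = np.array([[0, 1, 0],
                  [4, 0, 0],
                  [0, 0, 76]])

m = lil_matrix(dense)

# Number of entries stored in the sparse matrix.
n_stored = m.getnnz()

# Non-zero count of the equivalent dense array, for comparison.
n_dense = int(np.count_nonzero(dense))

print(n_stored, n_dense)
```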
participants (4)
- Alan G Isaac
- Ian Stokes-Rees
- Jonathan Rocher
- Thomas K Gamble