[Numpy-discussion] counting non-zero entries in an ndarray

Jonathan Rocher jrocher at enthought.com
Wed Dec 22 15:29:54 EST 2010


To answer the part about the most efficient way to do that, here are some
timings in IPython (array, where and hstack are the NumPy functions):

In [1]: a = array([0,1,4,76,3,0,4,67,9,5,3,9,0,5,23,3,0,5,3,3,0,5,0])

In [8]: %timeit len(where(a!=0)[0])
100000 loops, best of 3: 6.54 us per loop

In [9]: %timeit (a!=0).sum()
100000 loops, best of 3: 9.81 us per loop

For this small array (23 elements), the where option seems to be faster.
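
As a sanity check that the two expressions timed above really count the same thing, here is a minimal standalone sketch. It assumes NumPy is imported as np (the session above uses the bare names instead), and the variable names mask, count_where and count_sum are only illustrative.

```python
import numpy as np

# Same sample array as in the session above.
a = np.array([0, 1, 4, 76, 3, 0, 4, 67, 9, 5, 3, 9, 0, 5, 23, 3, 0, 5, 3, 3, 0, 5, 0])

mask = a != 0                          # boolean array, True where the entry is non-zero

# where() returns a tuple with one index array per dimension;
# the length of the first index array is the number of non-zero entries.
count_where = len(np.where(mask)[0])

# True counts as 1 and False as 0, so summing the mask gives the same count.
count_sum = int(mask.sum())

print(count_where, count_sum)          # prints: 17 17
```

Both approaches agree; the only open question is which one is cheaper for a given array size.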

Now I create a larger array (12 copies of the first one):
In [13]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a])

In [14]: %timeit len(where(a!=0)[0])
100000 loops, best of 3: 12.3 us per loop

In [15]: %timeit (a!=0).sum()
100000 loops, best of 3: 11 us per loop

Now the fastest way is the sum. As far as I understand, where does not know
in advance the size of the array it returns and has to build an array of
indices, so for a big array there is a lot of extra memory traffic, whereas
sum only reduces the boolean mask to a single number. And the difference
increases fast...

In [20]: a = hstack([a,a,a,a,a,a,a,a,a,a,a,a])

In [21]: %timeit len(where(a!=0)[0])
10000 loops, best of 3: 79.1 us per loop

In [22]: %timeit (a!=0).sum()
10000 loops, best of 3: 24.5 us per loop
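
To re-run this comparison outside IPython, something like the following sketch could be used. It relies on the standard timeit module instead of %timeit. The helper names (count_with_where, count_with_sum, count_with_count_nonzero) are made up for this example. np.count_nonzero is not mentioned anywhere in this thread; it is included on the assumption that a newer NumPy release providing it is installed. Timings are machine dependent, so none are quoted here.

```python
import timeit
import numpy as np

base = np.array([0, 1, 4, 76, 3, 0, 4, 67, 9, 5, 3, 9, 0, 5, 23, 3, 0, 5, 3, 3, 0, 5, 0])

def count_with_where(arr):
    # Builds an index array, then takes its length.
    return len(np.where(arr != 0)[0])

def count_with_sum(arr):
    # Reduces the boolean mask to a single number.
    return int((arr != 0).sum())

def count_with_count_nonzero(arr):
    # Assumption: np.count_nonzero is available (it is not used in the thread).
    return int(np.count_nonzero(arr))

functions = (count_with_where, count_with_sum, count_with_count_nonzero)
results = {}   # array size -> tuple of counts, one per function

a = base
for _ in range(3):                      # sizes 23, 276 and 3312, as in the session
    results[a.size] = tuple(fn(a) for fn in functions)
    for fn in functions:
        # Best of 3 runs of 1000 calls each, reported per call in microseconds.
        best = min(timeit.repeat(lambda: fn(a), number=1000, repeat=3)) / 1000
        print("size %5d  %-26s %8.2f us per call" % (a.size, fn.__name__, best * 1e6))
    a = np.hstack([a] * 12)             # same growth step as hstack([a,a,...,a])
```

All three functions should return the same count for each size; only the time per call is expected to differ.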

Regards,
Jonathan

On Wed, Dec 22, 2010 at 11:43 AM, Thomas K Gamble <tkg at lanl.gov> wrote:

> On Wednesday, December 22, 2010 07:16:17 am Ian Stokes-Rees wrote:
> > What is the most efficient way to do the Matlab equivalent of nnz(M)
> > (nnz = number-of-non-zeros function)?
> >
> > I've tried Google, but no luck.
> >
> > My assumption is that something like
> >
> > a != 0
> >
> > will be used, but I'm not sure then how to "count" the number of "True"
> > entries.
> >
> > TIA.
> >
> > Ian
>
> one possibility:
>
> len(where(a != 0)[0])
>
> --
> Thomas K. Gamble
> Research Technologist, System/Network Administrator
> Chemical Diagnostics and Engineering (C-CDE)
> Los Alamos National Laboratory
> MS-E543,p:505-665-4323 f:505-665-4267
>
> There cannot be a crisis next week. My schedule is already full.
>    Henry Kissinger
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>



-- 
Jonathan Rocher,
Enthought, Inc.
jrocher at enthought.com
1-512-536-1057
http://www.enthought.com