A minor clarification on why count_nonzero is faster for boolean arrays

I was just playing with `count_nonzero` and found it to be significantly faster for boolean arrays compared to integer arrays:

>>> a = np.random.randint(0, 2, (100, 5))
>>> a_bool = a.astype(bool)
>>> %timeit np.sum(a)
100000 loops, best of 3: 5.64 µs per loop
>>> %timeit np.count_nonzero(a)
1000000 loops, best of 3: 1.42 µs per loop
>>> %timeit np.count_nonzero(a_bool)
1000000 loops, best of 3: 279 ns per loop

(but why?)

I tried looking into the code and dug my way through to this line <https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa...>. I am unable to dig further.

I know this is probably a trivial question, but was wondering if anyone could provide insight on why this is so?

Thanks
R
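For anyone who wants to repeat the comparison outside IPython, here is a minimal sketch using the standard-library `timeit` module in place of `%timeit`. The helper name `best_time` and the loop counts are my own choices, not part of the original session, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=(100, 5))  # 0/1 integers, same shape as in the session above
a_bool = a.astype(bool)

# On 0/1 data all three calls count the same thing.
assert np.sum(a) == np.count_nonzero(a) == np.count_nonzero(a_bool)


def best_time(func, number=10000, repeat=3):
    """Best per-call time in seconds over `repeat` runs of `number` calls (hypothetical helper)."""
    return min(timeit.repeat(func, number=number, repeat=repeat)) / number


timings = {
    "np.sum(a)": best_time(lambda: np.sum(a)),
    "np.count_nonzero(a)": best_time(lambda: np.count_nonzero(a)),
    "np.count_nonzero(a_bool)": best_time(lambda: np.count_nonzero(a_bool)),
}

for name, seconds in timings.items():
    print(f"{name:26s} {seconds * 1e6:8.3f} µs per call")
```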

I believe this line is the reason:
https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa...

On Thu, Dec 17, 2015 at 11:52 AM, Raghav R V <ragvrv@gmail.com> wrote:

Would it make sense at all to bring that optimization to np.sum()? I know that I have np.sum() all over the place instead of count_nonzero, partly because it is a MATLAB-ism and partly because it is easier to write. I had no clue that there was a performance difference.

Cheers!
Ben Root

On Thu, Dec 17, 2015 at 1:37 PM, CJ Carey <perimosocordiae@gmail.com> wrote:
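As a small illustration of when the two calls are interchangeable (the array names below are made up for the example): on a boolean mask `np.sum` and `np.count_nonzero` give the same count, but on general integer data they answer different questions, so the swap is only safe for boolean or 0/1 input.

```python
import numpy as np

# On a boolean mask, summing and counting non-zeros agree.
mask = np.array([True, False, True, True])
mask_sum = int(np.sum(mask))
mask_count = int(np.count_nonzero(mask))
assert mask_sum == mask_count == 3

# On general integers they differ: sum adds the values,
# count_nonzero counts how many entries are not zero.
values = np.array([0, 2, 3])
values_sum = int(np.sum(values))              # 0 + 2 + 3
values_count = int(np.count_nonzero(values))  # two non-zero entries
assert values_sum == 5
assert values_count == 2
```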

On Thu, Dec 17, 2015 at 7:37 PM, CJ Carey <perimosocordiae@gmail.com> wrote:
I believe this line is the reason:
https://github.com/numpy/numpy/blob/c0e48cfbbdef9cca954b0c4edd0052e1ec8a30aa...
The magic actually happens in count_nonzero_bytes_384, a few lines before that (line 1986).

Jaime

--
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.
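To give a feel for why a byte-oriented routine can be fast, here is a rough Python sketch of one general idea. It is not a transcription of NumPy's C code. It assumes only that a boolean array stores one byte per element, each 0 or 1, so eight elements can be added at once as a single 64-bit word. The function name `count_true_wordwise` is hypothetical.

```python
import numpy as np


def count_true_wordwise(flags):
    """Count True values by adding eight 0/1 bytes at a time as one 64-bit word.

    Illustrative sketch only; the real speed-up lives in NumPy's C code.
    """
    # One byte per boolean element, flattened into a contiguous buffer.
    raw = np.ascontiguousarray(flags, dtype=bool).ravel().view(np.uint8)

    # Pad with zero bytes so the buffer splits evenly into 8-byte words.
    pad = (-raw.size) % 8
    raw = np.concatenate([raw, np.zeros(pad, dtype=np.uint8)])
    words = raw.view(np.uint64)

    total = 0
    # Each byte lane holds 0 or 1, so at most 255 words can be added
    # before a lane could overflow into its neighbour.
    for start in range(0, words.size, 255):
        lanes = int(words[start:start + 255].sum(dtype=np.uint64))
        # Horizontal add: sum the eight byte lanes of the accumulated word.
        total += sum((lanes >> (8 * k)) & 0xFF for k in range(8))
    return total


rng = np.random.default_rng(0)
sample = rng.integers(0, 2, size=(100, 5)).astype(bool)
assert count_true_wordwise(sample) == np.count_nonzero(sample)
```

An integer array has no such guarantee about its bytes, so every element has to be compared against zero individually, which is one plausible reason the boolean path comes out ahead.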

participants (4)
- Benjamin Root
- CJ Carey
- Jaime Fernández del Río
- Raghav R V