[Numpy-discussion] fast_any_all , a trivial but fast/useful helper function for numpy

Wed Sep 11 09:20:58 EDT 2013

From my previous mail:

>> this has the same performance as your code:
>> a = empty([3] list(A.shape)

For anyone who is interested, I ran a benchmark on the code after Julian kindly provided me with a correction to the listing he posted:

>> a = empty([3] + list(A.shape))
>> a[0] = A>5; a[1] = B<2; a[2] = A>10;
>> np.any(a, 0)
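For readers who want to try the corrected listing directly, here is a minimal runnable sketch of it. The test data, the array shape, and the explicit dtype=bool are my assumptions for illustration (the quoted listing uses a bare empty() with its default dtype); only the preallocate-then-fill pattern comes from Julian's mail.

```python
import numpy as np

# Hypothetical test data; the arrays used in the real benchmark are not shown here.
rng = np.random.default_rng(0)
A = rng.integers(0, 20, size=(4, 5))
B = rng.integers(0, 20, size=(4, 5))

# Preallocate one array with a leading axis of length 3,
# then fill each slot with one of the conditions.
# dtype=bool is an assumption; the quoted listing omits the dtype.
a = np.empty([3] + list(A.shape), dtype=bool)
a[0] = A > 5
a[1] = B < 2
a[2] = A > 10

# Reduce over the leading axis: True where any condition holds.
result = np.any(a, axis=0)

# Cross-check against the element-wise OR of the three conditions.
expected = (A > 5) | (B < 2) | (A > 10)
```

The result has the same shape as A, and should match the plain element-wise OR.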

Julian also suggested trying the idiom "np.vstack([A,B,C])" instead of [A,B,C].
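A small sketch of the vstack idiom follows, assuming 1-D input arrays (the example values are made up). Note that for arrays with more than one dimension, np.vstack concatenates along the existing first axis rather than adding a new one, so the reduction over axis 0 would mean something different there.

```python
import numpy as np

# Made-up 1-D example data.
A = np.array([1, 7, 12, 3])
B = np.array([0, 5, 9, 4])

masks = [A > 5, B < 2, A > 10]

# For 1-D inputs, vstack produces one contiguous (3, n) boolean array,
# so np.any() no longer has to convert a Python list on every call.
stacked = np.vstack(masks)
result = np.any(stacked, axis=0)

print(stacked.shape)  # (3, 4)
print(result)         # [ True  True  True False]
```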

Revised benchmarks here. I've moved the [A>5, B<2, A>10] creation outside the timing loop in all cases, since array creation was distorting the results and shouldn't be part of the any() timing measurement. I'm also now using separate test arrays for each function to avoid the possibility of side effects between tests.
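The timing setup described above can be sketched roughly as follows. This is not the code from test_fast_any_all.py; the array size, repeat count, and variable names are assumptions chosen to keep the example quick, and it only times the two np.any() variants.

```python
import timeit
import numpy as np

# Hypothetical sizes; the real benchmark may use different arrays.
rng = np.random.default_rng(0)
n = 100_000

# Separate input arrays per test, so one test cannot affect another.
A1, B1 = rng.integers(0, 20, size=n), rng.integers(0, 20, size=n)
A2, B2 = A1.copy(), B1.copy()

# Build the condition arrays once, outside the timed region.
masks = [A1 > 5, B1 < 2, A1 > 10]
stacked = np.vstack([A2 > 5, B2 < 2, A2 > 10])

# Only the any() call itself is timed.
t_list = timeit.timeit(lambda: np.any(masks, axis=0), number=20)
t_stack = timeit.timeit(lambda: np.any(stacked, axis=0), number=20)

print("np.any() on list:    %.4fs" % t_list)
print("np.any() on vstack:  %.4fs" % t_stack)
```

The absolute numbers will depend on the numpy version and the machine, so they are not comparable to the figures in this mail.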

The following results are produced consistently: 

np.any(): 2.68s
np.any() with Julian's first idiom above: 0.24s
faa.any() (original version): 0.2s
np.any() with vstack(): 0.14s
faa.any() with vstack(): 0.1s
faa.any() without vstack(): 0.08s
(alternative faa implementations: 0.11-0.12s)

Conclusion:

fast_any_all is roughly 30x faster than numpy.any() in numpy 1.7 (2.68s vs. 0.08s).

fast_any_all takes 43% less time (0.08s vs. 0.14s) than numpy.any() in numpy 1.7 with the vstack() idiom, which I understand to be the basis for the new approach to numpy.any() in the 1.8 development branch.
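As a quick check of the arithmetic, the ratios can be recomputed from the timings listed above. This assumes the comparison uses the fastest faa.any() figure (0.08s, without vstack); the variable names are only for illustration.

```python
# Timings in seconds, copied from the results list above.
t_np_any = 2.68      # np.any() on the plain list
t_np_vstack = 0.14   # np.any() with vstack()
t_faa = 0.08         # faa.any() without vstack (assumed baseline)

# Speedup relative to plain np.any().
speedup_plain = t_np_any / t_faa

# Comparison against np.any() with vstack().
speedup_vstack = t_np_vstack / t_faa
time_saved = 1 - t_faa / t_np_vstack

print("vs np.any():        %.1fx" % speedup_plain)                 # 33.5x
print("vs np.any()+vstack: %.2fx" % speedup_vstack)                # 1.75x
print("time saved:         %.0f%%" % (100 * time_saved))           # 43%
```

So the 43% figure is a reduction in run time, which corresponds to a speedup of about 1.75x.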

I'd be really interested to see the benchmarks under the current 1.8 master branch of numpy. Could someone please try this and send me the resulting BENCHMARK.txt file?

# git clone https://github.com/gbb/numpy-fast-any-all.git
(read the source code to make sure I'm not evil)
# cd numpy-fast-any-all
# python test_fast_any_all.py > BENCHMARK.txt

Incidentally, this is a good example of a 'performance idiom' unexpectedly becoming a 'penalty idiom' when the underlying implementation changes (vstack, in this case).

Thanks for your suggestions, Julian.

Graeme.