Short circuiting the all() and any() methods/functions
It has come to my attention that the all() and any() methods/functions do not short circuit. It takes nearly as much time to call any() on an array which has 1 as the first entry as it does to call it on an array of the same size full of zeros.
The cause of the problem is that all() and any() just call reduce() with the appropriate operator. Is anyone opposed to changing the implementations of these functions so that they short-circuit?
By the way, Python's built-in all() and any() already short-circuit, so it certainly makes sense to make this change.
I'm willing to head this up if there isn't any opposition to it.
Justin Peel
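For reference, the short-circuiting of Python's built-in any() and all() mentioned above can be observed by counting how many items are consumed from an iterable. This is a minimal sketch; the counting() helper and the variable names are made up for illustration and are not part of Python or NumPy.

```python
def counting(values, counter):
    """Yield values one at a time, recording how many were consumed."""
    for value in values:
        counter[0] += 1
        yield value


n = 100_000

# any() stops at the first truthy item.
consumed_any_hit = [0]
any_hit = any(counting([1] + [0] * (n - 1), consumed_any_hit))

# With no truthy item, any() has to look at every element.
consumed_any_miss = [0]
any_miss = any(counting([0] * n, consumed_any_miss))

# all() stops at the first falsy item.
consumed_all_stop = [0]
all_stop = all(counting([0] + [1] * (n - 1), consumed_all_stop))

print(consumed_any_hit[0], consumed_any_miss[0], consumed_all_stop[0])
# prints: 1 100000 1
```

The request in this thread is for the array methods to behave the same way: stop scanning as soon as the result is decided.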
On Mon, Dec 20, 2010 at 1:25 PM, Justin Peel jpscipy@gmail.com wrote:
> It has come to my attention that the all() and any() methods/functions do not short circuit. It takes nearly as much time to call any() on an array which has 1 as the first entry as it does to call it on an array of the same size full of zeros.
> The cause of the problem is that all() and any() just call reduce() with the appropriate operator. Is anyone opposed to changing the implementations of these functions so that they short-circuit?
Recent versions of reduce do short-circuit. What version of numpy are you using?
Chuck
I'm using version 2.0.0.dev8716, which I would think is new enough. Let me show you why I think that no short-circuiting is going on.
Here are two timeit runs from the command line:
$ python -m timeit -s 'import numpy as np; x = np.ones(200000)' 'x.all()'
100 loops, best of 3: 3.87 msec per loop
$ python -m timeit -s 'import numpy as np; x = np.ones(200000); x[0] = 0' 'x.all()'
100 loops, best of 3: 2.76 msec per loop
You can try different array sizes if you like, but the ratio of the times seems to hold pretty well. I would expect the second statement to be much, much faster than the first. Instead, it is only about 29% faster. My guess is that this speedup comes not from short-circuiting but from the logical AND operator being faster when its first argument is 0 (the second argument doesn't need to be checked). What do you think?
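The same comparison can be scripted with the standard timeit module, which makes it easier to vary the array size and dtype. This is a rough sketch that assumes NumPy is installed; best_time() is a hypothetical helper, and the absolute numbers depend on the NumPy version and the machine, so none are shown here.

```python
import timeit

import numpy as np


def best_time(dtype, zero_first, size=200_000, number=20, repeat=3):
    """Return the best per-call time in seconds for x.all()."""
    x = np.ones(size, dtype=dtype)
    if zero_first:
        x[0] = 0  # a short-circuiting all() could stop right here
    timings = timeit.repeat(x.all, number=number, repeat=repeat)
    return min(timings) / number


results = {}
for dtype in (float, bool):
    for zero_first in (False, True):
        results[(dtype.__name__, zero_first)] = best_time(dtype, zero_first)

for (name, zero_first), seconds in results.items():
    print(f"dtype={name:5s} zero_first={zero_first!s:5s} "
          f"{seconds * 1e6:10.2f} usec per call")
```

If all() short-circuits for a given dtype, the zero_first=True time should be orders of magnitude smaller than the zero_first=False time, not just slightly smaller.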
On Mon, Dec 20, 2010 at 2:12 PM, Charles R Harris charlesr.harris@gmail.com wrote:
> Recent versions of reduce do short-circuit. What version of numpy are you using?
> Chuck
On Mon, 2010-12-20 at 15:32 -0700, Justin Peel wrote:
> I'm using version 2.0.0.dev8716, which should be new enough I would think. Let me show you what makes me think that there isn't shortcircuiting going on.
> I'll do two timeit's from the command line:
> $ python -m timeit -s 'import numpy as np; x = np.ones(200000)' 'x.all()'
> 100 loops, best of 3: 3.87 msec per loop
> $ python -m timeit -s 'import numpy as np; x = np.ones(200000); x[0] = 0' 'x.all()'
> 100 loops, best of 3: 2.76 msec per loop
The short-circuit is implemented only for bool arrays.
$ python -m timeit -s 'import numpy as np; x = np.ones(200000, dtype=bool)' 'x.all()'
1000 loops, best of 3: 779 usec per loop
$ python -m timeit -s 'import numpy as np; x = np.ones(200000, dtype=bool); x[0] = 0' 'x.all()'
100000 loops, best of 3: 3.12 usec per loop
It could easily be generalized to all types, though, apart from maybe handling the truth value of NaN correctly.
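One way to get an early exit for any dtype from Python code, without touching the reduce machinery, is to scan the array in blocks and stop at the first block that decides the result. This is a minimal sketch under that assumption, not NumPy's implementation; chunked_all(), chunked_any() and chunk_size are hypothetical names. Each block is tested with the regular all()/any(), so NaN keeps its usual truth value (NaN is nonzero, hence true).

```python
import numpy as np


def chunked_all(arr, chunk_size=4096):
    """Return True if every element is truthy, stopping at the first falsy block."""
    flat = np.asarray(arr).ravel()  # note: copies if arr is not contiguous
    for start in range(0, flat.size, chunk_size):
        if not flat[start:start + chunk_size].all():
            return False
    return True


def chunked_any(arr, chunk_size=4096):
    """Return True if any element is truthy, stopping at the first truthy block."""
    flat = np.asarray(arr).ravel()  # note: copies if arr is not contiguous
    for start in range(0, flat.size, chunk_size):
        if flat[start:start + chunk_size].any():
            return True
    return False


x = np.ones(200_000)
x[0] = 0
print(chunked_all(x))                   # only the first block is examined
print(chunked_any(np.array([np.nan])))  # NaN counts as true
```

The block size trades Python-level loop overhead against how much extra work is done after the deciding element; 4096 is an arbitrary choice here.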
Participants (3): Charles R Harris, Justin Peel, Pauli Virtanen