
What is a simple, efficient way to determine if all elements in an array (in my case, 1D) are equal? How about close?

On Mon, Mar 5, 2012 at 11:14 AM, Neal Becker <ndbecker2@gmail.com> wrote:
What is a simple, efficient way to determine if all elements in an array (in my case, 1D) are equal? How about close?
For the exactly equal case, how about:
    I[1] a = np.array([1,1,1,1])
    I[2] np.unique(a).size
    O[2] 1 # All equal
    I[3] a = np.array([1,1,1,2])
    I[4] np.unique(a).size
    O[4] 2 # All not equal

Keith Goodman wrote:
For the exactly equal case, how about:
    I[1] a = np.array([1,1,1,1])
    I[2] np.unique(a).size
    O[2] 1 # All equal
    I[3] a = np.array([1,1,1,2])
    I[4] np.unique(a).size
    O[4] 2 # All not equal
I considered this - just not sure if it's the most efficient

On Mon, Mar 5, 2012 at 11:24 AM, Neal Becker <ndbecker2@gmail.com> wrote:
I considered this - just not sure if it's the most efficient
Yeah, it is slow:
    I[1] a = np.ones(100000)
    I[2] timeit np.unique(a).size
    1000 loops, best of 3: 1.56 ms per loop
    I[3] timeit (a == a[0]).all()
    1000 loops, best of 3: 203 us per loop
I think all() short-circuits for bool arrays:
    I[4] a[1] = 9
    I[5] timeit (a == a[0]).all()
    10000 loops, best of 3: 89 us per loop
You could avoid making the bool array by writing a function in cython. It could grab the first array element and then return False as soon as it finds an element that is not equal to it. And you could check for closeness.
Or:
    I[8] np.allclose(a, a[0])
    O[8] False
    I[9] a = np.ones(100000)
    I[10] np.allclose(a, a[0])
    O[10] True
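
A minimal sketch of the three NumPy-only checks mentioned so far, wrapped in functions so they can be compared side by side. The function names are made up for illustration and are not part of NumPy, and all three assume a non-empty 1D array.

```python
import numpy as np


def all_equal_unique(a):
    # Sorts the whole array internally, which is why it is the slow option.
    return bool(np.unique(a).size == 1)


def all_equal_first(a):
    # Builds a temporary bool array, then reduces it with all().
    return bool((a == a[0]).all())


def all_close_first(a, rtol=1e-5, atol=1e-8):
    # Tolerance-based variant; rtol/atol repeat np.allclose's defaults.
    return bool(np.allclose(a, a[0], rtol=rtol, atol=atol))


same = np.ones(100000)
mixed = same.copy()
mixed[1] = 9  # a single differing element

for name, func in [("unique", all_equal_unique),
                   ("first", all_equal_first),
                   ("allclose", all_close_first)]:
    print(name, func(same), func(mixed))
```

All three agree on these two inputs; they differ in cost and in how they treat corner cases such as empty arrays.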

On Mon, Mar 5, 2012 at 1:29 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
    I[8] np.allclose(a, a[0])
    O[8] False
    I[9] a = np.ones(100000)
    I[10] np.allclose(a, a[0])
    O[10] True
One disadvantage of using a[0] as a proxy is that the result depends on the ordering of a. An alternative that avoids this is (a.max() - a.min()) < epsilon. Another good use case for a minmax func.
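
A small sketch of the ordering issue, assuming an absolute tolerance only (rtol set to 0 so the comparison is easy to follow). The helper names and the value of eps are illustrative choices, not something from the thread.

```python
import numpy as np


def close_to_first(a, eps):
    # Compares every element against a[0] using an absolute tolerance only.
    return bool(np.allclose(a, a[0], rtol=0.0, atol=eps))


def close_by_range(a, eps):
    # Order-independent: only the overall spread of the values matters.
    return bool((a.max() - a.min()) < eps)


x = np.array([0.0, 1.0, 2.0])
y = np.array([1.0, 0.0, 2.0])  # same values as x, different order
eps = 1.5

first_x = close_to_first(x, eps)  # 2.0 is farther than eps from a[0] == 0.0
first_y = close_to_first(y, eps)  # every element is within eps of a[0] == 1.0
range_x = close_by_range(x, eps)
range_y = close_by_range(y, eps)
print(first_x, first_y, range_x, range_y)
```

The a[0]-based check changes its answer when the same values are reordered, while the range-based check gives the same answer for both arrays.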

On March 5, 2012 at 14:29, Keith Goodman <kwgoodman@gmail.com> wrote:
Yeah, it is slow:
    I[1] a = np.ones(100000)
    I[2] timeit np.unique(a).size
    1000 loops, best of 3: 1.56 ms per loop
    I[3] timeit (a == a[0]).all()
    1000 loops, best of 3: 203 us per loop
I think all() short-circuits for bool arrays:
    I[4] a[1] = 9
    I[5] timeit (a == a[0]).all()
    10000 loops, best of 3: 89 us per loop
You could avoid making the bool array by writing a function in cython. It could grab the first array element and then return False as soon as it finds an element that is not equal to it. And you could check for closeness.
Or:
    I[8] np.allclose(a, a[0])
    O[8] False
    I[9] a = np.ones(100000)
    I[10] np.allclose(a, a[0])
    O[10] True
Looks like the following is even faster:
    np.max(a) == np.min(a)
-=- Olivier

On Mon, Mar 5, 2012 at 2:33 PM, Olivier Delalleau <shish@keba.be> wrote:
Looks like the following is even faster: np.max(a) == np.min(a)
How about numpy.ptp, to follow this line? I would expect it to be single-pass, but it wouldn't short-circuit the way Keith's cython version would.
Josef
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

On Mon, Mar 5, 2012 at 11:36 AM, <josef.pktd@gmail.com> wrote:
How about numpy.ptp, to follow this line? I would expect it to be single-pass, but it wouldn't short-circuit the way Keith's cython version would.
    I[1] a = np.ones(100000)
    I[2] timeit (a == a[0]).all()
    1000 loops, best of 3: 203 us per loop
    I[3] timeit a.min() == a.max()
    10000 loops, best of 3: 106 us per loop
    I[4] timeit np.ptp(a)
    10000 loops, best of 3: 106 us per loop
    I[5] a[1] = 9
    I[6] timeit (a == a[0]).all()
    10000 loops, best of 3: 89.7 us per loop
    I[7] timeit a.min() == a.max()
    10000 loops, best of 3: 102 us per loop
    I[8] timeit np.ptp(a)
    10000 loops, best of 3: 103 us per loop
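
For anyone who wants to rerun these timings outside IPython, here is a rough sketch using the standard-library timeit module. The helper best_time and the repeat/number settings are arbitrary choices for illustration, and the numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

a = np.ones(100000)

# The three candidate checks from the session above.
candidates = {
    "(a == a[0]).all()": lambda: (a == a[0]).all(),
    "a.min() == a.max()": lambda: a.min() == a.max(),
    "np.ptp(a) == 0": lambda: np.ptp(a) == 0,
}


def best_time(func, repeat=3, number=100):
    # Best of `repeat` runs, reported as average seconds per call.
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


timings = {label: best_time(func) for label, func in candidates.items()}
for label, seconds in timings.items():
    print(f"{label}: {seconds * 1e6:.1f} us per loop")
```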

On Mon, Mar 5, 2012 at 1:44 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
    I[1] a = np.ones(100000)
    I[2] timeit (a == a[0]).all()
    1000 loops, best of 3: 203 us per loop
    I[3] timeit a.min() == a.max()
    10000 loops, best of 3: 106 us per loop
    I[4] timeit np.ptp(a)
    10000 loops, best of 3: 106 us per loop
    I[5] a[1] = 9
    I[6] timeit (a == a[0]).all()
    10000 loops, best of 3: 89.7 us per loop
    I[7] timeit a.min() == a.max()
    10000 loops, best of 3: 102 us per loop
    I[8] timeit np.ptp(a)
    10000 loops, best of 3: 103 us per loop
Another issue to watch out for is if the array is empty. Technically speaking, that should be True, but some of the solutions offered so far would fail in this case.
Ben Root
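
A sketch of one way to handle the empty case explicitly. The wrapper all_equal and its empty_result parameter are hypothetical names; returning True for an empty array follows the "vacuously true" convention suggested above, but the choice is left to the caller.

```python
import numpy as np


def all_equal(a, empty_result=True):
    a = np.asarray(a)
    if a.size == 0:
        # Nothing to compare, so return the caller's chosen convention.
        return empty_result
    return bool((a == a.flat[0]).all())


empty = np.array([])
try:
    empty.min() == empty.max()
    minmax_ok = True
except ValueError:
    # min/max have no identity value, so NumPy refuses to reduce a
    # zero-size array.
    minmax_ok = False

print(all_equal(empty), minmax_ok)
```

Indexing a[0] on an empty array raises IndexError for the same underlying reason, which is why the size check comes first.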

On Mon, Mar 5, 2012 at 11:52 AM, Benjamin Root <ben.root@ou.edu> wrote:
Another issue to watch out for is if the array is empty. Technically speaking, that should be True, but some of the solutions offered so far would fail in this case.
Good point.
For fun, here's the speed of a simple cython allequal:
    I[2] a = np.ones(100000)
    I[3] timeit a.min() == a.max()
    10000 loops, best of 3: 106 us per loop
    I[4] timeit allequal(a)
    10000 loops, best of 3: 68.9 us per loop
    I[5] a[1] = 9
    I[6] timeit a.min() == a.max()
    10000 loops, best of 3: 102 us per loop
    I[7] timeit allequal(a)
    1000000 loops, best of 3: 269 ns per loop
where
    # Imports needed to compile this snippet
    import numpy as np
    cimport numpy as np
    cimport cython

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def allequal(np.ndarray[np.float64_t, ndim=1] a):
        cdef:
            np.float64_t a0
            Py_ssize_t i, n=a.size
        a0 = a[0]
        # Return as soon as one element differs from the first
        for i in range(n):
            if a[i] != a0:
                return False
        return True

Keith Goodman wrote:
Good point.
For fun, here's the speed of a simple cython allequal:
    I[2] a = np.ones(100000)
    I[3] timeit a.min() == a.max()
    10000 loops, best of 3: 106 us per loop
    I[4] timeit allequal(a)
    10000 loops, best of 3: 68.9 us per loop
    I[5] a[1] = 9
    I[6] timeit a.min() == a.max()
    10000 loops, best of 3: 102 us per loop
    I[7] timeit allequal(a)
    1000000 loops, best of 3: 269 ns per loop
where
    @cython.boundscheck(False)
    @cython.wraparound(False)
    def allequal(np.ndarray[np.float64_t, ndim=1] a):
        cdef:
            np.float64_t a0
            Py_ssize_t i, n=a.size
        a0 = a[0]
        for i in range(n):
            if a[i] != a0:
                return False
        return True
But doesn't this one fail on an empty array?

On Mon, Mar 5, 2012 at 12:06 PM, Neal Becker <ndbecker2@gmail.com> wrote:
But doesn't this one fail on an empty array?
Yes. I'm optimizing for fun, not for corner cases. This should work for size zero and NaNs:
    # Imports needed to compile this snippet
    import numpy as np
    cimport numpy as np
    cimport cython

    @cython.boundscheck(False)
    @cython.wraparound(False)
    def allequal(np.ndarray[np.float64_t, ndim=1] a):
        cdef:
            np.float64_t a0
            Py_ssize_t i, n=a.size
        if n == 0:
            return False # Or would you like True?
        a0 = a[0]
        # NaN != NaN, so an array containing NaN returns False
        for i in range(n):
            if a[i] != a0:
                return False
        return True
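
For readers without a Cython toolchain, a similar early exit can be approximated in plain NumPy by comparing one block at a time. This is only a sketch, not the Cython function above: the name all_equal_chunked, the chunk size, and the empty_result default are assumptions, and it still allocates a small bool array per block.

```python
import numpy as np


def all_equal_chunked(a, chunk=4096, empty_result=True):
    # Assumes a 1D array, as in the original question.
    a = np.asarray(a)
    if a.size == 0:
        return empty_result  # convention chosen by the caller
    first = a[0]
    for start in range(0, a.size, chunk):
        # Stop at the first block that contains a mismatch.
        if not (a[start:start + chunk] == first).all():
            return False
    return True


a = np.ones(100000)
print(all_equal_chunked(a))
a[1] = 9
print(all_equal_chunked(a))  # exits after inspecting the first block
```

Like the Cython loop, it returns False for arrays containing NaN, because NaN never compares equal to itself.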

On Mon, Mar 5, 2012 at 12:12 PM, Keith Goodman <kwgoodman@gmail.com> wrote:
Yes. I'm optimizing for fun, not for corner cases. This should work for size zero and NaNs:
    @cython.boundscheck(False)
    @cython.wraparound(False)
    def allequal(np.ndarray[np.float64_t, ndim=1] a):
        cdef:
            np.float64_t a0
            Py_ssize_t i, n=a.size
        if n == 0:
            return False # Or would you like True?
        a0 = a[0]
        for i in range(n):
            if a[i] != a0:
                return False
        return True
Sorry for all the posts. I'll go back to being quiet.
Seems like np.allclose returns True for empty arrays:
    I[2] a = np.array([])
    I[3] np.allclose(np.array([]), np.array([]))
    O[3] True
The original allequal cython code did the same:
    I[4] allequal(a)
    O[4] True

Another issue to watch out for is if the array is empty. Technically speaking, that should be True, but some of the solutions offered so far would fail in this case.
Similarly, NaNs or Infs could cause problems: they should signal as False, but several of the solutions would return True.
~Brett
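
A quick sketch for checking how the suggestions so far behave on all-NaN and all-Inf input. The dictionary labels are just for display, and the results reflect NumPy's default comparison rules (NaN never equals NaN, and np.allclose leaves equal_nan off by default).

```python
import numpy as np

nans = np.array([np.nan, np.nan])
infs = np.array([np.inf, np.inf])

# inf - inf inside np.ptp yields NaN and would emit a RuntimeWarning.
with np.errstate(invalid="ignore"):
    results = {
        "first, nan": bool((nans == nans[0]).all()),
        "first, inf": bool((infs == infs[0]).all()),
        "minmax, nan": bool(nans.min() == nans.max()),
        "minmax, inf": bool(infs.min() == infs.max()),
        "ptp, inf": bool(np.ptp(infs) == 0),
        "allclose, nan": bool(np.allclose(nans, nans[0])),
        "allclose, inf": bool(np.allclose(infs, infs[0])),
    }

for label, value in results.items():
    print(f"{label}: {value}")
```

On this input the NaN cases come out False, while the Inf cases come out True for every check except the ptp-based one, so whether Inf should count as "equal" needs an explicit decision (for example, a prior np.isfinite(a).all() test).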

How about the following?
    exact: numpy.all(a == a[0])
    inexact: numpy.allclose(a, a[0])
On Mar 5, 2012, at 2:19 PM, Keith Goodman wrote:
For the exactly equal case, how about:
    I[1] a = np.array([1,1,1,1])
    I[2] np.unique(a).size
    O[2] 1 # All equal
    I[3] a = np.array([1,1,1,2])
    I[4] np.unique(a).size
    O[4] 2 # All not equal
participants (8)
- Benjamin Root
- Brett Olsen
- John Hunter
- josef.pktd@gmail.com
- Keith Goodman
- Neal Becker
- Olivier Delalleau
- Zachary Pincus