Implementing a "find first" style function
The ticket https://github.com/numpy/numpy/issues/2269 discusses the possibility of implementing a "find first" style function which can optimise the process of finding the first value(s) which match a predicate in a given 1D array. For example:

a = np.sin(np.linspace(0, np.pi, 200))
print find_first(a, lambda a: a > 0.9)
((71,), 0.900479032457)

This has been discussed in several locations:

https://github.com/numpy/numpy/issues/2269
https://github.com/numpy/numpy/issues/2333
http://stackoverflow.com/questions/7632963/numpy-array-how-to-find-index-of-...

*Rationale*

For small arrays there is no real reason to avoid doing:

a = np.sin(np.linspace(0, np.pi, 200))
ind = (a > 0.9).nonzero()[0][0]
print (ind,), a[ind]
(71,) 0.900479032457

But for larger arrays this can lead to a massive amount of work, even if the result is one of the first values to be computed. Example:

a = np.arange(1e8)
print (a == 5).nonzero()[0][0]
5

So a function which terminates as soon as the first matching value is found is desirable.

As mentioned in #2269, it is possible to define a consistent ordering which allows this functionality for >1D arrays, but IMHO it overcomplicates the problem and was not a case that I personally needed, so I've limited the scope to 1D arrays only.

*Implementation*

My initial assumption was that to get any kind of performance I would need to write the *find* function in C; however, after prototyping with some array chunking it became apparent that a trivial Python function would be quick enough for my needs.

The approach I've implemented in the code found in #2269 simply breaks the array into sub-arrays of maximum length *chunk_size* (2048 by default, though there is no real science to this number), applies the given predicate function, and yields the results from *nonzero()*. The given function should be a Python function which operates on the whole of the sub-array element-wise (i.e. the function should be vectorized). Returning a generator also has the benefit of allowing users to get the first *n* matching values/indices.
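To make that concrete, here is a minimal sketch of such a chunked generator. This is an illustration of the approach described above, not the exact code attached to #2269:

```python
import numpy as np

def find(a, predicate, chunk_size=2048):
    """Walk a 1D array in chunks, yielding an (index_tuple, value) pair
    for each element matching the vectorized predicate."""
    for start in range(0, len(a), chunk_size):
        chunk = a[start:start + chunk_size]
        # The predicate operates element-wise on the whole sub-array;
        # nonzero() gives the in-chunk indices of the matches.
        for i in predicate(chunk).nonzero()[0]:
            yield ((start + int(i),), chunk[i])

a = np.sin(np.linspace(0, np.pi, 200))
print(next(find(a, lambda x: x > 0.9)))  # first match: index 71, value ~0.90048
```

Because it returns a generator, `next()` retrieves just the first match and stops scanning, and `itertools.islice` would give the first *n* matches.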
*Results*

I timed the implementation of *find* found in my comment at https://github.com/numpy/numpy/issues/2269#issuecomment-14436725 with an obvious test:

In [1]: from np_utils import find
In [2]: import numpy as np
In [3]: import numpy.random
In [4]: np.random.seed(1)
In [5]: a = np.random.randn(1e8)
In [6]: a.min(), a.max()
Out[6]: (-6.1194900990552776, 5.9632246301166321)
In [7]: next(find(a, lambda a: np.abs(a) > 6))
Out[7]: ((33105441,), -6.1194900990552776)
In [8]: (np.abs(a) > 6).nonzero()
Out[8]: (array([33105441]),)
In [9]: %timeit (np.abs(a) > 6).nonzero()
1 loops, best of 3: 1.51 s per loop
In [10]: %timeit next(find(a, lambda a: np.abs(a) > 6))
1 loops, best of 3: 912 ms per loop
In [11]: %timeit next(find(a, lambda a: np.abs(a) > 6, chunk_size=100000))
1 loops, best of 3: 470 ms per loop
In [12]: %timeit next(find(a, lambda a: np.abs(a) > 6, chunk_size=1000000))
1 loops, best of 3: 483 ms per loop

This shows that picking a sensible *chunk_size* can yield massive speed-ups (nonzero is 3x slower in one case). A similar example with a much smaller 1D array shows similar promise:

In [41]: a = np.random.randn(1e4)
In [42]: %timeit next(find(a, lambda a: np.abs(a) > 3))
10000 loops, best of 3: 35.8 us per loop
In [43]: %timeit (np.abs(a) > 3).nonzero()
10000 loops, best of 3: 148 us per loop

As I commented on the issue tracker, if you think this function is worth taking forward, I'd be happy to open up a pull request.

Feedback gratefully received.

Cheers,
Phil
On Tue, Mar 5, 2013 at 9:15 AM, Phil Elson wrote:
[snip]
In the interest of generalising code and such, could such approaches be used for functions like np.any() and np.all(), short-circuiting as soon as a True or False (respectively) is found? I wonder what other sorts of functions in NumPy might benefit from this?

Ben Root
Interesting. I hadn't thought of those. I've implemented (very roughly, without a sound logic check) and benchmarked:
def my_any(a, predicate, chunk_size=2048):
    try:
        next(find(a, predicate, chunk_size))
        return True
    except StopIteration:
        return False

def my_all(a, predicate, chunk_size=2048):
    return not my_any(a, lambda a: ~predicate(a), chunk_size)
With the following setup:
import numpy as np
import numpy.random
np.random.seed(1)
a = np.random.randn(1e8)
For a low probability *any*:
In [12]: %timeit (np.abs(a) > 6).any()
1 loops, best of 3: 1.29 s per loop
In [13]: %timeit my_any(a, lambda a: np.abs(a) > 6)
1 loops, best of 3: 792 ms per loop
In [14]: %timeit my_any(a, lambda a: np.abs(a) > 6, chunk_size=10000)
1 loops, best of 3: 654 ms per loop
For a False *any*:
In [16]: %timeit (np.abs(a) > 7).any()
1 loops, best of 3: 1.22 s per loop
In [17]: %timeit my_any(a, lambda a: np.abs(a) > 7)
1 loops, best of 3: 2.4 s per loop
For a high probability *any*:
In [28]: %timeit (np.abs(a) > 1).any()
1 loops, best of 3: 972 ms per loop
In [27]: %timeit my_any(a, lambda a: np.abs(a) > 1)
10000 loops, best of 3: 67 us per loop
---------------
For a low probability *all*:
In [18]: %timeit (np.abs(a) < 6).all()
1 loops, best of 3: 1.16 s per loop
In [19]: %timeit my_all(a, lambda a: np.abs(a) < 6)
1 loops, best of 3: 880 ms per loop
In [20]: %timeit my_all(a, lambda a: np.abs(a) < 6, chunk_size=10000)
1 loops, best of 3: 706 ms per loop
For a True *all*:
In [22]: %timeit (np.abs(a) < 7).all()
1 loops, best of 3: 1.47 s per loop
In [23]: %timeit my_all(a, lambda a: np.abs(a) < 7)
1 loops, best of 3: 2.65 s per loop
For a high probability *all*:
In [25]: %timeit (np.abs(a) < 1).all()
1 loops, best of 3: 978 ms per loop
In [26]: %timeit my_all(a, lambda a: np.abs(a) < 1)
10000 loops, best of 3: 73.6 us per loop
On 6 March 2013 21:16, Benjamin Root wrote:
[snip]

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
Bump.

I'd be interested to know whether this is a desirable feature for numpy (specifically the 1D "find" functionality rather than the "any"/"all" also discussed).

If so, I'd be more than happy to submit a PR, but I don't want to put in the effort if the principle isn't desirable in the core of numpy.
Cheers,
On 8 March 2013 17:38, Phil Elson wrote:
[snip]
On Tue, Mar 26, 2013 at 9:20 AM, Phil Elson wrote:
Bump.
I'd be interested to know if this is a desirable feature for numpy? (specifically the 1D "find" functionality rather than the "any"/"all" also discussed) If so, I'd be more than happy to submit a PR, but I don't want to put in the effort if the principle isn't desirable in the core of numpy.
I don't think anyone has a strong opinion either way :-). It seems like a fairly general interface that people might find useful, so I don't see an immediate objection to including it in principle. It would help to see the actual numbers from a tuned version though, to know how much benefit there is to get...

-n
I've specifically not tuned it, primarily because to get the best tuning you need to know the likelihood of finding a match. One option would be to allow users to specify a "probability" parameter which would chunk the array into size*probability chunks; an additional parameter could then be exposed to limit the maximum chunk size, giving the user control of the maximum memory overhead the routine could use.

I'll submit a PR and we can discuss inline.

Thanks for the response Nathaniel.
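One possible reading of that tuning idea, sketched with hypothetical names (nothing like this exists in the posted code): pick a chunk of roughly 1/probability elements, so that each chunk is expected to contain about one match, and cap the chunk length to bound the memory used by the temporary boolean arrays.

```python
def suggested_chunk_size(probability, max_chunk=1000000):
    """Chunk length of ~1/probability elements (about one expected
    match per chunk), capped at max_chunk to bound memory overhead.
    Hypothetical helper; parameter names are illustrative only."""
    if probability <= 0:
        # No idea how rare matches are: fall back to the largest chunk.
        return max_chunk
    return max(1, min(int(round(1.0 / probability)), max_chunk))
```

For example, a match probability of 0.001 would suggest chunks of 1000 elements, while a very rare match (1e-8) would be capped at max_chunk.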
On 27 March 2013 12:19, Nathaniel Smith wrote:
[snip]
participants (3)
- Benjamin Root
- Nathaniel Smith
- Phil Elson