From a 1d array, I want two arrays of indexes: the first for elements that satisfy a criterion, and the second for elements that do not. Naturally there are many ways to do this. Is there a preferred way? As a simple example, suppose for array `a` I want np.flatnonzero(a>0) and np.flatnonzero(a<=0). Can I get them both in one go? Thanks, Alan Isaac
On Sat, Apr 12, 2014 at 4:47 PM, Alan G Isaac wrote:
As a simple example, suppose for array `a` I want np.flatnonzero(a>0) and np.flatnonzero(a<=0). Can I get them both in one go?
I don't think you can do better than

    x = a > 0
    p, q = np.flatnonzero(x), np.flatnonzero(~x)
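For anyone following along, here is a minimal runnable sketch of that suggestion. The sample array `a` is made up for illustration and is not from the thread.

```python
import numpy as np

# Hypothetical sample data, not from the thread.
a = np.array([3, -1, 0, 7, -5, 2])

# Evaluate the criterion once and reuse the boolean mask.
x = a > 0

# Indexes where the criterion holds, and where it does not.
p = np.flatnonzero(x)
q = np.flatnonzero(~x)

print(p)  # [0 3 5]
print(q)  # [1 2 4]
```

The criterion is evaluated only once; the two index arrays are still built by two separate calls.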
On Sat, 2014-04-12 at 16:47 -0400, Alan G Isaac wrote:
From a 1d array, I want two arrays of indexes: the first for elements that satisfy a criterion, and the second for elements that do not. Naturally there are many ways to do this. Is there a preferred way?
As a simple example, suppose for array `a` I want np.flatnonzero(a>0) and np.flatnonzero(a<=0). Can I get them both in one go?
Might be missing something, but I don't think there is a way to do it in one go. The result is irregularly structured and there are few functions like nonzero which give something like that. - Sebastian
Thanks, Alan Isaac
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Sat, Apr 12, 2014 at 5:03 PM, Sebastian Berg wrote:
As a simple example, suppose for array `a` I want np.flatnonzero(a>0) and np.flatnonzero(a<=0). Can I get them both in one go?
Might be missing something, but I don't think there is a way to do it in one go. The result is irregularly structured and there are few functions like nonzero which give something like that.
The "set routines" [1] are in this category and may help you deal with partitions, but I would recommend using boolean arrays instead. If you commonly deal with both a subset and a complement, set representation does not give you a memory advantage over a boolean mask.

[1] http://docs.scipy.org/doc/numpy/reference/routines.set.html
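As a rough illustration of the two representations, the sketch below computes a complement once with a set routine (`np.setdiff1d`) on index arrays and once by negating a boolean mask. The sample array is hypothetical.

```python
import numpy as np

# Hypothetical sample data, not from the thread.
a = np.array([3, -1, 0, 7, -5, 2])

# Index ("set") representation: the complement needs the full index range.
p = np.flatnonzero(a > 0)
q_from_sets = np.setdiff1d(np.arange(a.size), p)

# Boolean-mask representation: the complement is just a negation.
mask = a > 0
q_from_mask = np.flatnonzero(~mask)

print(q_from_sets)  # [1 2 4]
print(q_from_mask)  # [1 2 4]
```

Both give the same complement; with a mask the negation is a single elementwise operation, which is part of why the mask tends to be more convenient for partitions.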
On 4/12/2014 5:20 PM, Alexander Belopolsky wrote:
The "set routines" [1] are in this category and may help you deal with partitions, but I would recommend using boolean arrays instead. If you commonly deal with both a subset and a complement, set representation does not give you a memory advantage over a boolean mask.
I take it that by a lack of a memory advantage you mean because boolean arrays are 8-bit representations. That makes sense. I find it rather more convenient to use boolean arrays, but I wonder if arrays of indexes might have other advantages (which would suggest using the set operations instead). In particular, might a[boolean_array] be slower than a[indexes]? (I'm just asking, not suggesting.) Thanks! Alan
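A small sketch of the memory point, under the assumption that both the subset and its complement are stored: a boolean mask costs one byte per element, while the two index arrays together hold one integer per element. The index itemsize is platform-dependent (typically 8 bytes on 64-bit systems); the array size and selection pattern here are arbitrary.

```python
import numpy as np

n = 1000                      # arbitrary size for illustration
mask = np.zeros(n, dtype=bool)
mask[::2] = True              # arbitrary selection: every other element

p = np.flatnonzero(mask)      # indexes of the subset
q = np.flatnonzero(~mask)     # indexes of the complement

print(mask.nbytes)            # one byte per element
print(p.nbytes + q.nbytes)    # n times the itemsize of the index dtype
print(p.dtype.itemsize)       # platform-dependent index size in bytes
```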
On 14 April 2014 18:17, Alan G Isaac wrote:
I find it rather more convenient to use boolean arrays, but I wonder if arrays of indexes might have other advantages (which would suggest using the set operations instead). In particular, might a[boolean_array] be slower than a[indexes]? (I'm just asking, not suggesting.)
Indexing with integer indexes is generally faster, but converting from a boolean mask to indexes makes it more expensive overall:

    In [2]: arr = np.random.random(1000)

    In [3]: mask = arr > 0.7

    In [4]: mask.sum()
    Out[4]: 290

    In [5]: %timeit arr[mask]
    100000 loops, best of 3: 4.01 µs per loop

    In [6]: %%timeit
       ...: wh = np.where(mask)
       ...: arr[wh]
       ...:
    100000 loops, best of 3: 6.47 µs per loop

    In [8]: wh = np.where(mask)

    In [9]: %timeit arr[wh]
    100000 loops, best of 3: 2.57 µs per loop

    In [10]: %timeit np.where(mask)
    100000 loops, best of 3: 3.89 µs per loop

    In [14]: np.all(arr[wh] == arr[mask])
    Out[14]: True

If you want to apply the same mask to several arrays, the conversion is then worth doing (performance-wise).

/David.
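The reuse case mentioned above might look like the following sketch: convert the mask to indexes once, then apply those indexes to several arrays. The extra arrays in `others` and the random seed are hypothetical, and no timings are claimed here since they depend on the machine and the NumPy version.

```python
import numpy as np

rng = np.random.RandomState(0)          # seed chosen arbitrarily
arr = rng.random_sample(1000)
# Hypothetical additional arrays that share the same mask.
others = [rng.random_sample(1000) for _ in range(3)]

mask = arr > 0.7

# Pay the boolean-to-index conversion once...
idx = np.flatnonzero(mask)

# ...then reuse the integer indexes for every array.
selected = [b[idx] for b in others]

# Same result as indexing with the mask each time.
assert all(np.array_equal(b[idx], b[mask]) for b in others)
```

Whether this pays off in practice is best checked with `%timeit` on the actual array sizes involved.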
participants (4)
- Alan G Isaac
- Alexander Belopolsky
- Daπid
- Sebastian Berg