[Numpy-discussion] index partition

Tue Apr 15 04:34:48 EDT 2014

On 14 April 2014 18:17, Alan G Isaac <alan.isaac at gmail.com> wrote:

> I find it rather more convenient to use boolean arrays,
> but I wonder if arrays of indexes might have other
> advantages (which would suggest using the set operations
> instead). In particular, might a[boolean_array] be slower
> than a[indexes]?  (I'm just asking, not suggesting.)

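Indexing with an array of indexes is generally faster than indexing with a
boolean mask, but converting the mask to indexes has its own cost, which makes
the combined operation more expensive: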

In [2]: arr = np.random.random(1000)

In [3]: mask = arr>0.7

In [4]: mask.sum()
Out[4]: 290

In [5]: %timeit arr[mask]
100000 loops, best of 3: 4.01 µs per loop

In [6]: %%timeit
   ...: wh = np.where(mask)
   ...: arr[wh]
   ...:
100000 loops, best of 3: 6.47 µs per loop

In [8]: wh = np.where(mask)

In [9]: %timeit arr[wh]
100000 loops, best of 3: 2.57 µs per loop

In [10]: %timeit np.where(mask)
100000 loops, best of 3: 3.89 µs per loop

In [14]: np.all(arr[wh] == arr[mask])
Out[14]: True
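
For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The helper time_us, the seed, and the loop counts are my own choices for illustration, and the absolute numbers will differ by machine and NumPy version:

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)      # fixed seed, chosen only for repeatability
arr = rng.random(1000)
mask = arr > 0.7
wh = np.where(mask)                 # tuple holding one array of integer indexes


def time_us(func, number=10000, repeat=3):
    """Best-of-`repeat` time per call, in microseconds (hypothetical helper)."""
    best = min(timeit.repeat(func, number=number, repeat=repeat))
    return best / number * 1e6


timings = {
    "arr[mask]": time_us(lambda: arr[mask]),
    "np.where(mask)": time_us(lambda: np.where(mask)),
    "arr[wh]": time_us(lambda: arr[wh]),
    "np.where(mask) + arr[wh]": time_us(lambda: arr[np.where(mask)]),
}

for label, micros in timings.items():
    print(f"{label:26s} {micros:8.2f} us per call")
```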

If you want to apply the same mask to several arrays, it is then worth
(performance-wise) converting it to indexes once and reusing them.
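
A minimal sketch of that reuse pattern, assuming three arrays of the same length (the names a, b, c and the seed are made up for the example). np.flatnonzero(mask) gives the same indexes as np.where(mask)[0] for a 1-D mask:

```python
import numpy as np

rng = np.random.default_rng(0)      # illustrative seed
a = rng.random(1000)
b = rng.random(1000)
c = rng.random(1000)

mask = a > 0.7

# Pay the boolean-to-index conversion once...
idx = np.flatnonzero(mask)

# ...then reuse the integer indexes for every array that shares the mask.
selected = [x[idx] for x in (a, b, c)]
```

Each entry of selected should match what x[mask] would return for the corresponding array; only the conversion cost is shared.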

/David.