[Numpy-discussion] Fancy Indexing of Structured Arrays is Slow

Thu May 15 08:31:50 EDT 2014

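As can be seen from the code below (or in the notebook linked beneath), fancy 
indexing of a structured array is about twice as slow as indexing both fields 
independently. Since the independent version performs two indexing operations, 
does that make each structured index roughly 4x slower than a plain one?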

I found that fancy indexing was a bottleneck in my application so I was 
hoping to reduce the overhead by combining the arrays into a structured 
array and only doing one indexing operation. Unfortunately that doubled the 
time that it took!

Is there any reason for this? If not, I'm happy to open an enhancement issue 
on GitHub - just let me know.

Thanks,
Dave

In [32]: nrows, ncols = 365, 10000

In [33]: items = np.rec.fromarrays(randn(2,nrows, ncols), names=['widgets','gadgets'])

In [34]: row_idx = randint(0, nrows, ncols)
    ...: col_idx = np.arange(ncols)

In [35]: %timeit filtered_items = items[row_idx, col_idx]
100 loops, best of 3: 3.45 ms per loop

In [36]: %%timeit 
    ...: widgets = items['widgets'][row_idx, col_idx]
    ...: gadgets = items['gadgets'][row_idx, col_idx]
    ...: 
1000 loops, best of 3: 1.57 ms per loop
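
For anyone who wants to reproduce this outside IPython, here is a minimal 
standalone sketch of the same comparison. It assumes only numpy and the 
standard-library timeit module. The helper names (index_structured, 
index_fields), the fixed seed and the loop counts are illustrative choices 
and are not taken from the notebook. It also checks that both approaches 
return the same values before timing them.

```python
import timeit

import numpy as np

nrows, ncols = 365, 10000
rng = np.random.RandomState(0)  # fixed seed, purely for repeatability

# Two float64 fields packed into one record array, as in the session above.
data = rng.randn(2, nrows, ncols)
items = np.rec.fromarrays([data[0], data[1]], names=['widgets', 'gadgets'])

# One randomly chosen row per column.
row_idx = rng.randint(0, nrows, ncols)
col_idx = np.arange(ncols)


def index_structured():
    """A single fancy-indexing operation on the structured array."""
    return items[row_idx, col_idx]


def index_fields():
    """One fancy-indexing operation per field."""
    widgets = items['widgets'][row_idx, col_idx]
    gadgets = items['gadgets'][row_idx, col_idx]
    return widgets, gadgets


# Sanity check: both approaches select the same elements.
combined = index_structured()
widgets, gadgets = index_fields()
assert np.array_equal(combined['widgets'], widgets)
assert np.array_equal(combined['gadgets'], gadgets)

# Report the best of 3 repeats, 20 calls each, in milliseconds per call.
n_calls = 20
for fn in (index_structured, index_fields):
    best = min(timeit.repeat(fn, number=n_calls, repeat=3)) / n_calls
    print("%s: %.3f ms per loop" % (fn.__name__, best * 1e3))
```

Absolute timings will depend on the machine and the NumPy version, so the 
ratio between the two lines is the number to compare with the figures above.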

http://nbviewer.ipython.org/urls/gist.githubusercontent.com/dhirschfeld/98b9970fb68adf23dfea/raw/10c0f968ea1489f0a24da80d3af30de7106848ac/Slow%20Structured%20Array%20Indexing.ipynb

https://gist.github.com/dhirschfeld/98b9970fb68adf23dfea