Happy new year everybody!

I've been upgrading my code to start supporting array indexing, and in my tests I found something that is well documented but surprising to me. I've tried to read through https://numpy.org/doc/stable/user/basics.indexing.html#combining-advanced-an... and even after multiple passes, I still find it very terse...

Consider a multi-dimensional dataset:

    import numpy as np
    shape = (10, 20, 30)
    original = np.arange(np.prod(shape)).reshape(shape)

Let's say we want to collapse dim 0 to a single entry, take a subset from dim 1 with a slice, and take 3 elements from dim 2:

    i = 2
    j = slice(1, 6)
    k = slice(7, 10)
    out_basic = original[i, j, k]
    assert out_basic.shape == (5, 3)

Now suppose we want the freedom to pass an arbitrary "array" for k instead of a slice:

    k = [7, 11, 13]
    out_array = original[i, j, k]
    assert out_array.shape == (5, 3), f"shape is actually {out_array.shape}"

    AssertionError: shape is actually (3, 5)

To get the result "Mark expects", one has to do it in two steps:

    integer_types = (int, np.integer)
    integer_indexes = (
        i if isinstance(i, integer_types) else slice(None),
        j if isinstance(j, integer_types) else slice(None),
        k if isinstance(k, integer_types) else slice(None),
    )
    non_integer_indexes = (
        ((i,) if not isinstance(i, integer_types) else ()) +
        ((j,) if not isinstance(j, integer_types) else ()) +
        ((k,) if not isinstance(k, integer_types) else ())
    )
    out_double_indexed = original[integer_indexes][non_integer_indexes]
    assert out_double_indexed.shape == (5, 3), f"shape is actually {out_double_indexed.shape}"

This is quite surprising to me. I totally understand that this kind of indexing won't change in numpy, but is there a way I can adjust my indexing strategy to regain the ability to slice into my array in a "single shot"?

The main use case is arrays that are truly huge, but chunked in ways where slicing into them can be quite efficient. This is multi-dimensional imaging data. Each chunk is quite "huge", so this kind of metadata manipulation is worthwhile to avoid unnecessary IO.

Perhaps there is a "simple" distinction I am missing, for example using a tuple for k instead of a list?

Thanks for your input!

Mark

(I tried to keep my code copy-pastable)
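For readers following along, the (3, 5) shape appears to follow from the rule in the linked NumPy indexing documentation: an integer scalar combined with an array index is itself treated as an advanced index, and when advanced indices are separated by a slice, the broadcast dimension is placed first in the result. The snippet below is only a small sketch of that behaviour, reusing the shapes from the post; the names `separated`, `adjacent` and `expected` are illustrative and not from the original message.

```python
import numpy as np

shape = (10, 20, 30)
original = np.arange(np.prod(shape)).reshape(shape)

k = [7, 11, 13]

# The integer 2 and the list k both count as advanced indices. They are
# separated by the slice 1:6, so the broadcast dimension (length 3) comes first.
separated = original[2, 1:6, k]
print(separated.shape)  # (3, 5)

# When the advanced indices sit next to each other, the broadcast dimension
# stays where the first advanced index was.
adjacent = original[1:6, 2, k]
print(adjacent.shape)  # (5, 3)

# For this particular case, the two-step result is the transpose of the
# single-shot result.
expected = original[2][1:6, k]
print(np.array_equal(separated.T, expected))  # True
```

The transpose trick only works because there is exactly one sliced dimension here; it is not a general fix.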
On Mon, Dec 30, 2024 at 10:28 AM Mark Harfouche via NumPy-Discussion < numpy-discussion@python.org> wrote:
No, there's no simple solution that you're missing. The kind of indexing that you want has been considered in NEP 21 (which called it "orthogonal indexing"), which saw some progress but has largely been left fallow. There's been some movement on the Array API specification side, so that might spur some movement.

https://numpy.org/neps/nep-0021-advanced-indexing.html

Jaime Rio made a pure Python implementation that might work for you (though I'm not sure about the performance for large arrays with big slices), but it's buried in a closed PR (still works, though):

https://github.com/numpy/numpy/pull/5749/files

Note that some other array implementations like Xarray and Zarr provide the `.oindex` property, which implements (roughly) the orthogonal indexing semantics of NEP 21, so those might be options for you.

--
Robert Kern
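As a rough illustration of what orthogonal indexing means for a plain in-memory NumPy array, here is a minimal sketch built on `np.ix_`. The helper name `oindex` is hypothetical (it is not the implementation from the PR above, nor the Xarray or Zarr property), and it assumes exactly one index per dimension. It always copies data and does nothing for chunked or lazy storage, so it only demonstrates the semantics, not the efficient I/O the original question is after.

```python
import numpy as np


def oindex(arr, key):
    """Hypothetical helper: orthogonal (outer) indexing of an in-memory array.

    Each entry of `key` selects along its own axis independently.
    Integer entries drop their axis, as in basic indexing.
    """
    integer_types = (int, np.integer)
    if not isinstance(key, tuple):
        key = (key,)
    if len(key) != arr.ndim:
        raise IndexError("this sketch expects exactly one index per dimension")

    index_arrays = []
    squeeze_axes = []
    for axis, (idx, n) in enumerate(zip(key, arr.shape)):
        if isinstance(idx, integer_types):
            # Keep the axis for now as length 1; it is dropped at the end.
            index_arrays.append(np.array([idx]))
            squeeze_axes.append(axis)
        elif isinstance(idx, slice):
            # Expand the slice into explicit positions along this axis.
            index_arrays.append(np.arange(*idx.indices(n)))
        else:
            index_arrays.append(np.asarray(idx))

    # np.ix_ builds an open mesh so every axis is indexed independently.
    result = arr[np.ix_(*index_arrays)]
    if squeeze_axes:
        result = result.squeeze(axis=tuple(squeeze_axes))
    return result


shape = (10, 20, 30)
original = np.arange(np.prod(shape)).reshape(shape)

out = oindex(original, (2, slice(1, 6), [7, 11, 13]))
print(out.shape)  # (5, 3)
```

Because slices are expanded into index arrays, the result is never a view, which may matter for large arrays with big slices.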
participants (2): Mark Harfouche, Robert Kern