Create a method to index N-dim tensors using 1D index #23992

### Proposed new feature or change:

I work with geospatial data that requires tensors with many dimensions. One challenge I used to have was implementing a `__getitem__` that accesses those tensors using a 1D index. One use case is machine learning, where one needs to feed models sequentially with sub-tensors of the original one.

A simple example is a matrix of shape 2x2 that has the positions `(0, 0)`, `(0, 1)`, `(1, 0)`, `(1, 1)`, and I want to access it via indices `[0, 1, 2, 3]`. So, when I call `__getitem__(2)`, I want the position `(1, 0)`.

NumPy's internals probably have this information, because arrays are usually stored contiguously in memory, but I couldn't find a way to access it. I came up with this function, which calculates the tensor indices given the 1D index and the shape of the tensor:

``` python
from typing import Tuple

import numpy as np


def f(i: int, s: Tuple[int, ...]) -> Tuple[int, ...]:
    # i % s[-1] is the index along the last axis; i // s[-1] is the
    # flat index into the remaining leading axes, resolved recursively.
    return (i,) if len(s) == 1 else f(i // s[-1], s[:-1]) + (i % s[-1],)
```

How to use it:

``` python
tensor = np.arange(12).reshape((3, 4))
row_id, col_id = f(i=6, s=(3, 4))
```

Would NumPy like to include this method in its code? If yes, I can check how to submit a PR. If not, I would appreciate a pointer to a more suitable library (`scipy`, maybe `xarray`).

Daniel
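For readers following the arithmetic, the recursion in `f` can also be written as a loop. The sketch below is only an illustration: the name `flat_to_multi` and the bounds check are assumptions added here, not part of the original proposal. It assumes row-major (C) ordering, the order in which `np.ndindex` enumerates positions.

``` python
import math
from typing import Tuple

import numpy as np


def flat_to_multi(i: int, shape: Tuple[int, ...]) -> Tuple[int, ...]:
    """Hypothetical iterative variant of `f`, assuming row-major order."""
    # Bounds check added for illustration; the recursive `f` has none.
    if not 0 <= i < math.prod(shape):
        raise IndexError(f"flat index {i} out of range for shape {shape}")
    idx = []
    # Walk the axes from last to first, peeling off one index per axis.
    for dim in reversed(shape):
        i, remainder = divmod(i, dim)
        idx.append(remainder)
    return tuple(reversed(idx))


# Every flat index should map to the position np.ndindex yields at that step.
shape = (3, 4)
positions = list(np.ndindex(*shape))
matches = all(flat_to_multi(k, shape) == positions[k] for k in range(len(positions)))
```

Here `flat_to_multi(6, (3, 4))` gives `(1, 2)`, the same row and column the usage example in the post unpacks.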

On Tue, Jun 20, 2023 at 11:38 AM Daniel Salles Civitarese <sallesd@br.ibm.com> wrote:
### Proposed new feature or change:
I work with geospatial data that requires tensors with many dimensions. One challenge I used to have was when I had to implement a `__getitem__` to access those tensors using a 1D index. One use case is machine learning, when one needs to feed models sequentially with sub-tensors of the original one. A simple example is a matrix of shape 2x2 that has the positions `(0, 0)`, `(0, 1)`, `(1, 0)`, `(1, 1)`, and I want to access it via indices `[0, 1, 2, 3]`. So, when I say `__getitem__(2)`, I want the position `(1, 0)`.
If the only reason you want to compute the indices is to access the data using those flattened indices, you can use the `.flat` property instead and avoid explicitly computing those indices.

```
[~]
|21> x = np.arange(2*3).reshape((2, 3))

[~]
|22> x
array([[0, 1, 2],
       [3, 4, 5]])

[~]
|23> x.flat[[0, 1, 2, 3]]
array([0, 1, 2, 3])
```

--
Robert Kern
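As a small complement to the session above, here is a minimal sketch of a few other things `.flat` allows, using the 3x4 array from the original post. The variable names are illustrative only.

``` python
import numpy as np

x = np.arange(12).reshape((3, 4))

# A single flat index reads the same element as x[1, 2].
value = x.flat[6]

# A list of flat indices returns a 1D array of the selected elements.
several = x.flat[[0, 5, 11]]

# .flat follows the logical row-major order of the array it is called on,
# so on the transposed view, flat index 1 is x.T[0, 1], i.e. x[1, 0].
xt_value = x.T.flat[1]

# Assignment through .flat writes into the original array.
x.flat[6] = -1
```

After the last line, `x[1, 2]` holds `-1`, so `.flat` covers both reading and writing by 1D index without computing the N-dimensional position.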

On Tue, 2023-06-20 at 12:07 -0400, Robert Kern wrote:
On Tue, Jun 20, 2023 at 11:38 AM Daniel Salles Civitarese <sallesd@br.ibm.com> wrote:
### Proposed new feature or change:
I work with geospatial data that requires tensors with many dimensions. One challenge I used to have was when I had to implement a `__getitem__` to access those tensors using a 1D index. One use case is machine learning, when one needs to feed models sequentially with sub-tensors of the original one. A simple example is a matrix of shape 2x2 that has the positions `(0, 0)`, `(0, 1)`, `(1, 0)`, `(1, 1)`, and I want to access it via indices `[0, 1, 2, 3]`. So, when I say `__getitem__(2)`, I want the position `(1, 0)`.
If the only reason you want to compute the indices is to access the data using those flattened indices, we use the `.flat` property instead and avoid explicitly computing those indices.
We also do have `np.unravel_index`, which does exactly the thing requested, I believe.

- Sebastian
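For reference, a short sketch of how `np.unravel_index` applies to the two examples from the original post, together with its inverse `np.ravel_multi_index`. Both default to row-major (C) order. The variable names are illustrative.

``` python
import numpy as np

# Flat index 2 in a 2x2 matrix corresponds to position (1, 0).
pos = np.unravel_index(2, (2, 2))

# Flat index 6 in a 3x4 array: row 1, column 2.
row_id, col_id = np.unravel_index(6, (3, 4))

# An array of flat indices returns one index array per axis.
rows, cols = np.unravel_index([0, 1, 2, 3], (2, 2))

# The inverse mapping: N-dimensional position back to the flat index.
flat_index = np.ravel_multi_index((1, 2), (3, 4))
```

Because the result is a tuple of per-axis indices, it can be used directly for indexing, for example `tensor[np.unravel_index(6, tensor.shape)]`.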
```
[~]
|21> x = np.arange(2*3).reshape((2, 3))

[~]
|22> x
array([[0, 1, 2],
       [3, 4, 5]])

[~]
|23> x.flat[[0, 1, 2, 3]]
array([0, 1, 2, 3])
```
participants (3)
- Daniel Salles Civitarese
- Robert Kern
- Sebastian Berg