[Numpy-discussion] ndrange, like range but multidimensional

Mark Harfouche mark.harfouche at gmail.com
Mon Oct 8 12:21:40 EDT 2018


Allan,

Sorry for the delay. I had my mailing list preferences set to digest. I
changed them for now. (I hope this message continues that thread).

Thank you for your feedback. You are correct in identifying that the real
feature is expanding the `ndindex` API to support slicing. See my comments
on the separate points you raised below.

## Expanding the API of ndindex

> Better rather to add the slicing functionality to ndindex, than create a
whole new nearly-identical function.

This is a very important point. I should have included a note about it. My
[first attempt](https://github.com/hmaarrfk/numpy/pull/1/files#diff-1bd953557a98073031ce66d05dbde3c8R663)
did try that approach.
I ran into 2 issues:
1. Getting around the catch-all positional argument is annoying, and the logic
to do that will likely be error prone. Peculiarities in how we implement
it might cause some very strange behaviour for tuple-like inputs that we
don't expect.
2. `ndindex` is an iterator itself. As proposed, `ndrange`, like `range`,
is not an iterator. Changing this behaviour would likely lead to breaking
code that uses that assumption. For example anybody using introspection or
code like:

```
import numpy as np

indx = np.ndindex(5, 5)
next(indx)  # Don't look at the (0, 0) coordinate
for i in indx:
    print(i)
```
would break if `ndindex` becomes "not an iterator".
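
To make the iterable-versus-iterator distinction concrete, here is a minimal sketch using only the built-in `range`. It does not touch `ndindex` or the proposed `ndrange`; it only shows the `range` behaviour that `ndrange` is meant to mirror, and why `next()` on such an object fails.

```python
from collections.abc import Iterator

r = range(3)

# A range is iterable but is not itself an iterator.
range_is_iterator = isinstance(r, Iterator)

# Because of that, it can be looped over any number of times.
first_pass = list(r)
second_pass = list(r)

# Calling next() directly on a range raises TypeError.
try:
    next(r)
    next_on_range_failed = False
except TypeError:
    next_on_range_failed = True

# iter() hands out a fresh one-shot iterator, which is what an
# iterator-style object such as the one in the example above behaves like.
it = iter(r)
iterator_is_iterator = isinstance(it, Iterator)
first = next(it)      # consume the first element
remaining = list(it)  # the rest
exhausted = list(it)  # nothing left on a second pass
```

Code that calls `next()` on the object directly relies on the iterator behaviour, which is the assumption that would break.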

For these two reasons, I thought it was easier to simply have a new class,
that seems like a close sibling to `ndindex`.

I personally don't care about point 1 so much. In my mind, start, stop and
step are confusing in ND, but maybe some might find them useful? Point 1 also
makes it harder to make `ndrange` more familiar to `range` users.
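
A small sketch of why the catch-all positional argument clashes with `range` semantics. The helper `shape_from_catch_all` is hypothetical and written only for illustration; it imitates the calling convention where `f(2, 3)` and `f((2, 3))` both mean "shape (2, 3)", and is not NumPy's code.

```python
def shape_from_catch_all(*shape):
    """Hypothetical helper imitating a catch-all ``*shape`` signature."""
    # A single tuple argument is unpacked, so f((2, 3)) == f(2, 3).
    if len(shape) == 1 and isinstance(shape[0], tuple):
        shape = shape[0]
    return tuple(shape)

# With the catch-all convention, two positional integers are two dimensions.
as_shape = shape_from_catch_all(2, 3)
as_shape_from_tuple = shape_from_catch_all((2, 3))

# With the range convention, two positional integers are (start, stop).
as_range = list(range(2, 3))
```

The same call spelling, `(2, 3)`, means a 2x3 grid in one convention and the single value `2` in the other, which is the familiarity problem for `range` users.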

> I don't like adding more obscure functions

Hopefully the name `ndrange` makes it easier to find?

## Writing vectorized code

> np.ndindex is  already a somewhat obscure and discouraged method since it
is usually better to find a vectorized numpy operation instead of a for loop

I understand that this kind of function is not focused on numerical
operations on the elements of the matrix itself. It really is there to help
fill the void left by the lack of any useful multi-dimensional Python
container.

I think `ndrange`/`ndindex` is there to be used like `np.vectorize`. I've
tried to use `np.vectorize` in my own code, but quickly found that making
logic fit into vectorize's requirements was often more complicated than
writing my own nested loops. In my opinion, nested `range` loops
or `ndrange`/`ndindex` are a much more natural way to loop over collections
compared to `np.vectorize`.
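
For reference, a minimal sketch of the looping pattern in question, using only the standard library: nested `range` loops and a single flat loop over `itertools.product` visit the same index tuples in the same order. The 2x3 size is an arbitrary choice for illustration.

```python
import itertools

# Explicit nested loops over a 2x3 grid of indices.
nested = []
for i in range(2):
    for j in range(3):
        nested.append((i, j))

# The same indices from one flat loop; the last dimension varies fastest.
flat = []
for index in itertools.product(range(2), range(3)):
    flat.append(index)
```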

I'm glad to add warnings to the docs.

## Implementation detail: itertools.product + range vs nditer

> Maybe there is a way to use  the same optimization tricks as in the
current implementation of ndindex  but allow different stop/step?

My primary goal here is to make `ndrange` behave much like `range`. By
implementing it on top of `range`, it makes it obvious to me how to enforce
that behaviour as the API of range gets expanded (though it seems to have
settled since Python 3.3). Whatever we decide to call `ndrange`/`ndindex`,
the tests I wrote can help ensure we have good range-API coverage (for now).
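
As a rough illustration of what "implementing it on top of `range`" can look like, here is a simplified sketch. The class name `NDRangeSketch` and its constructor (one `range` per dimension) are assumptions made for this example and are not the interface of the proposed `ndrange`; the point is only that length, iteration, and slicing can all be delegated to the built-in `range`.

```python
import itertools

class NDRangeSketch:
    """Hypothetical, simplified stand-in; not the proposed implementation."""

    def __init__(self, *ranges):
        # One built-in range per dimension.
        self._ranges = tuple(ranges)

    def __iter__(self):
        # A fresh iterator on every call, so the object is reusable.
        return itertools.product(*self._ranges)

    def __len__(self):
        total = 1
        for r in self._ranges:
            total *= len(r)
        return total

    def __getitem__(self, slices):
        # Accept a single slice for the 1-D case.
        if not isinstance(slices, tuple):
            slices = (slices,)
        if len(slices) != len(self._ranges):
            raise IndexError("expected one slice per dimension")
        # Slicing a range returns a range, so range does the arithmetic.
        return NDRangeSketch(*(r[s] for r, s in zip(self._ranges, slices)))


grid = NDRangeSketch(range(4), range(6))
sub = grid[1:3, ::2]
sub_indices = list(sub)
sub_indices_again = list(sub)  # not exhausted by the first pass
```

Because every dimension is a real `range`, any behaviour `range` gains later comes along without extra logic in the wrapper, at least in this simplified form.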

`itertools.product` + `range` also seems to be much faster than the current
implementation of `ndindex`:

(Python 3.6)
```
%%timeit

for i in np.ndindex(100, 100):
    pass
3.94 ms ± 19.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%%timeit
import itertools
for i in itertools.product(range(100), range(100)):
    pass
231 µs ± 1.09 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
```