[Numpy-discussion] Re: Add to NumPy a function to compute cumulative sums from 0.

Aug. 19, 2023

Unfortunately, I don’t have a good answer.

For now, I can only tell you what I think might benefit from improvement.

1. Verbosity. I appreciate that bracket syntax such as Julia's or MATLAB's `[A B C ...]` is not possible in Python, so a functional interface is the only option. Julia, for example, has functions named ‘cat’, ‘vcat’, ‘hcat’ and ‘hvcat’. I have recently aliased `np.concatenate` to `np_c` myself. For simple operations it would be nice to have methods, e.g. `arr.append(others, axis)` / `arr.prepend(others, axis)`.

2. Excessive number of functions. There seem to be a great many functions for concatenating and stacking. Many operations can be done with several different functions and approaches, and usually one of them is several times faster than the rest. As an example, consider stacking two 1-D vectors as the columns of a 2-D array:

arr = np.arange(100)
TIMER.repeat([
    lambda: np.array([arr, arr]).T,
    lambda: np.vstack([arr, arr]).T,
    lambda: np.stack([arr, arr]).T,
    lambda: np.c_[arr, arr],
    lambda: np.column_stack((arr, arr)),
    lambda: np.concatenate([arr[:, None], arr[:, None]], axis=1)
]).print(3)
# mean [[0.012 0.044 0.052 0.13  0.032 0.024]]

Having fewer, but more intuitive, flexible and well-optimised functions instead would be more convenient.
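
The `TIMER` helper used above is not defined in this message. As a rough, self-contained sketch of the same comparison, the standard-library `timeit` module can be used instead; the labels and the repeat counts below are arbitrary choices, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit

import numpy as np

arr = np.arange(100)

# The six approaches from the comparison above, keyed by a short label.
candidates = {
    "array.T": lambda: np.array([arr, arr]).T,
    "vstack.T": lambda: np.vstack([arr, arr]).T,
    "stack.T": lambda: np.stack([arr, arr]).T,
    "c_": lambda: np.c_[arr, arr],
    "column_stack": lambda: np.column_stack((arr, arr)),
    "concatenate": lambda: np.concatenate([arr[:, None], arr[:, None]], axis=1),
}

# Check that every approach builds the same (100, 2) array.
reference = candidates["column_stack"]()
all_equal = all(np.array_equal(fn(), reference) for fn in candidates.values())

# Best-of-3 time per call, in seconds.
n_calls = 200
timings = {
    name: min(timeit.repeat(fn, number=n_calls, repeat=3)) / n_calls
    for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name:>14}: {seconds * 1e6:8.2f} us per call")
```

Checking equality first matters here, because a benchmark that compares functions returning different shapes would be misleading.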

3. The flattening and reshaping API is not very intuitive. For example, `torch.flatten` (https://pytorch.org/docs/stable/generated/torch.flatten.html) offers the level of flexibility I would want, in contrast to `ndarray.flatten`. I have had similar issues with multi-dimensional searching, sorting, overlaps and custom unique functions. In other words, all the functionality is already there, but for more custom multi-dimensional cases (even when the requirement looks very simple in my mind) there is no easy API, and I end up writing my own NumPy functions and benchmarking numerous ways of achieving the same thing. By now I have my own multi-dimensional unique, sort, search and flatten, plus a more flexible `ix_`. They are not well tested, but they are already more convenient, more flexible and often several times faster than the NumPy ones, although all they do is reuse existing NumPy functionality.
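
To illustrate the kind of flexibility meant in the flatten case, here is a minimal sketch of a range-based flatten built only on `reshape`. The name `flatten_dims` and its `start_dim`/`end_dim` parameters are hypothetical (modelled on `torch.flatten`), not an existing NumPy API, and this is not the implementation mentioned above.

```python
import math

import numpy as np


def flatten_dims(a, start_dim=0, end_dim=-1):
    """Hypothetical helper: merge axes start_dim..end_dim (inclusive) into one."""
    a = np.asarray(a)
    if a.ndim == 0:
        # A 0-d array becomes a 1-element 1-d array.
        return a.reshape(1)
    # Indexing a range normalises negative axes and rejects out-of-range ones.
    start = range(a.ndim)[start_dim]
    end = range(a.ndim)[end_dim]
    if start > end:
        raise ValueError("start_dim must not come after end_dim")
    merged = math.prod(a.shape[start:end + 1])
    return a.reshape(a.shape[:start] + (merged,) + a.shape[end + 1:])


x = np.zeros((2, 3, 4, 5))
print(flatten_dims(x).shape)        # all axes merged
print(flatten_dims(x, 1, 2).shape)  # only the two middle axes merged
```

Because it relies on `reshape`, this sketch returns a view whenever NumPy can provide one, whereas `ndarray.flatten` always copies.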

I think these are more along the lines of NumPy 2.0 than a simple extension. It feels as though the API could generally be more flexible and intuitive, and there is enough existing NumPy material and enough external examples to draw on to make a next-level API happen. That said, I appreciate the effort required and the difficulties involved.

Having said all that, implementing equivalents of Julia’s ‘cat’, ‘vcat’, ‘hcat’ and ‘hvcat’, together with `arr.append(others, axis)` and `arr.prepend(others, axis)`, while ensuring that they use the most optimised approaches, could make life easier for the time being.
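
As a rough sketch of what such an interface could look like, written as free functions rather than array methods: the names `append_along` and `prepend_along` are hypothetical, and `others` is assumed to be a sequence of arrays. Both simply delegate to `np.concatenate`. Note that NumPy already has `np.append`, which joins two arrays and flattens them when no axis is given, so different names are used here to avoid confusion.

```python
import numpy as np


def append_along(arr, others, axis=0):
    """Hypothetical helper: join the arrays in `others` after `arr` along `axis`."""
    return np.concatenate([arr, *others], axis=axis)


def prepend_along(arr, others, axis=0):
    """Hypothetical helper: join the arrays in `others` before `arr` along `axis`."""
    return np.concatenate([*others, arr], axis=axis)


a = np.zeros((2, 3))
b = np.ones((2, 1))

print(append_along(a, [b], axis=1).shape)   # b becomes the last column
print(prepend_along(a, [b], axis=1).shape)  # b becomes the first column
```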

—Nothing ever dies, just enters the state of deferred evaluation—
Dg
...
On 19 Aug 2023, at 17:39, Ronald van Elburg <r.a.j.van.elburg@hetnet.nl> wrote:
I think ultimately the copy is unnecessary.
That being said, introducing prepend and append functions concentrates the complexity of the mapping in one place. Trying to avoid the extra copy would probably lead to a more complex implementation of accumulate.
How, in your view, would the prepend interface differ from concatenation or stacking?
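
The copy trade-off in the quoted message can be sketched for the 1-D case. Both function names below are hypothetical. The first prepends a zero with `np.concatenate`, which copies the cumulative sums once more; the second preallocates the output and lets `np.cumsum` write into a slice of it through its `out` argument.

```python
import numpy as np


def cumsum0_concat(a):
    """Hypothetical: cumulative sum starting from 0, using an extra copy."""
    c = np.cumsum(np.asarray(a).ravel())
    return np.concatenate([np.zeros(1, dtype=c.dtype), c])


def cumsum0_prealloc(a):
    """Hypothetical: same result, writing directly into a preallocated array."""
    a = np.asarray(a).ravel()
    # Use the dtype np.cumsum would pick by default for this input.
    dtype = np.cumsum(a[:0]).dtype
    out = np.empty(a.size + 1, dtype=dtype)
    out[0] = 0
    np.cumsum(a, dtype=dtype, out=out[1:])
    return out


print(cumsum0_concat([1, 2, 3]))
print(cumsum0_prealloc([1, 2, 3]))
```

The second version avoids the extra copy but has to handle the output dtype and shape itself, which is the added complexity referred to above.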


Dom Grigonis