Dom Grigonis wrote:

1. Dimension length stays constant, while cumsum0 extends the length to n+1 and np.diff then truncates it back. This adds extra complexity, whereas things are very convenient to work with when the dimension length stays constant throughout the code.

For n values there are n-1 differences. Equivalently, for k differences there are k+1 values. Therefore, `diff` ought to reduce the length by 1 and `cumsum` ought to increase it by 1. Returning arrays of the same length is a fencepost error. This is a problem in the current behaviour of `cumsum` and in the proposed behaviour of `diff0`.

Dom Grigonis wrote:

For now, I only see my point of view, and I can list a number of cases from data analysis and modelling where I found np.diff0 to be a fairly optimal choice and where it made things smoother. Meanwhile, I haven’t seen any real-life examples where np.cumsum0 would be useful, so I am naturally biased. I would appreciate it if anyone provided some examples that justify np.cumsum0 - for now I just can’t think of any case where it would actually be useful, or why it would be more convenient or sensible than np.diff0.

------------------------------------------------------------
EXAMPLE

Consider a path given by a list of points, say (101, 203), (102, 205), (107, 204) and (109, 202). What are the positions at fractions, say 1/3 and 2/3, along the path (linearly interpolating)?

The problem is naturally solved with `diff` and `cumsum0`:

```
import numpy as np
from scipy import interpolate

positions = np.array([[101, 203], [102, 205], [107, 204], [109, 202]], dtype=float)
steps_2d = np.diff(positions, axis=0)
steps_1d = np.linalg.norm(steps_2d, axis=1)
distances = np.cumsum0(steps_1d)
fractions = distances / distances[-1]
interpolate_at = interpolate.make_interp_spline(fractions, positions, 1)
interpolate_at(1/3)
interpolate_at(2/3)
```

Please show how to solve the problem with `diff0` and `cumsum`.
------------------------------------------------------------
Both `diff0` and `cumsum` have a fencepost problem, but `diff0` has a second defect: it maps an array of positions to a heterogeneous array where one element is a position and the rest are displacements. The operations that make sense for displacements, like scaling, differ from those that make sense for positions.
------------------------------------------------------------
EXAMPLE

Money is invested on 2023-01-01. The annualized rate is 4% until 2023-02-04 and 5% thence until 2023-04-02. By how much does the money multiply in this time?

The problem is naturally solved with `diff`:

```
import numpy as np

percents = np.array([4, 5], dtype=float)
times = np.array(["2023-01-01", "2023-02-04", "2023-04-02"], dtype=np.datetime64)
durations = np.diff(times)
YEAR = np.timedelta64(365, "D")
multipliers = (1 + percents / 100) ** (durations / YEAR)
multipliers.prod()
```

Please show how to solve the problem with `diff0`. It makes sense to divide `np.diff(times)` by `YEAR`, but it would not make sense to divide the output of `np.diff0(times)` by `YEAR` because of its incongruous initial value.
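To make that last point concrete (my own sketch, not part of the original post): with NumPy's datetime types, a duration divides by a duration, but an absolute date does not, so the "initial value" that `diff0` would carry over cannot even participate in the natural operation:

```python
import numpy as np

times = np.array(["2023-01-01", "2023-02-04", "2023-04-02"], dtype=np.datetime64)
YEAR = np.timedelta64(365, "D")

# The n-1 durations divide cleanly by a duration, giving year fractions.
year_fractions = np.diff(times) / YEAR

# The initial value diff0 would keep is a date, and dividing a date by a
# duration is meaningless; NumPy rejects it with a TypeError.
try:
    times[0] / YEAR
except TypeError:
    pass
```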
------------------------------------------------------------
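Since `np.cumsum0` does not exist today, here is a runnable sketch of the path example, assuming the semantics discussed in this thread (a leading zero followed by the running sums, so `np.diff` inverts it exactly), and using `np.interp` in place of SciPy so the sketch is self-contained:

```python
import numpy as np

def cumsum0(a):
    # Stand-in for the proposed np.cumsum0: a length-n input yields a
    # length-(n+1) output with a leading zero; np.diff undoes it exactly.
    return np.concatenate([[0.0], np.cumsum(a)])

positions = np.array([[101, 203], [102, 205], [107, 204], [109, 202]], dtype=float)
steps_2d = np.diff(positions, axis=0)        # 3 displacement vectors
steps_1d = np.linalg.norm(steps_2d, axis=1)  # 3 segment lengths
distances = cumsum0(steps_1d)                # 4 arc-length coordinates, from 0
fractions = distances / distances[-1]        # normalized to [0, 1]

def position_at(f):
    # Linearly interpolate each coordinate at fraction f along the path.
    return np.array([np.interp(f, fractions, positions[:, c]) for c in range(2)])

point = position_at(1 / 3)
```

Note that `distances` has one more element than `steps_1d`, bringing the length back in line with `positions` — exactly the fencepost bookkeeping at issue here.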