[Numpy-discussion] ENH: Proposal to add np.neighborwise in PR#9514

Joseph Fox-Rabinovitz jfoxrabinovitz at gmail.com
Fri Aug 4 18:31:52 EDT 2017


I would like to propose the addition of a new function,
`np.neighborwise`, in PR#9514. It is based on the discussion relating
to my proposal for `np.ratio` (PR#9481) and Eric Wieser's
`np.neighborwise` in PR#9428. This function accepts an array `a` and a
vectorized function of two arguments, `func`, and applies the function
to each pair of neighboring elements of the array along one or more
axes. There are options for masking out parts of the calculation and
for applying the function recursively.
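
As a rough illustration of the single-axis case (this is not the code
in the PR), the idea can be sketched with plain slicing. The helper
name `neighborwise_1axis` is hypothetical, and the argument order
`func(later, earlier)` is an assumption chosen to match `np.diff`:

```python
import numpy as np

def neighborwise_1axis(a, func, axis=-1, n=1):
    """Hypothetical helper: apply func(later, earlier) to neighbors
    along a single axis, repeating the operation n times."""
    a = np.asanyarray(a)
    for _ in range(n):
        later = [slice(None)] * a.ndim
        earlier = [slice(None)] * a.ndim
        later[axis] = slice(1, None)      # elements 1 .. end
        earlier[axis] = slice(None, -1)   # elements 0 .. end-1
        a = func(a[tuple(later)], a[tuple(earlier)])
    return a

x = np.array([1, 2, 4, 8])
diffs = neighborwise_1axis(x, np.subtract)      # matches np.diff(x)
ratios = neighborwise_1axis(x, np.true_divide)  # neighbor ratios
```

With `func=np.subtract` this reduces to the usual `np.diff`; any other
vectorized binary function can be dropped in the same way.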

The name of the function is not set in stone. The current name is
taken directly from PR#9428 because I cannot think of a better one.

This function can serve as a backend for the existing `np.diff`, which
has been re-implemented in this PR, as well as for the `ratio`
function I proposed earlier. The re-implementation adds the diagonal
diffs feature, which is tested and backwards compatible. `ratio` can
be implemented very simply with or without a mask. With a mask, it can
be expressed as `np.neighborwise(a, np.*_divide, axis=axis, n=n,
mask=lambda *args: args[1])` (the conversion to bool is done
automatically).
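
To make the masked ratio concrete without relying on the proposed
function, here is a minimal 1-D sketch using only existing NumPy. The
name `ratio_sketch` is hypothetical, and filling masked-out positions
with NaN is my assumption for the illustration, not necessarily what
the PR does:

```python
import numpy as np

def ratio_sketch(a, fill=np.nan):
    """Hypothetical 1-D ratio of neighbors that skips zero denominators."""
    a = np.asarray(a, dtype=float)
    later, earlier = a[1:], a[:-1]
    # The mask is the second argument (the denominator) converted to bool,
    # mirroring mask=lambda *args: args[1].
    mask = earlier.astype(bool)
    out = np.full(later.shape, fill)
    # Divide only where the mask is True; other entries keep the fill value.
    np.true_divide(later, earlier, out=out, where=mask)
    return out

r = ratio_sketch([1, 2, 0, 5, 10])
```

Here the third ratio, `5 / 0`, is never evaluated, so no divide-by-zero
warning is raised and that position keeps the fill value.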

The one potentially non-backwards-compatible API change that this PR
introduces is in `np.diff` with `n==0`: it now returns an `ndarray`
version of the input, where previously the exact input reference was
returned. I very seriously doubt that the old behavior was ever relied
on outside the numpy test suite. The advantage of this change is that
an invalid axis input can now be caught before returning the unaltered
array. If this change is considered too drastic, I can remove it
without removing the axis check.
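
The order of the checks described above can be sketched as follows.
This only illustrates the idea and is not the PR's code: the name
`diff_n0_sketch` is hypothetical, and reading "ndarray version" as
`np.asanyarray` is an assumption on my part:

```python
import numpy as np

def diff_n0_sketch(a, n=1, axis=-1):
    """Hypothetical wrapper: validate the axis before the n == 0 shortcut."""
    arr = np.asanyarray(a)  # assumption: this is the "ndarray version"
    if not -arr.ndim <= axis < arr.ndim:
        raise ValueError(
            "axis %d is out of bounds for array of dimension %d"
            % (axis, arr.ndim)
        )
    if n == 0:
        return arr  # unaltered data, but always an array
    return np.diff(arr, n=n, axis=axis)

unchanged = diff_n0_sketch([1, 2, 3], n=0)  # list in, ndarray out
```

With this ordering, a call such as `diff_n0_sketch([1, 2, 3], n=0,
axis=5)` raises instead of silently returning the input.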

The two main differences between this PR and PR#9428 are the addition
of masks to the computation, and the interpretation of multiple axes.
PR#9428 applies `func` successively along each axis. This provides no
way of doing diagonal diffs. I chose to shift along all the axes
simultaneously before applying `func`. To clarify with an example, if
we take `a=[[1, 2], [3, 4]]`, `axis=[0, 1]` and `func=np.subtract`,
PR#9428 would take two diffs, `(4 - 2) - (3 - 1) = 0`, while the
version I propose here just takes the diagonal diff `4 - 1 = 3`.
Besides being more intuitive, in my opinion, taking diagonal diffs
actually adds a new feature that cannot be obtained directly by
taking successive diffs.
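
The two interpretations in the example above can be reproduced with
existing NumPy calls. This is only a sketch of the arithmetic, using
`np.diff` and plain slicing rather than either PR's implementation:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# Successive interpretation: diff along axis 0, then along axis 1.
successive = np.diff(np.diff(a, axis=0), axis=1)

# Simultaneous interpretation: shift along both axes at once, so each
# element is paired with its upper-left (diagonal) neighbor.
diagonal = a[1:, 1:] - a[:-1, :-1]

print(successive)  # [[0]]
print(diagonal)    # [[3]]
```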

Please let me know your thoughts.

Regards,

    -Joe

