Re: [Numpy-discussion] padding options for diff
It works for me. I can't *think* of a case where you could have a np.diff on a string array and 'first' could be confused with an element, since you're not allowed to diff on strings in the present numpy anyway (unless wiser heads than me know something!). Feel free to move the conversation to GitHub btw.

Peter
Matthew has made what looks like a very nice implementation of padding in np.diff in https://github.com/numpy/numpy/pull/8206. I raised two general questions about desired behaviour there that Matthew thought we should put out on the mailing list as well. This indeed seemed a good opportunity to get feedback, so herewith a copy of https://github.com/numpy/numpy/pull/8206#issuecomment-256909027

-- Marten

1. I'm not sure that treating a 1-d array as something that will just extend the result along `axis` is a good idea, as it breaks standard broadcasting rules. E.g., consider

```
np.diff([[1, 2], [4, 8]], to_begin=[1, 4])
# with your PR:
array([[1, 4, 1], [1, 4, 4]])
# but from regular broadcasting I would expect
array([[1, 1], [4, 4]])
# i.e., the same as if I did to_begin=[[1, 4]]
```

I think it is slightly odd to break the broadcasting expectation here, especially since the regular use case surely is just to add a single element so that one keeps the original shape. The advantage of assuming this is that you do not have to do *any* array shaping of `to_begin` and `to_end` (which perhaps also suggests it is the right thing to do).

2. As I mentioned above, I think it may be worth thinking through a little what to do with higher order differences, at least for `to_begin='first'`. If the goal is to ensure that with that option, it becomes the inverse of `cumsum`, then I think for higher order one should add multiple elements in front, i.e., for that case, the recursive call should be

```
return np.diff(np.diff(a, to_begin='first'), n-1, to_begin='first')
```
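
For readers who want to try the two points without the PR installed, here is a minimal sketch that uses only `np.diff`, `np.concatenate`, `np.broadcast_to` and `np.take`. It does not use the proposed `to_begin` keyword at all: the padding is built by hand so that its shape is explicit in each case. The helper `diff_first` and all variable names are hypothetical, chosen for illustration, and are not taken from the PR.

```python
import numpy as np

a = np.array([[1, 2], [4, 8]])
d = np.diff(a, axis=-1)  # plain differences, shape (2, 1)

# Point 1, first reading: the 1-d padding [1, 4] is laid out along `axis`,
# so every row gets both values in front of its differences.
pad_along_axis = np.broadcast_to(np.array([1, 4]), (a.shape[0], 2))
extended = np.concatenate([pad_along_axis, d], axis=-1)

# Point 1, second reading: one padding element per row, shape (2, 1),
# so the result keeps the shape of `a`.
pad_per_row = np.array([[1], [4]])
same_shape = np.concatenate([pad_per_row, d], axis=-1)


def diff_first(x, n=1, axis=-1):
    """Hypothetical helper: n-th difference that keeps the first element at every order."""
    x = np.asarray(x)
    for _ in range(n):
        first = np.take(x, [0], axis=axis)  # length-1 slice along `axis`
        x = np.concatenate([first, np.diff(x, axis=axis)], axis=axis)
    return x


# Point 2: applying the padded difference n times undoes n cumsum calls.
x = np.array([3, 1, 4, 1, 5])
twice_summed = x.cumsum().cumsum()
recovered = diff_first(twice_summed, n=2)
```

Here `extended` is `[[1, 4, 1], [1, 4, 4]]` and `same_shape` is `[[1, 1], [4, 4]]`, the two arrays quoted in point 1, and `recovered` equals `x`. The loop in `diff_first` prepends the first element at every order, which is the same effect as the nested call suggested in point 2.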
participants (2)
- Marten van Kerkwijk
- Peter Creasey