
Okay! I think I now actually understand what was going on with np.gradient! The discussion has been pretty confused, and I'm worried that what's in master right now is not the right solution, which is a problem because we're supposed to cut 1.9.1 tomorrow.

Background: np.gradient computes gradients using a finite difference method. Wikipedia has a good explanation of how this works:
https://en.wikipedia.org/wiki/Numerical_differentiation

The key point is that there are multiple formulas one can use that all converge to the derivative, e.g.

    (f(x + h) - f(x)) / h

or

    (f(x + h) - f(x - h)) / 2h

The first is the textbook definition of the derivative. As h -> 0, the error in that approximation shrinks like O(h). The second formula also converges to the derivative as h -> 0, but it converges like O(h^2), i.e., much faster. And there are many, many formulas like this with different trade-offs:
https://en.wikipedia.org/wiki/Finite_difference_coefficient

In practice, given a grid of values of f(x) and a grid stepsize of h, all of these formulas come down to doing a simple convolution of the data with certain coefficients (e.g. [-1/h, 1/h] or [-1/2h, 0, 1/2h] for the two formulas above). (There's a small numerical sketch comparing these two approximations further below.)

Traditionally np.gradient has used the second formula above, with its quadratically diminishing error term, for interior points in the grid. For points on the boundary of the grid, though, this formula has a problem, because it requires you to look at points to both the left and the right of the current point -- but for boundary points one of these doesn't exist. In such situations np.gradient has traditionally used the first formula instead (the "forward (or backward) finite difference approximation with linear error" -- "forward"/"backward" meaning it only needs points on one side, so it works on the boundary). As the web page linked above shows, though, there's an easy alternative: a one-sided formula that also works on the boundary while still having a quadratically shrinking error.

The change in numpy 1.9.0 that has caused all the fuss is that we switched to using the quadratically accurate approximation instead of the linearly accurate approximation for boundary points. This had two effects:

1) It produced (hopefully small) changes in the calculation for the boundaries. In general these should have made the calculations more accurate, but apparently this did cause problems for some people who wanted perfectly reproducible output.

2) It tickled a nasty bug in how gradient and masked arrays work together. It turns out gradient applied to masked arrays has always been seriously buggy; there's some horribleness in the masked array implementation that means the output array that np.gradient is writing into ends up sharing a mask array with the input. So as we go along calculating the output, we also mutate the input. This can definitely corrupt the input, for example here:
https://gist.github.com/njsmith/551738469b74d175e039
...and I strongly suspect it's possible to find cases where it means you get the wrong answer.

For mysterious reasons involving the order in which values were calculated or something, the version of np.gradient in 1.9.0 tickled this bug in a much worse manner, with early mutations to the input array ending up affecting later calculations using the same input array, causing cascading errors.
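Going back to the two formulas at the top for a moment: to make the accuracy difference concrete, here's a small self-contained sketch. This is not numpy's implementation, just the two textbook formulas applied to np.sin, whose exact derivative we know:

    import numpy as np

    def forward_diff(f, x, h):
        # first formula above: error shrinks like O(h)
        return (f(x + h) - f(x)) / h

    def central_diff(f, x, h):
        # second formula above: error shrinks like O(h**2)
        return (f(x + h) - f(x - h)) / (2 * h)

    x = 1.0
    exact = np.cos(x)  # d/dx sin(x) = cos(x)
    for h in [0.1, 0.01, 0.001]:
        print("h=%g  forward err=%.2e  central err=%.2e"
              % (h, abs(forward_diff(np.sin, x, h) - exact),
                 abs(central_diff(np.sin, x, h) - exact)))

Each 10x reduction in h should shrink the forward error by roughly 10x and the central error by roughly 100x.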
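And to make the mask-sharing problem concrete, here's a minimal way to check whether the numpy you have exhibits it. This is only a sketch -- how badly the input actually gets mangled depends on the numpy version; see the gist above for a real corruption example:

    import numpy as np
    import numpy.ma as ma

    x = ma.masked_array(np.arange(10, dtype=float),
                        mask=[0, 0, 1, 0, 0, 0, 0, 0, 0, 0])
    mask_before = x.mask.copy()

    g = np.gradient(x)  # 1-d input, so we get a single array back

    # On affected versions the output's mask can alias the input's mask,
    # so computing the gradient also rewrites the mask of x itself:
    print("result is a masked array: ", isinstance(g, ma.MaskedArray))
    print("masks are the same object:", getattr(g, "mask", None) is x.mask)
    print("input mask unchanged:     ", np.array_equal(x.mask, mask_before))

On a numpy with the fix applied, nothing should be shared and the input mask should come back unchanged.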
That mask-sharing bug is the cause of the massive problems that matplotlib encountered in their test suite:
https://github.com/matplotlib/matplotlib/issues/3598#issuecomment-58859663

There's a proposed fix for the masked array issues at:
https://github.com/numpy/numpy/pull/5203

For the reasons described above, I would be very wary about using *any* currently released version of np.gradient on masked arrays. A workaround would be to make a local copy of the definition of np.gradient, and immediately after the 'out' array is allocated do 'out.mask = out.mask.copy()'.

Once this bug is fixed, the "new" gradient code produces results which are identical to the "old" gradient code on the interior; only the boundary is different.

------

So here are my concerns:

- We decided to revert the changes to np.gradient in 1.9.1 (at least by default). I'm not sure how much of that decision was based on the issues with masked arrays, though, which turn out to be a different issue entirely. The previous discussion conflated the two issues above.

- We decided to gate the more-accurate boundary calculation behind a kwarg called "edge_order=2". AFAICT, now that I actually understand what this code is doing, this is a terrible name -- we're using a 3-coefficient kernel to compute a quadratically accurate approximation to a first-order derivative. There probably exist other kernels that are also quadratically accurate. "Order 2" simply doesn't refer to this in any unique or meaningful way. And it will be even more confusing if we ever add the option to select which kernel to use on the interior, where "quadratically accurate" is definitely not enough information to uniquely define a kernel. Maybe we want something like edge_accuracy=2 or edge_accuracy="quadratic" or something? I'm not sure.

- If edge_order=2 escapes into a release tomorrow then we'll be stuck with it.

Some possible options for 1.9.1:

- Just revert the entire np.gradient code to match 1.8, and put off a real fix until a later release. This at least avoids getting stuck with a poor API.

- Put the masked array bug fix into 1.9.1, and leave the gradient code the same as 1.9.0; put off designing a proper API for selecting the old behavior until 1.9.2. (Or 1.10, I guess.) This doesn't solve the pure reproducibility problems, but it might be enough to make matplotlib work again, at least?

- Delay 1.9.1 to rehash the issues and make a decision on what we want to support long-term.

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org