
On Thu, Jun 30, 2022 at 10:56 PM Warren Weckesser <warren.weckesser@gmail.com> wrote:
On 6/30/22, Ewout ter Hoeven <e.m.terhoeven@student.tudelft.nl> wrote:
A function to get the minimum and maximum values of an array simultaneously could be very useful, for both convenience and performance. The performance benefit could be significant for larger arrays, and greater still if the array doesn't fit in L2/L3 cache or even in memory.
There are many cases where both the minimum and the maximum of an array are required, rather than just one of them. Think of clipping an array, getting its range, checking for outliers, normalizing, making a plot like a histogram, etc.
This function could be called aminmax(), for example, and also be callable as a method, ndarray.minmax(). It should return a tuple (min, max) with the minimum and maximum values of the array, identical to calling (ndarray.min(), ndarray.max()).
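To make the proposed semantics concrete, here is a minimal pure-Python sketch. The name `aminmax` comes from the proposal; the `chunk_size` parameter and the blockwise strategy are assumptions made for illustration only, not an existing NumPy API. Each block is reduced twice while it is likely still in cache, which only approximates what a true single-pass implementation would do.

```python
import numpy as np

def aminmax(a, chunk_size=65536):
    """Hypothetical helper: return (min, max) of a non-empty array.

    chunk_size is an illustrative parameter, not part of any NumPy API.
    """
    # ravel() returns a view when possible, but copies non-contiguous input.
    flat = np.asarray(a).ravel()
    if flat.size == 0:
        raise ValueError("aminmax of an empty array is undefined")
    lo = hi = flat[0]
    for start in range(0, flat.size, chunk_size):
        block = flat[start:start + chunk_size]
        # np.minimum/np.maximum propagate NaN, like ndarray.min()/max().
        lo = np.minimum(lo, block.min())
        hi = np.maximum(hi, block.max())
    return lo, hi

a = np.array([[3.0, -1.5], [7.25, 2.0]])
lo, hi = aminmax(a)
assert (lo, hi) == (a.min(), a.max())
```

Whether this is actually faster than two separate reductions depends on the array size and the machine; it is only meant to show the intended return value.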
With such a function, numpy.ptp() and the special cases of numpy.quantile(a, q=[0,1]) and numpy.percentile(a, q=[0,100]) could also potentially be sped up, among others.
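The relationship between these functions and the (min, max) pair can be checked on a small example. This is only an illustration of the equivalences mentioned, using a made-up array and the default interpolation method; it says nothing about how NumPy implements them internally.

```python
import numpy as np

a = np.array([4.0, -2.0, 9.5, 0.0, 3.0])
lo, hi = a.min(), a.max()

# ptp ("peak to peak") is the range max - min.
assert np.ptp(a) == hi - lo

# The 0 and 1 quantiles (0th and 100th percentiles) are the min and max.
assert np.array_equal(np.quantile(a, [0, 1]), [lo, hi])
assert np.array_equal(np.percentile(a, [0, 100]), [lo, hi])
```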
Potentially argmin and argmax could get the same treatment, being called argminmax().
There is also a very extensive post on Stack Overflow (a bit old already) with discussion and benchmarks:
https://stackoverflow.com/questions/12200580/numpy-function-for-simultaneous...
FYI, I have a fairly simple gufunc implementation of `minmax` in ufunclab (https://github.com/WarrenWeckesser/ufunclab), along with `arg_minmax`, `min_argmin` and `max_argmax`. See README.md starting here: https://github.com/WarrenWeckesser/ufunclab#minmax
For those familiar with C and gufunc implementation details, you can find the implementations in
https://github.com/WarrenWeckesser/ufunclab/blob/main/src/minmax/minmax_gufu... . You'll see that, as far as gufuncs go, these are not very sophisticated. They do not include implementations for all the NumPy data types, and I haven't yet spent much time on optimization.
Thanks for sharing, Warren. While that is interesting code, for the purposes of inclusion in NumPy I'd much prefer to see something along the lines of the suggestions by Eric and Marten on https://github.com/numpy/numpy/issues/9836 about making it easier to combine existing ufuncs. Adding 500 LoC for a `minmax` (separate from the discussion on whether we want such fused operators) does not seem healthy. It's also a new potential source of bugs, because `minmax` isn't going to yield exactly the same numerical values as `min` and `max` used separately if you do it as a gufunc, and any bug fixes will need to be made in two places. Also, given that `min` and `max` use SIMD instructions, the gufunc `minmax` as you have it now is probably substantially slower; see the benchmarks in https://github.com/numpy/numpy/pull/20131.
Cheers,
Ralf