argmax and argmin to python list
I'd like to propose adding argmax and argmin functions to the python list. These functions return the index of a maximum / minimum element of the list. Eg: a = [1, 4, 2, 3] print(a.argmax()) # 1 print(a.argmin()) # 0 It's a very popular request (based on stackoverflow https://stackoverflow.com/questions/16945518/finding-the-index-of-the-value-... ), and currently there is no elegant way to find it. What do you think?
On 29/08/2020 14:17, Barry Scott wrote:
This has the drawback of passing twice over the list. The following doesn't, but the complexity somewhat makes Filipp's point:
min((e, i) for i, e in enumerate(a))[1] 0
I think one would want argmin() and argmax() to work with general iterables, so I wonder if the stdlib would not be a better home than list itself. I half expected it to be an itertools recipe. The advantage of a recipe is that variations such as needing the last occurrence of the minimum are easily accommodated. Jeff Allen
That is 4x slower then my code for 1,000,000 items. --------------- a.py -------------- import sys import time import random alg = sys.argv[1] size = int(sys.argv[2]) x = [random.randint(0, 1_000_000) for _ in range(size)] start = time.time() if alg == 'barry': m = x.index(min(x)) elif alg == 'jeff': m = min((e, i) for i, e in enumerate(x))[1] end = time.time() print( alg, end-start, (end-start)/size ) ------------------------- Here is the output I got on my laptop. barry 0.022754907608032227 2.2754907608032226e-08 barry 0.03325295448303223 3.325295448303223e-08 barry 0.034243106842041016 3.4243106842041016e-08 barry 0.02784109115600586 2.784109115600586e-08 jeff 0.13722586631774902 1.3722586631774904e-07 jeff 0.1359708309173584 1.359708309173584e-07 jeff 0.13658690452575684 1.3658690452575684e-07
I think one would want argmin() and argmax() to work with general iterables, so I wonder if the stdlib would not be a better home than list itself. I half expected it to be an itertools recipe. The advantage of a recipe is that variations such as needing the last occurrence of the minimum are easily accommodated.
Surely its index_min() and index_max() not argmin() and argmax(). Barry
It's expected that python single iterating would be slower than two times C iterating. Nevertheless one time C iterating will be probably faster than a separate min + index.
My last case was to find the min index of the array part. index_min could accept the starting element index, as index does. a = [1, 4, 1, 2, 3] print(a.index_min()) # 0 print(a.index_min(1)) # 2 On Sun, 30 Aug 2020 at 11:52, Barry Scott <barry@barrys-emacs.org> wrote:
Of course this could be implemented in C in python, but you are going to have to convince the python developers that it’s important enough to want to maintain the code. Is your use of such a function so performance sensitive that you need this in C? Have you considered writing this as a C extension for your own use? Barry
On Sun, Aug 30, 2020 at 7:28 AM Barry <barry@barrys-emacs.org> wrote:
so no -- this should not work with general iterables, indexes don't really make sense for iterables, only Sequences. Is your use of such a function so performance sensitive that you need this
in C? Have you considered writing this as a C extension for your own use?
or use numpy :-) (which is probably where the name "argmin" came from, rather than "index_min") Signature: np.argmin(a, axis=None, out=None) Docstring: Returns the indices of the minimum values along an axis. Parameters ---------- a : array_like Input array. axis : int, optional By default, the index is into the flattened array, otherwise along the specified axis. out : array, optional If provided, the result will be inserted into this array. It should be of the appropriate shape and dtype. Returns ------- index_array : ndarray of ints Array of indices into the array. It has the same shape as `a.shape` with the dimension along `axis` removed The other question of course, is what to do when the minimum value appears more than once -- numpy appears to give the first occurance: In [6]: np.argmin([3,2,1,2,3,4,5,1,2,3]) Out[6]: 2 -CHB -- Christopher Barker, PhD Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
On 31/08/2020 01:14, Christopher Barker wrote:
or use numpy:-) (which is probably where the name "argmin" came from, rather than "index_min")
https://en.wikipedia.org/wiki/Arg_max not sure if numpy precedes my own first memory of these functions in the 1970's, but it's fairly obvious that numpy's argmin is not necessarily set valued which the mathematical use would suggest is the mathematical usage Python 3.8.5 (default, Jul 27 2020, 08:42:51) [GCC 10.1.0] on linux Type "help", "copyright", "credits" or "license" for more information.
this would be {1,4,6} if set valued -boring-ly yrs- Robin Becker
On 30/08/2020 09:51, Barry Scott wrote:
Ha! Fair enough. In principle, however, passing over the list once only is preferable. The trade-off would depend on the cost of doing comparisons: cheap here while iteration is clearly costing a lot. In the alternative, the second pass for index() is probably only doing pointer comparison. It will often be the case and is perhaps why we don't have this function. That and its existence in various libraries e.g. https://iteration-utilities.readthedocs.io/en/latest/generated/argmin.html#i... . Passing over the list once is what we would do in a C implementation, I'm sure. One solution is to add a counter in bltinmodule.c::min_max()). Jeff
participants (7)
-
Antoine Pitrou
-
Barry
-
Barry Scott
-
Christopher Barker
-
Filipp Bakanov
-
Jeff Allen
-
Robin Becker