
Hello,

I wonder whether it would be worth enhancing max, min, argmax and argmin (more?) with a tie-breaking parameter. If multiple entries have the same value, the first one is currently returned. It would be useful to have a parameter that changes this behavior to arbitrary tie-breaking. I would propose that the tie-breaking function receives a list of all indices of the maxima/minima.

Example:

>>> a = np.array([1, 2, 5, 5, 2, 1])
>>> np.argmax(a, tie_breaking=random.choice)
3
>>> np.argmax(a, tie_breaking=random.choice)
2
>>> np.argmax(a, tie_breaking=random.choice)
2
>>> np.argmax(a, tie_breaking=random.choice)
2
>>> np.argmax(a, tie_breaking=random.choice)
3

Especially for some randomized experiments, it is necessary that a random optimum is returned rather than always the first maximum. Thus I end up writing these things over and over again.

I understand that max and min are crucial functions that shouldn't be slowed down by the proposed changes. Adding new functions instead of altering the existing ones would be a good option.

Are there any objections to my implementing these things and sending a pull request? Would such a function be better placed in scipy, for example?

Best,
Johannes
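For readers following the thread, here is a minimal sketch of the proposed semantics written as a standalone helper. The name `argmax_tiebreak` is hypothetical and is not part of NumPy; the sketch assumes a NaN-free input and works on the flattened array, as `np.argmax` does when no axis is given.

```python
import random

import numpy as np


def argmax_tiebreak(a, tie_breaking=None):
    """Hypothetical helper (not a NumPy function): argmax with a tie-breaking callback."""
    a = np.asarray(a)
    # Flat indices of every entry equal to the maximum (assumes no NaN values).
    candidates = np.flatnonzero(a == a.max())
    if tie_breaking is None:
        # Without a callback, fall back to the first occurrence, like np.argmax.
        return int(candidates[0])
    # The callback receives a plain list of all tied indices, as proposed.
    return int(tie_breaking(candidates.tolist()))


a = np.array([1, 2, 5, 5, 2, 1])
print(argmax_tiebreak(a))                 # first maximum, same as np.argmax(a)
print(argmax_tiebreak(a, random.choice))  # one of the tied indices, chosen at random
print(argmax_tiebreak(a, max))            # last maximum
```

Any callable that maps a list of indices to one index works as the callback, so deterministic rules such as `max` (last occurrence) fit the same interface as `random.choice`.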

On Thu, Mar 12, 2015 at 1:31 PM, Johannes Kulick <johannes.kulick@ipvs.uni-stuttgart.de> wrote:
> Hello,
>
> I wonder whether it would be worth enhancing max, min, argmax and argmin (more?) with a tie-breaking parameter. If multiple entries have the same value, the first one is currently returned. It would be useful to have a parameter that changes this behavior to arbitrary tie-breaking. I would propose that the tie-breaking function receives a list of all indices of the maxima/minima.
>
> Example:
>
> >>> a = np.array([1, 2, 5, 5, 2, 1])
> >>> np.argmax(a, tie_breaking=random.choice)
> 3
> >>> np.argmax(a, tie_breaking=random.choice)
> 2
> >>> np.argmax(a, tie_breaking=random.choice)
> 2
> >>> np.argmax(a, tie_breaking=random.choice)
> 2
> >>> np.argmax(a, tie_breaking=random.choice)
> 3
>
> Especially for some randomized experiments, it is necessary that a random optimum is returned rather than always the first maximum. Thus I end up writing these things over and over again.
>
> I understand that max and min are crucial functions that shouldn't be slowed down by the proposed changes. Adding new functions instead of altering the existing ones would be a good option.
>
> Are there any objections to my implementing these things and sending a pull request? Would such a function be better placed in scipy, for example?

On the whole, I think I would prefer new functions for this. I assume you only need variants for argmin() and argmax() and not min() and max(), since all of the tied values for the latter two would be identical, so returning the first one is just as good as any other.

--
Robert Kern

On 03/12/2015 02:42 PM, Robert Kern wrote:
> On the whole, I think I would prefer new functions for this. I assume you only need variants for argmin() and argmax() and not min() and max(), since all of the tied values for the latter two would be identical, so returning the first one is just as good as any other.

Is this such a common use case that it's worth a numpy function to replace one-liners like this?

    np.random.choice(np.where(a == a.max())[0])

It's also not that inefficient if the number of equal elements is not too large.

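As an illustration, the one-liner above could be wrapped for both argmax and argmin roughly as follows. The function names and the `rng` parameter are assumptions made for this sketch, not existing NumPy API; it uses `np.random.default_rng` (available in newer NumPy releases) so that a seed can be passed for reproducible experiments, and it assumes NaN-free input.

```python
import numpy as np


def random_argmax(a, rng=None):
    """Hypothetical wrapper: index of a maximum, ties broken uniformly at random."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    # One pass to find the maximum, one pass to collect the tied flat indices.
    return int(rng.choice(np.flatnonzero(a == a.max())))


def random_argmin(a, rng=None):
    """Hypothetical wrapper: index of a minimum, ties broken uniformly at random."""
    rng = np.random.default_rng() if rng is None else rng
    a = np.asarray(a)
    return int(rng.choice(np.flatnonzero(a == a.min())))


a = np.array([1, 2, 5, 5, 2, 1])
rng = np.random.default_rng(0)  # seeded only to make runs repeatable
draws = [random_argmax(a, rng) for _ in range(1000)]
# Count how often each tied index was picked; both should appear.
print({i: draws.count(i) for i in sorted(set(draws))})
```

The cost is two passes over the array plus a temporary boolean mask, which matches the remark that the approach is acceptable when the number of tied elements is small.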
participants (3)
- Johannes Kulick
- Julian Taylor
- Robert Kern