[Python-ideas] Make max() stable
Andrew Barnert
abarnert at yahoo.com
Sat Jan 18 01:09:55 CET 2014
On Jan 17, 2014, at 15:34, Terry Reedy <tjreedy at udel.edu> wrote:
> On 1/17/2014 8:45 AM, אלעזר wrote:
>> Hi all,
>>
>> Given several objects with the same key, max() returns the first one:
>>
>> >>> key = lambda x: 0
>> >>> max(1, 2, key=key)
>> 1
>
> As documented: "If multiple items are maximal, the function returns the first one encountered."
>
>> This means it is not stable, at least according to the definition in
>> "Elements of Programming" by Alexander Stepanov and Paul McJones (pg. 52):
>>
>> "Informally, an algorithm is stable if it respects the original order of
>> equivalent objects.
>
> Min and max are inherently functions of multisets, with order irrelevant but duplicate values allowed.
I'm not sure that's necessarily true. The maximal value of a sequence makes every bit as much sense as the maximal value of a set or multiset, and I think it comes up quite often. For example, if I have a series of experiments at different times, the (time, value) pairs have an obvious meaningful order, and asking for max(experiments, key=itemgetter(1)) is a meaningful thing to do.
But often, even for a sequence, you don't care which max you get.
And often, when you do care, you explicitly want the first. The high score on a video game belongs to the first person who reached that score, not to someone who later tied him.
Sure, _sometimes_ you want the last rather than the first. I've actually written variants of max, nlargest, groupby, etc. that track the last value instead of the first (e.g., to make groupby treat adjacent runs as a group) multiple times. But I wouldn't expect that to be the default behavior of any of those functions.
So, I think the existing design of all these functions is less surprising than the alternative, and adequately documented, and it's easy enough to write the alternative when you need it.
> In any case, changing the definition and implementation of max will break any code that depends on returning first versus last.
This is about as perfect a reason as possible. If it ever matters, we can't change it; if it never matters, we have no reason to change it...
More information about the Python-ideas
mailing list