[Python-ideas] a sorting protocol dunder method?

Steven D'Aprano steve at pearwood.info
Mon Dec 4 07:16:11 EST 2017


On Mon, Dec 04, 2017 at 12:06:38PM +0100, Antoine Pitrou wrote:

> There are definitely advantages.  Sorting calls __lt__ for each
> comparison (that is, O(n log n) times) while __key__ would only be
> called once per item at the start (that is, O(n) times).

Passing a key function doesn't magically turn a O(n log n) comparison 
sort into a O(n) sort.

Once you've called the key function every time, you still have to 
*actually sort*, which will be up to O(n log n) calls to __lt__ on 
whatever __key__ returned. The key function is just a built-in version 
of the old "DSU" (Decorate-Sort-Undecorate) idiom:


    values = [(key(x), x) for x in values]
    values.sort()
    values = [t[1] for t in values]


If you want this functionality, go right ahead and give your class a 
sortkey method, and then pass that as the explicit key function to 
sorted:

    sorted(collection_of_Spam, key=Spam.sortkey)

That is nice and explicit, and the method only gets looked up once. 
It works in any version of Python, it is fully backwards-compatible, 
and it requires no support from the interpreter.

I still think this is a poor object design, putting the key function in 
the object being sorted rather than in the report doing the sorting, but 
so long as this isn't blessed by the language as the One Obvious Way I 
don't mind so much.



> > It will be rather confusing if 
> > different methods of sorting produce inconsistent order.
> 
> If __key__ is inconsistent with __lt__, it is the same error as making
> __lt__ inconsistent with __gt__.

You seem to be assuming that __key__ is another way of spelling __lt__, 
rather than being a key function. If it does the same thing as __lt__, 
it is a comparison method, not a key function, and the name is horribly 
misleading.

In any case, it isn't an error for __lt__ to be inconsistent with 
__gt__.


py> {1, 2, 3, 4} < {2, 3, 4, 5}
False
py> {1, 2, 3, 4} > {2, 3, 4, 5}
False


Not all values have total ordering.


> And there could be a decorator that generates all comparison methods
> from __key__.

Defining a single comparison method is not enough to define the rest. 
You need __eq__ and one comparison method. (Technically we could use 
__ne__ and one other, but __eq__ is usual.

But why not just define __lt__ and use total_ordering, instead of 
defining two identical decorators that differ only in the name of the 
dunder method they use?

The purpose of __key__ was supposed to be to eliminate the need to 
define __lt__. It seems ridiculous to then use __key__ to define __lt__ 
when the whole point of __key__ is to avoid needing to define __lt__.



-- 
Steve


More information about the Python-ideas mailing list