[Python-ideas] a sorting protocol dunder method?

Antoine Pitrou solipsis at pitrou.net
Mon Dec 4 09:10:38 EST 2017


On Mon, 4 Dec 2017 15:50:31 +0200
Serhiy Storchaka <storchaka at gmail.com>
wrote:
> 
> >> Yes, it is too special-case. I don't see any advantages in comparison
> >> with defining the __lt__ method.  
> > 
> > There are definitely advantages.  Sorting calls __lt__ for each
> > comparison (that is, O(n log n) times) while __key__ would only be
> > called once per item at the start (that is, O(n) times).  
> 
> It will call __lt__ for each key comparison (the same O(n log n) times), 
> but *in addition* it will call __key__ O(n) times. You can get the 
> benefit only when times of calling __key__ is much smaller than the 
> difference between times of calling item's __lt__ and key's __lt__, and 
> maybe only for large lists.

Sure, that's the point: your non-trivial __key__ method reduces the
instance to e.g. a simple tuple or string, and then __lt__ over those
keys is cheap.

> But why not just pass the key argument when 
> you sort the large list?

For the same reason that you want __lt__ (or __eq__, or __hash__) to be
defined on the type, not call it manually every time you want to make a
comparison: because it's really a fundamental property of the type and
it "feels" wrong to have to pass it explicitly.

Also there are library routines which may sort implicitly their inputs
(such as pprint, IIRC, though perhaps pprint only sorts after calling
str() -- I haven't checked).

> If __key__ is consistent with __lt__, then we can just use __lt__, and 
> don't introduce new special methods.

This is ignoring all the other arguments...

> The decorator idea LGTM. But it doesn't need the special __key__ method. 
> Just pass the key function as a decorator argument.

I would find it cleaner to express it as a method in the class's body
("__key__" or anything else) rather than have to pass a function object.
Also, it's easier to unit-test if it officially exists as a method...

> This idea was proposed more than 3.5 years ago. Is there a PyPI package 
> that implements it?

I don't know.  I know I reimplemented such a thing for Numba (but of
course didn't benefit from automatic sort() support), because I needed
fast hashing and equality without implementing the corresponding
methods by hand every time (*). It would be a pity to depend on a
third-party package just for that.

(*) see
https://github.com/numba/numba/blob/master/numba/types/abstract.py#L88

Regards

Antoine.




More information about the Python-ideas mailing list