On Mon, Dec 04, 2017 at 01:52:19PM +0100, Antoine Pitrou wrote:
On Mon, 4 Dec 2017 23:16:11 +1100 Steven D'Aprano
wrote: On Mon, Dec 04, 2017 at 12:06:38PM +0100, Antoine Pitrou wrote:
There are definitely advantages. Sorting calls __lt__ for each comparison (that is, O(n log n) times) while __key__ would only be called once per item at the start (that is, O(n) times).
Passing a key function doesn't magically turn a O(n log n) comparison sort into a O(n) sort.
Where did I say it did?
See the text from you quoted above. You said there are "definitely [performance] advantages" by using a key function. You then compare: - calling __lt__ O(n log n) times, versus - calling the key function O(n) times. This is a classic "apples versus oranges" comparison. You compare *actually sorting the list* with *not sorting the list* and conclude that they key function provides a performance advantage. Yes, the key function gets called O(n) times. And that's not enough to sort the list, you still have to actually sort, exactly as I said. So where do these performance advantages come from? As I said in another post, the overhead of decorating the list with the key function makes it rather unlikely that this will be faster than just sorting it. It can happen, if key(x).__lt__ is sufficiently faster than x.__lt__, but that would be unusual.
Once you've called the key function every time, you still have to *actually sort*, which will be up to O(n log n) calls to __lt__ on whatever __key__ returned.
Yes... and the whole point is for __key__ to return something which is very cheap to compare, such that there are O(n) expensive calls to __key__ and O(n log n) cheap calls to __lt__, rather than O(n log n) expensive calls to __lt__.
Read Chris' post again. The point he was making is that the class might only define __lt__ in order to support sorting, and if we allow it to define a key function instead the class can avoid adding __lt__ at all. There's no requirement or expectation that __lt__ is expensive. If you can order the values using a cheap method and an expensive method, why would you define __lt__ to use the expensive method instead of the cheap method? The point is to avoid defining __lt__ at all, and still support sorting. But even if we define both... what makes you think that x.__lt__ is expensive (if it exists) and key(x).__lt__ is cheap? It might be the other way. If they are different, there's no guarantee about which is cheaper. If they are the same, then one is redundant.
If __key__ is inconsistent with __lt__, it is the same error as making __lt__ inconsistent with __gt__.
You seem to be assuming that __key__ is another way of spelling __lt__, rather than being a key function.
It isn't.
Right -- you've now clarified your position. Thank you. It wasn't clear from your earlier posts.
In any case, it isn't an error for __lt__ to be inconsistent with __gt__. [...] Not all values have total ordering.
As usual you should try to understand what people are trying to say instead of imagining they are incompetent.
Instead of getting your nose out of joint and accusing me of "imagining [you] are incompetent", and assuming that I didn't "try to understand what people are trying to say", how about *you* do the same? Don't assume I'm an idiot too stupid to understand your perfectly clear words, rather consider the possibility that maybe I'm reading and responding to you in good faith, but I'm not a mind-reader. If you failed to get your message across, perhaps the fault lies in your post, not my reading comprehension. In any case, whoever is to blame for the misunderstanding, the only one who can do anything about it is the writer. The writer should take responsibility for not being clear enough, rather than blaming the reader. [...]
Defining a single comparison method is not enough to define the rest.
How about you stick to the discussion? I'm talking about deriving comparison methods from __key__, not from another comparison method. Defining __key__ is definitely enough to define all comparison methods.
Here's a key function I've used: def key(string): return string.strip().casefold() Convert to a key method: def __key__(self): return self.strip().casefold() Is it your idea to define __lt__ and others like this? def __lt__(self, other): # ignoring the possibility of returning NotImplemented return self.__key__() < self.__key__() Fair enough. I'm not convinced that's going to offer definite performance advantages, but like the total_ordering decorator, presumably if we're using this, performance is secondary to convenience. Nor do I think this decorator needs to take an implicit dunder method, when it can take an explicit key function. -- Steve