[David Mertz david.mertz@gmail.com]
Thanks Tim for clarifying. Is it even the case that sorts are STABLE in the face of non-total orderings under __lt__? A couple quick examples don't refute that, but what I tried was not very thorough, nor did I think much about TimSort itself.
I'm not clear on what "stable" could mean in the absence of a total ordering. Not only does sort not assume __lt__ is a total ordering, it doesn't assume it's transitive, or even deterministic. We really can't assume anything about potentially user-defined functions.
What sort does guarantee is that the result list is some permutation of the input list, regardless of how insanely __lt__ may behave. If __lt__ sanely defines a deterministic total order, then "stable" and "sorted" are guaranteed too, with their obvious meanings.
OK, let me be more precise. Obviously if the implementation in a class is:
class Foo: def __lt__(self, other): return random.random() < 0.5
Then we aren't going to rely on much.
* If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes) * If the pairwise comparisons are deterministic, is sorting idempotent?
This statement is certainly false:
* If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list.
On Sun, Jan 6, 2019 at 11:09 PM Tim Peters tim.peters@gmail.com wrote:
[David Mertz david.mertz@gmail.com]
Thanks Tim for clarifying. Is it even the case that sorts are STABLE in the face of non-total orderings under __lt__? A couple quick examples don't refute that, but what I tried was not very thorough, nor did I think much about TimSort itself.
I'm not clear on what "stable" could mean in the absence of a total ordering. Not only does sort not assume __lt__ is a total ordering, it doesn't assume it's transitive, or even deterministic. We really can't assume anything about potentially user-defined functions.
What sort does guarantee is that the result list is some permutation of the input list, regardless of how insanely __lt__ may behave. If __lt__ sanely defines a deterministic total order, then "stable" and "sorted" are guaranteed too, with their obvious meanings.
This statement is certainly false:
- If two items are equal, and pairwise inequality is deterministic,
exchanging the items does not affect the sorting of other items in the list.
Just to demonstrate this obviousness:
sorted([9, 9, 9, b, 1, 2, 3, a])
[1, 2, 3, A, B, 9, 9, 9]
sorted([9, 9, 9, a, 1, 2, 3, b])
[B, 9, 9, 9, A, 1, 2, 3]
a == b
True
The classes involved are:
class A: def __lt__(self, other): return False __gt__ = __lt__ def __eq__(self, other): return True def __repr__(self): return self.__class__.__name__
class B(A): def __lt__(self, other): return True __gt__ = __lt__
I do not think these are useful, but __lt__ is deterministic here.
On Mon, Jan 7, 2019 at 3:19 PM David Mertz mertz@gnosis.cx wrote:
OK, let me be more precise. Obviously if the implementation in a class is:
class Foo: def __lt__(self, other): return random.random() < 0.5
Then we aren't going to rely on much.
- If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes)
If you guarantee that exactly one of "x < y" and "y < x" is true for any given pair of values from the list, and further guarantee that if x < y and y < z then x < z, you have a total order. Without those two guarantees, you could have deterministic comparisons (eg "nan < 5" is always false, but so is "5 < nan"), but there's no way to truly put the elements "in order". Defining __lt__ as "rock < paper", "paper < scissors", "scissors < rock" means that you can't guarantee the sort order, nor determinism.
Are those guarantees safe for your purposes? If so, sort() is, AIUI, guaranteed to behave sanely.
ChrisA
[David Mertz mertz@gnosis.cx]
OK, let me be more precise. Obviously if the implementation in a class is:
class Foo: def __lt__(self, other): return random.random() < 0.5
Then we aren't going to rely on much.
- If comparison of any two items in a list (under __lt__) is deterministic, is
the resulting sort order deterministic? (Pretty sure this is a yes)
Yes, but not defined unless __lt__ also defines a total ordering.
- If the pairwise comparisons are deterministic, is sorting idempotent?
Not necessarily. For example, the 2-element list here swaps its elements every time `.sort()` is invoked, because the second element always claims it's "less than" the first element, regardless of which order they're in:
class RelentlesslyTiny: def __init__(self, name): self.name = name def __repr__(self): return self.name def __lt__(self, other): return self is not other
a = RelentlesslyTiny("A") b = RelentlesslyTiny("B") xs = [a, b] print(xs) xs.sort() print("after sorting once", xs) xs.sort() print("after sorting twice", xs)
[A, B] after sorting once [B, A] after sorting twice [A, B]
This statement is certainly false:
- If two items are equal, and pairwise inequality is deterministic, exchanging
the items does not affect the sorting of other items in the list.
What I said at the start ;-) The only thing .sort() always guarantees regardless of how goofy __lt__ may be is that the result list will be some permutation of the input list. This is so even if __lt__ raises an uncaught exception, killing the sort mid-stream.