Re: [Pythonideas] NAN handling in the statistics module
[David Mertz <david.mertz@gmail.com>]
Thanks Tim for clarifying. Is it even the case that sorts are STABLE in the face of nontotal orderings under __lt__? A couple quick examples don't refute that, but what I tried was not very thorough, nor did I think much about TimSort itself.
I'm not clear on what "stable" could mean in the absence of a total ordering. Not only does sort not assume __lt__ is a total ordering, it doesn't assume it's transitive, or even deterministic. We really can't assume anything about potentially userdefined functions. What sort does guarantee is that the result list is some permutation of the input list, regardless of how insanely __lt__ may behave. If __lt__ sanely defines a deterministic total order, then "stable" and "sorted" are guaranteed too, with their obvious meanings.
OK, let me be more precise. Obviously if the implementation in a class is: class Foo: def __lt__(self, other): return random.random() < 0.5 Then we aren't going to rely on much. * If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes) * If the pairwise comparisons are deterministic, is sorting idempotent? This statement is certainly false: * If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list. On Sun, Jan 6, 2019 at 11:09 PM Tim Peters <tim.peters@gmail.com> wrote:
[David Mertz <david.mertz@gmail.com>]
Thanks Tim for clarifying. Is it even the case that sorts are STABLE in the face of nontotal orderings under __lt__? A couple quick examples don't refute that, but what I tried was not very thorough, nor did I think much about TimSort itself.
I'm not clear on what "stable" could mean in the absence of a total ordering. Not only does sort not assume __lt__ is a total ordering, it doesn't assume it's transitive, or even deterministic. We really can't assume anything about potentially userdefined functions.
What sort does guarantee is that the result list is some permutation of the input list, regardless of how insanely __lt__ may behave. If __lt__ sanely defines a deterministic total order, then "stable" and "sorted" are guaranteed too, with their obvious meanings.
 Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
This statement is certainly false:
* If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list.
Just to demonstrate this obviousness:
sorted([9, 9, 9, b, 1, 2, 3, a]) [1, 2, 3, A, B, 9, 9, 9] sorted([9, 9, 9, a, 1, 2, 3, b]) [B, 9, 9, 9, A, 1, 2, 3] a == b True
The classes involved are: class A: def __lt__(self, other): return False __gt__ = __lt__ def __eq__(self, other): return True def __repr__(self): return self.__class__.__name__ class B(A): def __lt__(self, other): return True __gt__ = __lt__ I do not think these are useful, but __lt__ is deterministic here.  Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.
On Mon, Jan 7, 2019 at 3:19 PM David Mertz <mertz@gnosis.cx> wrote:
OK, let me be more precise. Obviously if the implementation in a class is:
class Foo: def __lt__(self, other): return random.random() < 0.5
Then we aren't going to rely on much.
* If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes)
If you guarantee that exactly one of "x < y" and "y < x" is true for any given pair of values from the list, and further guarantee that if x < y and y < z then x < z, you have a total order. Without those two guarantees, you could have deterministic comparisons (eg "nan < 5" is always false, but so is "5 < nan"), but there's no way to truly put the elements "in order". Defining __lt__ as "rock < paper", "paper < scissors", "scissors < rock" means that you can't guarantee the sort order, nor determinism. Are those guarantees safe for your purposes? If so, sort() is, AIUI, guaranteed to behave sanely. ChrisA
[David Mertz <mertz@gnosis.cx>]
OK, let me be more precise. Obviously if the implementation in a class is:
class Foo: def __lt__(self, other): return random.random() < 0.5
Then we aren't going to rely on much.
* If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes)
Yes, but not defined unless __lt__ also defines a total ordering.
* If the pairwise comparisons are deterministic, is sorting idempotent?
Not necessarily. For example, the 2element list here swaps its elements every time `.sort()` is invoked, because the second element always claims it's "less than" the first element, regardless of which order they're in: class RelentlesslyTiny: def __init__(self, name): self.name = name def __repr__(self): return self.name def __lt__(self, other): return self is not other a = RelentlesslyTiny("A") b = RelentlesslyTiny("B") xs = [a, b] print(xs) xs.sort() print("after sorting once", xs) xs.sort() print("after sorting twice", xs) [A, B] after sorting once [B, A] after sorting twice [A, B]
This statement is certainly false:
* If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list.
What I said at the start ;) The only thing .sort() always guarantees regardless of how goofy __lt__ may be is that the result list will be some permutation of the input list. This is so even if __lt__ raises an uncaught exception, killing the sort midstream.
participants (3)

Chris Angelico

David Mertz

Tim Peters