Re: [Python-ideas] NAN handling in the statistics module

[David Mertz <david.mertz@gmail.com>]
I'm not clear on what "stable" could mean in the absence of a total ordering. Not only does sort not assume __lt__ is a total ordering, it doesn't assume it's transitive, or even deterministic. We really can't assume anything about potentially user-defined functions. What sort does guarantee is that the result list is some permutation of the input list, regardless of how insanely __lt__ may behave. If __lt__ sanely defines a deterministic total order, then "stable" and "sorted" are guaranteed too, with their obvious meanings.

OK, let me be more precise. Obviously if the implementation in a class is: class Foo: def __lt__(self, other): return random.random() < 0.5 Then we aren't going to rely on much. * If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes) * If the pairwise comparisons are deterministic, is sorting idempotent? This statement is certainly false: * If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list. On Sun, Jan 6, 2019 at 11:09 PM Tim Peters <tim.peters@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

This statement is certainly false:
* If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list.
Just to demonstrate this obviousness:
The classes involved are: class A: def __lt__(self, other): return False __gt__ = __lt__ def __eq__(self, other): return True def __repr__(self): return self.__class__.__name__ class B(A): def __lt__(self, other): return True __gt__ = __lt__ I do not think these are useful, but __lt__ is deterministic here. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Mon, Jan 7, 2019 at 3:19 PM David Mertz <mertz@gnosis.cx> wrote:
If you guarantee that exactly one of "x < y" and "y < x" is true for any given pair of values from the list, and further guarantee that if x < y and y < z then x < z, you have a total order. Without those two guarantees, you could have deterministic comparisons (eg "nan < 5" is always false, but so is "5 < nan"), but there's no way to truly put the elements "in order". Defining __lt__ as "rock < paper", "paper < scissors", "scissors < rock" means that you can't guarantee the sort order, nor determinism. Are those guarantees safe for your purposes? If so, sort() is, AIUI, guaranteed to behave sanely. ChrisA

[David Mertz <mertz@gnosis.cx>]
Yes, but not defined unless __lt__ also defines a total ordering.
* If the pairwise comparisons are deterministic, is sorting idempotent?
Not necessarily. For example, the 2-element list here swaps its elements every time `.sort()` is invoked, because the second element always claims it's "less than" the first element, regardless of which order they're in: class RelentlesslyTiny: def __init__(self, name): self.name = name def __repr__(self): return self.name def __lt__(self, other): return self is not other a = RelentlesslyTiny("A") b = RelentlesslyTiny("B") xs = [a, b] print(xs) xs.sort() print("after sorting once", xs) xs.sort() print("after sorting twice", xs) [A, B] after sorting once [B, A] after sorting twice [A, B]
What I said at the start ;-) The only thing .sort() always guarantees regardless of how goofy __lt__ may be is that the result list will be some permutation of the input list. This is so even if __lt__ raises an uncaught exception, killing the sort mid-stream.

OK, let me be more precise. Obviously if the implementation in a class is: class Foo: def __lt__(self, other): return random.random() < 0.5 Then we aren't going to rely on much. * If comparison of any two items in a list (under __lt__) is deterministic, is the resulting sort order deterministic? (Pretty sure this is a yes) * If the pairwise comparisons are deterministic, is sorting idempotent? This statement is certainly false: * If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list. On Sun, Jan 6, 2019 at 11:09 PM Tim Peters <tim.peters@gmail.com> wrote:
-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

This statement is certainly false:
* If two items are equal, and pairwise inequality is deterministic, exchanging the items does not affect the sorting of other items in the list.
Just to demonstrate this obviousness:
The classes involved are: class A: def __lt__(self, other): return False __gt__ = __lt__ def __eq__(self, other): return True def __repr__(self): return self.__class__.__name__ class B(A): def __lt__(self, other): return True __gt__ = __lt__ I do not think these are useful, but __lt__ is deterministic here. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Mon, Jan 7, 2019 at 3:19 PM David Mertz <mertz@gnosis.cx> wrote:
If you guarantee that exactly one of "x < y" and "y < x" is true for any given pair of values from the list, and further guarantee that if x < y and y < z then x < z, you have a total order. Without those two guarantees, you could have deterministic comparisons (eg "nan < 5" is always false, but so is "5 < nan"), but there's no way to truly put the elements "in order". Defining __lt__ as "rock < paper", "paper < scissors", "scissors < rock" means that you can't guarantee the sort order, nor determinism. Are those guarantees safe for your purposes? If so, sort() is, AIUI, guaranteed to behave sanely. ChrisA

[David Mertz <mertz@gnosis.cx>]
Yes, but not defined unless __lt__ also defines a total ordering.
* If the pairwise comparisons are deterministic, is sorting idempotent?
Not necessarily. For example, the 2-element list here swaps its elements every time `.sort()` is invoked, because the second element always claims it's "less than" the first element, regardless of which order they're in: class RelentlesslyTiny: def __init__(self, name): self.name = name def __repr__(self): return self.name def __lt__(self, other): return self is not other a = RelentlesslyTiny("A") b = RelentlesslyTiny("B") xs = [a, b] print(xs) xs.sort() print("after sorting once", xs) xs.sort() print("after sorting twice", xs) [A, B] after sorting once [B, A] after sorting twice [A, B]
What I said at the start ;-) The only thing .sort() always guarantees regardless of how goofy __lt__ may be is that the result list will be some permutation of the input list. This is so even if __lt__ raises an uncaught exception, killing the sort mid-stream.
participants (3)
-
Chris Angelico
-
David Mertz
-
Tim Peters