On Thu, Mar 05, 2020 at 08:23:22AM -0500, Richard Damon wrote:
> Yes, that is the idea of AlmostTotalOrder, to have algorithms that really require a total order (like sorting)
Sorting doesn't require a total order. Sorting only requires a weak order where the only operator required is the "comes before" operator, or less than. That's precisely how sorting in Python is implemented.

Here is an interesting discussion of a practical use-case of sorting data with a partial order:

https://blog.thecybershadow.net/2018/11/18/d-compilation-is-too-slow-and-i-a...
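To illustrate the point about less-than being the only operator required, here is a minimal sketch. The `Card` class is a hypothetical type invented for this example; it defines `__lt__` and nothing else, and `sorted()` still accepts it.

```python
class Card:
    """Hypothetical example type that supports only the < operator."""

    def __init__(self, rank):
        self.rank = rank

    def __lt__(self, other):
        # The single comparison that sorting relies on.
        return self.rank < other.rank

    def __repr__(self):
        return f"Card({self.rank})"


cards = [Card(3), Card(1), Card(2)]
ordered = sorted(cards)          # works with __lt__ alone
ranks = [card.rank for card in ordered]
print(ranks)
```

Asking for `Card(1) >= Card(2)` raises TypeError, since neither `__ge__` nor `__le__` is defined, yet the sort above is unaffected.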
> but we really need to use a type that has these exceptional values. Imagine that sort/median was defined to type check its parameter,
No need to imagine it, sort already type-checks its arguments:

    py> sorted([1, 3, 5, "Hello", 2])
    TypeError: '<' not supported between instances of 'str' and 'int'
> and that meant that you couldn't take the median of a list of floats (because float has the NaN value that breaks TotalOrder).
Dealing with NANs depends on what you want to do with the data. If you are sorting for presentation purposes, what you probably want is to sort with a custom key that pushes all the NANs to the front (or rear) of the list.

If you are sorting for the purposes of calculating the median, it depends. There are at least three reasonable strategies for median:

- ignore the NANs;
- return a NAN;
- raise an exception.

Personally, I think that the first is by far the most practical: if you have NANs in your statistical data, that's probably because they've come from some other library or application that is using them to represent missing values, and if that's the case, the right thing to do is to ignore them.

-- 
Steven
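The options discussed above (a sort key that pushes NANs to the rear, and the three median strategies) could be sketched roughly as follows. The names `nan_last_key`, `median_with_nans` and the `policy` parameter are hypothetical, made up for this illustration; they are not part of the `statistics` module or of any proposal in this thread.

```python
import math
import statistics


def nan_last_key(x):
    """Sort key (hypothetical helper) that places NANs after all other floats."""
    is_nan = math.isnan(x)
    # Tuples compare element-wise, and False sorts before True.
    # NANs get a dummy 0.0 so that no comparison ever involves a NAN.
    return (is_nan, 0.0 if is_nan else x)


def median_with_nans(data, policy="ignore"):
    """Median with an explicit NAN policy: "ignore", "return" or "raise"."""
    if policy not in ("ignore", "return", "raise"):
        raise ValueError(f"unknown policy: {policy!r}")
    values = list(data)
    if any(math.isnan(x) for x in values):
        if policy == "raise":
            raise ValueError("data contains a NAN")
        if policy == "return":
            return math.nan
        # policy == "ignore": treat NANs as missing values and drop them
        values = [x for x in values if not math.isnan(x)]
    return statistics.median(values)


data = [3.0, math.nan, 1.0, 2.0]
presentation = sorted(data, key=nan_last_key)
ignored = median_with_nans(data)                 # NANs dropped first
returned = median_with_nans(data, "return")      # NAN propagates
```

With the "ignore" policy, data consisting only of NANs ends up as an empty list, so `statistics.median` raises `StatisticsError`; whether that is the desired behaviour is a separate design decision.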