[... apologies if this is dup, got a bounce ...]

> [David Mertz <mertz@gnosis.cx>]

>> I have to say though that the existing behavior of `statistics.median[_low|_high|]`

>> is SURPRISING if not outright wrong. It is the behavior in existing Python,

>> but it is very strange.

>>

>> The implementation simply does whatever `sorted()` does, which is an

>> implementation detail. In particular, NaN's being neither less than nor

>> greater than any floating point number, just stay where they are during

>> sorting.

>

> I expect you inferred that from staring at a handful of examples, but

> it's illusion. Python's sort uses only __lt__ comparisons, and if

> those don't implement a total ordering then _nothing_ is defined about

> sort's result (beyond that it's some permutation of the original

> list).

Thanks, Tim, for clarifying. Is it even the case that sorts are STABLE in

the face of non-total orderings under __lt__? A couple of quick examples

don't refute that, but what I tried was not very thorough, nor did I

think much about TimSort itself.
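
To make the "non-total ordering" point concrete, here is a minimal sketch. The helper name `satisfies_trichotomy` is my own, not anything in the stdlib, and it only checks one property of a total order (exactly one of a < b, b < a, a == b holds for every pair). It says nothing about stability; it only illustrates why sort's precondition fails once a NaN is present.

```python
import math

nan = math.nan

# Every ordering or equality comparison involving NaN is False.
comparisons = [nan < 1.0, 1.0 < nan, nan < nan, nan == nan]


def satisfies_trichotomy(values):
    """Hypothetical helper: True if, for every pair (a, b) in values,
    exactly one of a < b, b < a, a == b holds."""
    return all(
        (a < b) + (b < a) + (a == b) == 1
        for a in values
        for b in values
    )


print(comparisons)
print(satisfies_trichotomy([1, 2, 3.5]))
print(satisfies_trichotomy([1.0, nan]))
```

With a NaN in the list, none of the three relations holds between NaN and anything else (itself included), so the check fails.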

> So, certainly, if you want median to be predictable in the presence of

> NaNs, sort's behavior in the presence of NaNs can't be relied on in

> any respect.

Playing with Tim's examples suggests that statistics.median() is

simply outright WRONG. I can think of absolutely no way to characterize

these as reasonable results:

Python 3.7.1 | packaged by conda-forge | (default, Nov 13 2018, 09:50:42)

In [4]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4, 5])

Out[4]: 1

In [5]: statistics.median([9, 9, 9, nan, 1, 2, 3, 4])

Out[5]: nan
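
For comparison, one possible workaround is to drop NaNs before calling the median. This is only a sketch of one policy (ignore NaNs), not what the statistics module does or a proposal for what it should do; `median_ignoring_nan` is a made-up name, and it assumes every element is a real number that `math.isnan()` accepts.

```python
import math
import statistics


def median_ignoring_nan(data):
    """Hypothetical helper: median of the non-NaN values in data.

    Raises statistics.StatisticsError if nothing is left after filtering,
    because statistics.median() rejects empty input.
    """
    clean = [x for x in data if not math.isnan(x)]
    return statistics.median(clean)


nan = math.nan
print(median_ignoring_nan([9, 9, 9, nan, 1, 2, 3, 4, 5]))
print(median_ignoring_nan([9, 9, 9, nan, 1, 2, 3, 4]))
```

With the NaN removed, the inputs sort to [1, 2, 3, 4, 5, 9, 9, 9] and [1, 2, 3, 4, 9, 9, 9], so the results are 4.5 and 4, independent of where the NaN sat in the original list.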
