On Mon, Jan 7, 2019 at 6:50 AM Steven D'Aprano <steve@pearwood.info> wrote:
> > I'll provide a suggested patch on the bug.  It will simply be a wholly
> > different implementation of median and friends.
>
> I ask for a documentation patch and you start talking about a whole new
> implementation. Huh.
> A new implementation with precisely the same behaviour is a waste of
> time, so I presume you're planning to change the behaviour. How about if
> you start off by explaining what the new semantics are?

I think it would be counter-productive to document the bug (as something other than a bug).  Picking what is a completely arbitrary element in the face of a non-total order can never be "correct" behavior, and is never worth preserving for compatibility.  I think the use of statistics.median against partially ordered elements is simply rare enough that no one tripped over it, or at least no one reported it before.
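
To make "completely arbitrary" concrete, here is a small sketch (not taken from the bug report). It assumes CPython, where statistics.median sorts its input with sorted() and every "<" comparison involving NaN is False, so the same three values can give three different medians depending only on where the NaN sits:

```python
import math
import statistics

nan = float("nan")

# Every "<" comparison involving NaN is False, so sorted() treats each of
# these lists as already ordered and leaves the NaN where it was.
first = statistics.median([nan, 1.0, 2.0])
middle = statistics.median([1.0, nan, 2.0])
last = statistics.median([1.0, 2.0, nan])

# Same multiset of values, yet the "median" follows the NaN's position.
print(first, middle, last)
```

Under those assumptions the three results are 1.0, nan and 2.0 respectively.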

Notice that the code itself pretty much recognizes the bug in this comment:

# FIXME: investigate ways to calculate medians without sorting? Quickselect?
 
So it seems like the original author knew the implementation was wrong.  But you're right, the new behavior needs to be decided.  Propagating NaNs is reasonable.  Filtering out NaNs is reasonable.  Those are the default behaviors of NumPy and Pandas, respectively:

import numpy as np
import pandas as pd
np.median([1, 2, 3, np.nan])  # -> nan
pd.Series([1, 2, 3, np.nan]).median()  # -> 2.0

(Yes, of course there are ways in each to get the other behavior).  Other non-Python tools similarly suggest one of those behaviors, but really nothing else.

So yeah, what I was suggesting as a patch was an implementation that had PROPAGATE and IGNORE semantics.  I don't have a real opinion about which should be the default, but the current behavior should simply not exist at all.  As I think about it, warnings and exceptions are really too complex an API for this module.  It's not hard to manually check for NaNs and generate those in your own code.
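
A minimal sketch of what those two semantics could look like, written as a wrapper around the existing function rather than as the proposed patch itself. The names NanPolicy and nan_aware_median are hypothetical, chosen only for illustration, and NaNs are detected with the "x != x" idiom:

```python
import math
import statistics
from enum import Enum


class NanPolicy(Enum):
    """Hypothetical policy names, for illustration only."""
    PROPAGATE = "propagate"
    IGNORE = "ignore"


def nan_aware_median(data, nan_policy=NanPolicy.PROPAGATE):
    """Median with explicit NaN handling (a sketch, not the stdlib API)."""
    values = list(data)
    # A NaN is the only kind of value that compares unequal to itself.
    nans = [x for x in values if x != x]
    if nans:
        if nan_policy is NanPolicy.PROPAGATE:
            # Any NaN in the input makes the result NaN.
            return nans[0]
        # IGNORE: drop the NaNs and take the median of what remains.
        values = [x for x in values if x == x]
    return statistics.median(values)


nan = float("nan")
propagated = nan_aware_median([1, 2, 3, nan])
ignored = nan_aware_median([1, 2, 3, nan], NanPolicy.IGNORE)
```

Here propagated is NaN and ignored is 2, matching the NumPy and Pandas defaults shown earlier. With IGNORE and an all-NaN input, the sketch simply lets statistics.median raise StatisticsError for empty data. A caller who wants a warning or an exception instead can run the same "x != x" check before calling the function.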

--
Keeping medicines from the bloodstreams of the sick; food
from the bellies of the hungry; books from the hands of the
uneducated; technology from the underdeveloped; and putting
advocates of freedom in prisons.  Intellectual property is
to the 21st century what the slave trade was to the 16th.