[David Mertz <mertz@gnosis.cx>]
I think consistent NaN-poisoning would be excellent behavior. It will always make sense for median (and its variants).
statistics.mode([2, 2, nan, nan, nan]) nan statistics.mode([2, 2, inf - inf, inf - inf, inf - inf]) 2
But in the mode case, I'm not sure we should ALWAYS treat a NaN as poisoning the result.
I am: I thought about the following but didn't write about it because it's too strained to be of actual sane use ;-)
If NaN means "missing value" then sometimes it could change things, ?and we shouldn't guess. But what if it cannot?
>>> statistics.mode([9, 9, 9, 9, nan1, nan2, nan3])
No matter what missing value we take those nans to maybe-possibly represent, 9 is still the most common element. This is only true when the most common thing occurs at least as often as the 2nd most common thing PLUS the number of all NaNs. But in that case, 9 really is the mode.
See "too strained" above. It's equally true that, e.g., the _median_ of your list above: [9, 9, 9, 9, nan1, nan2, nan3] is also 9 regardless of what values are plugged in for the nans. That may be easier to realize at first with a simpler list, like [5, 5, nan] It sounds essentially useless to me, just theoretically possible to make a mess of implementations to cater to. "The right" (obvious, unsurprising, useful, easy to implement, easy to understand) non-exceptional behavior in the presence of NaNs is to pretend they weren't in the list to begin with. But I'd rather ;people ask for that _if_ that's what they want.