Here is an implementation that:
A. Only relies on '<' for ordering (plus '==' to spot exact ties)
B. Has no special knowledge of NaN (see the note right after this list)
C. Does not use sorted() or .sort() for anything
D. Is Pandas-like in choosing among only comparable items
E. Only picks an element from the original iterable; it does not take the mean of two candidates
(I.e. kinda like statistics.median_low() without the NaN bug)
F. Is barely tested and not at all benchmarked. I'm sure Timsort is way faster than this kinda-like-quickselect approach.
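
A quick note on points B and D (my addition, not from the original post): no NaN special-casing is needed because every comparison against NaN is False, so a NaN item fails all three tests in the partition loop below and is simply skipped, which is also what makes the selection Pandas-like.

>>> from math import nan
>>> nan < 3, 3 < nan, nan == 3
(False, False, False)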
>>> import statistics
>>> from math import nan
>>> even_names = ['Bob', 'Alice', 'Charlie', 'Zack', 'Xavier', 'Mary']
>>> odd_names = even_names + ['Francis']
>>> nums = [2, 3, 4, nan, 1]
>>> %paste
def median(it, *, _offset=0):
    # Partition the remaining items around the first element (the candidate),
    # using only '<' (and '==' for ties).  Anything that fails all three
    # tests, e.g. NaN, falls through and is silently ignored.
    smalls = []
    bigs = []
    go_big = False
    it = iter(it)
    candidate = next(it)
    for x in it:
        if x < candidate:
            smalls.append(x)
        elif candidate < x:
            bigs.append(x)
        elif candidate == x:
            # Alternate sides for duplicates so ties don't skew the balance.
            if go_big:
                bigs.append(x)
            else:
                smalls.append(x)
            go_big = not go_big
    # _offset accounts for elements discarded at earlier recursion levels;
    # the candidate is the answer once the two sides balance to within one.
    if abs(len(smalls) - len(bigs) + _offset) <= 1:
        return candidate
    elif len(smalls) >= len(bigs):
        # The median lies among the smalls; carry min(bigs) along as the
        # boundary element so it can still be chosen.
        offset = -len(bigs)
        if bigs:
            smalls.append(min(bigs))
        return median(smalls, _offset=offset)
    else:
        # The median lies among the bigs; carry max(smalls) along likewise.
        offset = len(smalls)
        if smalls:
            bigs.append(max(smalls))
        return median(bigs, _offset=offset)
## -- End pasted text --
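
For anyone reading along, here is my reading of the recursion (a hand trace, so treat it as a sketch rather than anything the author wrote): _offset remembers how many elements were discarded on the lighter side at earlier levels, and min(bigs) / max(smalls) is carried into the recursive call so the boundary element next to the candidate is never lost.

# Hand trace of median([5, 1, 2, 3, 4]) (my sketch, not from the post):
#   level 0: candidate=5, smalls=[1, 2, 3, 4], bigs=[]   unbalanced, recurse on smalls, _offset=0
#   level 1: candidate=1, smalls=[], bigs=[2, 3, 4]      unbalanced, recurse on bigs, _offset=0
#   level 2: candidate=2, smalls=[], bigs=[3, 4]         unbalanced, recurse on bigs, _offset=0
#   level 3: candidate=3, smalls=[], bigs=[4]            abs(0 - 1 + 0) <= 1, return 3
print(median([5, 1, 2, 3, 4]))   # prints 3 per the trace above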
>>> statistics.median_low(odd_names), median(odd_names)
('Francis', 'Francis')
>>> statistics.median_low(even_names), median(even_names)
('Charlie', 'Charlie')
>>> statistics.median_low(nums), median(nums)
(4, 2)
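
One more sanity check of my own (hand-traced, I have not actually run it): the value that comes back is always an element of the input, and any number of NaNs gets ignored.

from math import nan

data = [2.5, nan, 1.5, nan, 3.5, 0.5]
result = median(data)
assert result in data            # property E: an actual input element, no averaging
assert result in (1.5, 2.5)      # one of the two middle comparable values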