<div dir="ltr"><div><div>Hahaha, that Hyndman story will never get old. <br><br>FWIW, based on much informal polling, the most common intuition on the topic stems from elementary education: the median of an even-sized set is the mean of the two central values. So, a linearly weighted average at the discontinuities seems to be the least surprising choice.<br><br></div>Whichever type is chosen, quantiles are often computed in sets: for instance, min/max/median, quartiles (+ interquartile range), and percentiles. Quantiles were one of the main reasons statsutils uses a class[1] to wrap datasets. Otherwise, there's a lot of work in re-sorting. All the galloping in the world isn't going to beat sorting once. :)<br><br></div><div>Other calculations benefit from this cached approach, too. Stddev is faster to calculate once variance has been calculated, for instance, but if memory serves, quantiles are the most expensive for mid-sized datasets that don't call for pandas/numpy.<br><br>[1]: <a href="http://boltons.readthedocs.io/en/latest/statsutils.html#boltons.statsutils.Stats">http://boltons.readthedocs.io/en/latest/statsutils.html#boltons.statsutils.Stats</a><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 17, 2018 at 9:28 AM, Tim Peters <span dir="ltr">&lt;<a href="mailto:tim.peters@gmail.com" target="_blank">tim.peters@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">[Guido]<br>
<span class="">> Since Python is not held to backwards compatibility with S, and for most<br>
> datasets (and users) it doesn't matter much, why not go with the default<br>
> recommended by Hyndman & Fan?<br>
<br>
</span>Here's Hyndman in 2016[1]:<br>
<br>
"""<br>
The main point of our paper was that statistical software should<br>
standardize the definition of a sample quantile for consistency. We<br>
listed 9 different methods that we found in various software packages,<br>
and argued for one of them (type 8). In that sense, the paper was a<br>
complete failure. No major software uses type 8 by default, and the<br>
diversity of definitions continues 20 years later. In fact, the paper<br>
may have had the opposite effect to what was intended. We drew<br>
attention to the many approaches to computing sample quantiles and<br>
several software products added them all as options. Our own quantile<br>
function for R allows all 9 to be computed, and has type 7 as default<br>
(for backwards consistency – the price we had to pay to get R core to<br>
agree to include our function).<br>
"""<br>
<br>
Familiar & hilarious ;-)<br>
<br>
[1] <a href="https://robjhyndman.com/hyndsight/sample-quantiles-20-years-later/" rel="noreferrer" target="_blank">https://robjhyndman.com/<wbr>hyndsight/sample-quantiles-20-<wbr>years-later/</a><br>
<div class="HOEnZb"><div class="h5">
</div></div></blockquote></div><br></div>
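<div dir="ltr">To make the "sort once" point concrete, here is a minimal sketch of a wrapper class that sorts on construction and then answers any number of quantile queries from the cached order. The class name <code>SortedStats</code> and its method are hypothetical (this is not the boltons <code>Stats</code> API), and the linear interpolation used is assumed to be the Hyndman and Fan type 7 rule, chosen only because it reproduces the schoolbook median.</div>

```python
import math


class SortedStats:
    """Hypothetical helper, not the boltons API: sort once, query many times."""

    def __init__(self, data):
        # The single O(n log n) sort that every later quantile call reuses.
        self._sorted = sorted(data)
        if not self._sorted:
            raise ValueError("need at least one data point")

    def quantile(self, q):
        """Linearly interpolate between the two nearest order statistics
        (assumed here: Hyndman and Fan type 7)."""
        # Clamp q into [0, 1] so out-of-range requests return min or max.
        q = min(max(q, 0.0), 1.0)
        pos = q * (len(self._sorted) - 1)  # fractional index into sorted data
        lo = math.floor(pos)
        hi = math.ceil(pos)
        frac = pos - lo
        return self._sorted[lo] * (1 - frac) + self._sorted[hi] * frac


stats = SortedStats([4, 1, 3, 2])
# Each call below is an index lookup plus arithmetic; nothing is re-sorted.
quartiles = [stats.quantile(q) for q in (0.25, 0.5, 0.75)]
median = stats.quantile(0.5)  # 2.5, the mean of the two central values
```

<div dir="ltr">With an even-sized dataset the median lands halfway between the two central values, which matches the elementary-school intuition mentioned above.</div>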