<div dir="ltr"><div><div>Hahaha, that Hyndman story will never get old. <br><br>FWIW, based on much informal polling, the most common intuition on the topic stems from elementary education: the median of a set with an even number of values is the mean of the two central values. So a linearly weighted average at discontinuities seems to be the least surprising.<br><br></div>Whichever type is chosen, quantiles are often computed in sets: for instance, min/max/median, quartiles (+ interquartile range), and percentiles. Quantiles were one of the main reasons statsutils uses a class[1] to wrap datasets. Otherwise, there's a lot of work in re-sorting. All the galloping in the world isn't going to beat sorting once. :)<br><br></div><div>Other calculations benefit from this cached approach, too. Stddev is faster to calculate after calculating variance, for instance, but if memory serves, quantiles are the most expensive for mid-sized datasets that don't call for pandas/numpy.<br><br>[1]: <a href="http://boltons.readthedocs.io/en/latest/statsutils.html#boltons.statsutils.Stats">http://boltons.readthedocs.io/en/latest/statsutils.html#boltons.statsutils.Stats</a><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Mar 17, 2018 at 9:28 AM, Tim Peters <span dir="ltr">&lt;<a href="mailto:tim.peters@gmail.com" target="_blank">tim.peters@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">[Guido]<br>

<span class="">> Since Python is not held to backwards compatibility with S, and for most<br>

> datasets (and users) it doesn't matter much, why not go with the default<br>

> recommended by Hyndman & Fan?<br>

<br>

</span>Here's Hyndman in 2016[1]:<br>

<br>

"""<br>

The main point of our paper was that statistical software should<br>

standardize the definition of a sample quantile for consistency. We<br>

listed 9 different methods that we found in various software packages,<br>

and argued for one of them (type 8). In that sense, the paper was a<br>

complete failure. No major software uses type 8 by default, and the<br>

diversity of definitions continues 20 years later. In fact, the paper<br>

may have had the opposite effect to what was intended. We drew<br>

attention to the many approaches to computing sample quantiles and<br>

several software products added them all as options. Our own quantile<br>

function for R allows all 9 to be computed, and has type 7 as default<br>

(for backwards consistency – the price we had to pay to get R core to<br>

agree to include our function).<br>

"""<br>

<br>

Familiar & hilarious ;-)<br>

<br>

[1] <a href="https://robjhyndman.com/hyndsight/sample-quantiles-20-years-later/" rel="noreferrer" target="_blank">https://robjhyndman.com/<wbr>hyndsight/sample-quantiles-20-<wbr>years-later/</a><br>

<div class="HOEnZb"><div class="h5">

</div></div></blockquote></div><br></div>
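<div dir="ltr">For illustration, the sort-once idea from the reply above can be sketched roughly as follows. This is a minimal sketch, not the boltons implementation: the class name <code>SortedStats</code> and its methods are hypothetical, and the interpolation rule is assumed to be the linear one (Hyndman &amp; Fan type 7), which gives the "mean of the two central values" median for an even number of values.<br></div>

```python
import math


class SortedStats:
    """Hypothetical wrapper: sort the data once, then answer many quantile queries."""

    def __init__(self, data):
        # The only sort; every later query reuses this cached list.
        self._sorted = sorted(data)
        if not self._sorted:
            raise ValueError("need at least one data point")

    def quantile(self, q):
        """Linear interpolation between the two closest ranks (assumed type 7)."""
        if not 0.0 <= q <= 1.0:
            raise ValueError("q must be between 0 and 1")
        data = self._sorted
        pos = q * (len(data) - 1)  # fractional 0-based index into the sorted data
        lo = math.floor(pos)
        hi = math.ceil(pos)
        if lo == hi:
            return data[lo]
        # Weight the two neighbours by how far pos sits between them.
        return data[lo] + (data[hi] - data[lo]) * (pos - lo)

    def quartiles(self):
        return tuple(self.quantile(q) for q in (0.25, 0.5, 0.75))

    def iqr(self):
        q1, _, q3 = self.quartiles()
        return q3 - q1


stats = SortedStats([4, 1, 3, 2])
median = stats.quantile(0.5)   # mean of the two central values, 2 and 3
spread = stats.iqr()           # reuses the same sorted list, no re-sort
```

<div dir="ltr">Each query after construction is a constant-time lookup, so asking for a whole set of quantiles costs one sort in total rather than one sort per quantile.<br></div>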