[Python-ideas] Adding quantile to the statistics module
steve at pearwood.info
Mon Mar 19 07:36:09 EDT 2018
On Fri, Mar 16, 2018 at 01:36:31PM +0900, Stephen J. Turnbull wrote:
> Steven D'Aprano writes:
> > Indeed. I've been considering quantiles and quartiles for a long time,
> > and I've found at least ten different definitions for quantiles and
> > sixteen for quartiles.
> I'd like to see your list written up.
On checking my notes, there is considerable overlap in the numbers above
(some calculation methods are equivalent to others) but overall I find
a total of 16 distinct methods in use. Some of those are only suitable
for generating quartiles.
This should not be considered an exhaustive list; I may have missed
some. Additions and corrections will be welcomed :-)
My major sources are Hyndman & Fan, and Langford.
Langford concentrates on methods of calculating quartiles, while Hyndman
& Fan consider more general quantile methods. Obviously if you have a
general quantile method, you can use it to calculate quartiles.
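As a minimal sketch of that last point, here is one general quantile
function with quartiles derived from it. It assumes H&F method 7 (linear
interpolation between order statistics, the one matched to Excel's
QUARTILE later in this post); the names quantile_hf7 and quartiles are
illustrative only, not a proposed API.

```python
import math

def quantile_hf7(data, p):
    """Quantile by linear interpolation between order statistics
    (assumed here to be H&F method 7)."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    xs = sorted(data)
    if not xs:
        raise ValueError("no data")
    h = (len(xs) - 1) * p            # fractional 0-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)    # stay inside the data at p == 1
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

def quartiles(data, quantile=quantile_hf7):
    """Quartiles are just the 0.25, 0.5 and 0.75 quantiles."""
    return tuple(quantile(data, p) for p in (0.25, 0.5, 0.75))

print(quartiles([1, 2, 3, 4]))
```

Any other general quantile function with the same (data, p) signature
can be passed as the quantile argument.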
I have compiled a summary in the following table. Reading across each
row gives the (usually numeric) label or parameter used to specify a
calculation method. Entries in the same column are the same calculation
method regardless of the label.
For example, what Hyndman & Fan call method 1, Langford calls method 15,
and the SAS software uses a parameter of 3. The Excel QUARTILE function
is equivalent to what H&F call method 7 and what Langford calls 12.
You will need to use a monospaced font for the columns to line up.
H&F 1 2 3 4 5 6 7 8 9
Langford 15 4 14 13 10 11 12 1 2 5 6 9
Excel 2010+ QE QI
Maple 1 2 3 4 5 6 7 8
Mathematica AQ MQ
R 1 2 3 4 5 6 7 8 9
SAS 3 5 2 1 4
TI calc X
X Only calculation method used by the software.
Q Excel QUARTILE function (pre 2010)
QE Excel QUARTILE.EXC function
QI Excel QUARTILE and QUARTILE.INC functions
AQ Mathematica AsymmetricQuartiles function
MQ Mathematica Quartiles function
Langford's 3 and 7 (not shown) are the same as his 1;
his 8 (not shown) is the same as his 2.
Hyndman & Fan recommend method 8 as the best method for general
quantiles.
Langford (who has certainly read H&F) recommends his method 4, which is
H&F's method 2, as the standard quartile. That is the same as the
default used by SAS.
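For comparison, a rough sketch of the two recommended methods. It
assumes the usual H&F formulation: method 2 inverts the empirical CDF
and averages at the steps, while method 8 interpolates at position
n*p + (p + 1)/3. The function names and the clamping helper are
illustrative, not part of any library.

```python
import math

def _order_stat(xs, k):
    """1-based order statistic, clamped to the ends of the sorted data."""
    return xs[min(max(k, 1), len(xs)) - 1]

def quantile_hf2(data, p):
    """Assumed H&F method 2: inverse empirical CDF, averaging at steps."""
    xs = sorted(data)
    if not xs:
        raise ValueError("no data")
    h = len(xs) * p
    j = math.floor(h)
    if h == j:
        # Exactly on a step (exact for quartiles; for other p this
        # float comparison may need a tolerance).
        return (_order_stat(xs, j) + _order_stat(xs, j + 1)) / 2
    return _order_stat(xs, j + 1)

def quantile_hf8(data, p):
    """Assumed H&F method 8: interpolate at position n*p + (p + 1)/3."""
    xs = sorted(data)
    if not xs:
        raise ValueError("no data")
    h = len(xs) * p + (p + 1) / 3
    j = math.floor(h)
    g = h - j                        # interpolation weight
    return (1 - g) * _order_stat(xs, j) + g * _order_stat(xs, j + 1)

data = [1, 2, 3, 4, 5, 6, 7, 8]
print(quantile_hf2(data, 0.25), quantile_hf8(data, 0.25))
```

On small samples the two can differ noticeably, which is one reason the
choice of method matters.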
For what it's worth, the method taught in Australian high schools for
calculating quartiles and interquartile range is Langford's method 2.
That's the method that Texas Instruments calculators use.
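A minimal sketch of that school method, assuming it is the "median of
each half" rule in which the overall median is left out of both halves
when the number of data points is odd. The function name is
illustrative only.

```python
from statistics import median

def quartiles_exclusive_halves(data):
    """Quartiles as medians of the lower and upper halves, leaving the
    overall median out of both halves when n is odd (assumed here to be
    Langford's method 2)."""
    xs = sorted(data)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = xs[:half]        # values below the median position
    upper = xs[n - half:]    # values above it; middle value skipped if n is odd
    return median(lower), median(xs), median(upper)

q1, q2, q3 = quartiles_exclusive_halves([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(q1, q2, q3, "IQR =", q3 - q1)
```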
I haven't personally confirmed all of the software equivalences; in
particular, I'm a bit dubious about the Maple methods. If anyone has
access to Maple and doesn't mind running a few sample calculations for
me, please contact me off-list.