[Numpy-discussion] Proposal for IQR function

Joseph Fox-Rabinovitz jfoxrabinovitz at gmail.com
Thu Jan 28 14:00:25 EST 2016


I have created an IQR function to add to the other dispersion metrics
such as standard deviation. I have described the purpose and nature of
the proposal in PR#7137, so I am pasting the text here as well:

Motivation
----------
This function is used in one place in numpy already (to compute the
Freedman-Diaconis histogram bin estimator) in addition to being
requested on Stack Overflow a couple of times:

  - http://stackoverflow.com/questions/23228244/how-do-you-find-the-iqr-in-numpy
  - http://stackoverflow.com/questions/27472330/how-should-the-interquartile-range-be-calculated-in-python

It is also used in matplotlib for box and violin plots:
http://matplotlib.org/faq/howto_faq.html#interpreting-box-plots-and-violin-plots.
It is a very simple, common and robust dispersion estimator. There
does not appear to be an implementation for it anywhere in numpy or
scipy.

About
---------
This function is a convenience combination of `np.percentile` and
`np.subtract`. As such, it allows the difference between any two
percentiles to be computed, not only the default (25, 75). All of the
recent enhancements to percentile are used.
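
As a rough illustration of the idea (not the code in the PR), the
combination described above could be sketched as follows. The name
`iqr_sketch` and the parameter `rng` are hypothetical and chosen only
for this example; the only library calls used are `np.percentile` and
`np.subtract`.

```python
import numpy as np

def iqr_sketch(a, rng=(25, 75), axis=None):
    """Hypothetical helper: difference between two percentiles of `a`.

    `rng` is an assumed parameter name for the (lower, upper)
    percentile pair; it defaults to the quartiles (25, 75).
    """
    # Passing a sequence as q makes np.percentile return the results
    # stacked along the first axis, so they unpack into two values.
    lower, upper = np.percentile(a, rng, axis=axis)
    return np.subtract(upper, lower)

data = np.arange(1, 10)                  # 1, 2, ..., 9
print(iqr_sketch(data))                  # interquartile range
print(iqr_sketch(data, rng=(10, 90)))    # any other percentile pair
```

Because `axis` is passed straight through to `np.percentile`, the same
sketch also works along one axis of a multi-dimensional array.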

The documentation and tests are borrowed heavily from those of
`np.percentile`.

Wikipedia Reference: https://en.wikipedia.org/wiki/Interquartile_range

Note
----------
The tests will not pass until the bug-fix for `np.percentile` kwarg
`interpolation='midpoint'` (#7129) is incorporated and this PR is
rebased.


