[Python-ideas] Users of statistics software, what quantile functionality would be useful for you?

Guido van Rossum guido at python.org
Sun Apr 28 11:33:57 EDT 2019


On Sat, Apr 27, 2019 at 7:51 AM Steven D'Aprano <steve at pearwood.info> wrote:

> The statistics module is soon to get a quantile function.
>
> For those of you who use statistics software (whether in Python, or
> using other languages) and specifically use quantiles, what sort of
> functionality would be useful to you?
>
> For example:
>
> - evenly-spaced quantiles (say, at 0.25, 0.5, 0.75)?
> - unevenly-spaced quantiles (0.25, 0.8, 0.9, 0.995)?
> - one quantile at a time?
> - any specific definition?
> - quantiles of a distribution?
> - anything else?
>

The stats that are pored over by my team every week are running times of
mypy in various configurations. We currently show p25, p50, p75, p90, p95
and p99. We currently use the following definition:

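from typing import List

def pick(data: List[float], fraction: float) -> float:
    # `data` must be sorted and non-empty; `fraction` lies in [0, 1].
    index = int(len(data) * fraction)
    before = data[max(0, index - 1)]         # clamp at the first element
    after = data[min(len(data) - 1, index)]  # clamp at the last element
    return (before + after) / 2.0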

where `data` is a sorted list. Essentially we average the two values
nearest the cutoff point; at either end of the list the index is clamped,
so the result is simply the first or last value. (I think we could do
better, but this is the code I found in our repo. :-)
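
To make the behaviour concrete, here is a small sketch that runs the same
definition over a made-up sample. The ten values in `sample` are purely
illustrative (they are not real mypy timings), and the function is repeated
only so the snippet runs on its own.

```python
from typing import List


def pick(data: List[float], fraction: float) -> float:
    """Average the two values nearest the cutoff in sorted `data`."""
    index = int(len(data) * fraction)
    before = data[max(0, index - 1)]
    after = data[min(len(data) - 1, index)]
    return (before + after) / 2.0


# Hypothetical sample, already sorted: 1.0, 2.0, ..., 10.0
sample = [float(n) for n in range(1, 11)]

# The percentiles mentioned above: p25, p50, p75, p90, p95 and p99.
for fraction in (0.25, 0.50, 0.75, 0.90, 0.95, 0.99):
    print(f"p{int(fraction * 100)}: {pick(sample, fraction)}")

# Edge cases: the index is clamped, so both neighbours are the same element.
lowest = pick(sample, 0.0)   # 1.0, the first value
highest = pick(sample, 1.0)  # 10.0, the last value
```

With only ten points, p90, p95 and p99 all land on the same pair of values,
which is one reason a different definition might do better on small samples.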

-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him/his **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>