[Python-Dev] PEP 450 adding statistics module

Oscar Benjamin oscar.j.benjamin at gmail.com
Mon Sep 16 18:45:45 CEST 2013


On 16 September 2013 16:42, Guido van Rossum <guido at python.org> wrote:
> I'm ready to accept this PEP. Because I haven't read this entire thread (and
> 60 messages about random diversions is really too much to try and catch up
> on) I'll give people 24 hours to remind me of outstanding rejections.
>
> I also haven't reviewed the code in any detail, but I believe the code
> review is going well, so I'm not concerned that the PEP would have to be
> revised based on that alone.

I think Steven has addressed all of the issues raised. Briefly from memory:

1) There was concern about having an additional sum function. Steven
has pointed out that neither sum nor fsum is accurate for all stdlib
numeric types, which is the intention for the statistics module. It is
not possible to modify either of them in a backward compatible way
that would make them suitable here.

2) The initial names for the median functions were median.low,
median.high etc. This naming scheme was considered non-standard by
some, so the functions have been renamed median_low, median_high etc.
(there was also discussion about the method used to attach the names
to the median function, but this became irrelevant after the rename).

3) The mode function also provided an algorithm for estimating the
mode of a continuous probability distribution from a sample. It was
suggested that there is no uniquely good way of doing this and that it
is not commonly needed. This was removed and the API for mode() was
simplified (it now returns a unique mode or raises an error).

4) Some of the functions (e.g. variance) used different algorithms
(and produced different results) when given an iterator instead of a
collection. These are now changed to always use the same algorithm and
build a collection internally if necessary.

5) It was suggested that it should also be possible to compute the
mean of e.g. timedelta objects but it was pointed out that they can be
converted to numbers with the timedelta.total_seconds() method.

6) I raised an issue about the way the sum function behaved for
decimals but this was changed in a subsequent patch presenting a new
sum function that isn't susceptible to accumulated rounding errors
with Decimals.
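
As a rough illustration of points 1 and 5, here is a small sketch using
only the existing builtin sum and math.fsum. It is not Steven's code and
does not use the proposed module. The sample values and variable names
are made up for the example, and the plain fsum-based mean at the end
only stands in for whatever mean function the module provides.

```python
from datetime import timedelta
from decimal import Decimal
from fractions import Fraction
from math import fsum

# Point 1: builtin sum() accumulates float rounding error,
# while fsum() returns the correctly rounded float result.
floats = [0.1] * 10
naive_total = sum(floats)      # 0.9999999999999999
careful_total = fsum(floats)   # 1.0

# fsum() converts every item to float, so exact types such as
# Fraction and Decimal lose their exactness (and their type).
fracs = [Fraction(1, 3)] * 3
exact_total = sum(fracs)       # Fraction(1, 1), still exact
coerced_total = fsum(fracs)    # a float, no longer a Fraction

decs = [Decimal("0.1")] * 10
decimal_total = sum(decs)      # Decimal('1.0')
decimal_as_float = fsum(decs)  # a float, no longer a Decimal

# Point 5: timedelta objects can be averaged by converting them
# to seconds first and converting the result back afterwards.
durations = [timedelta(minutes=1), timedelta(minutes=2), timedelta(minutes=6)]
seconds = [d.total_seconds() for d in durations]
mean_duration = timedelta(seconds=fsum(seconds) / len(seconds))
```

So sum keeps the type but is inexact for floats, and fsum is exact for
floats but throws away the type of everything else, which is why
neither can simply be reused here.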


Oscar

