[docs] Make the documentation for statistics' data argument clearer. (issue 27825)

steve+python at pearwood.info
Fri Oct 7 20:20:06 EDT 2016


Thanks for the patch, Mariatta. The addition of "a sequence or iterator"
is good.

I dislike the median_grouped() example and would like to see one which
uses more plausible data values.


http://bugs.python.org/review/27825/diff/18782/Doc/library/statistics.rst
File Doc/library/statistics.rst (right):

http://bugs.python.org/review/27825/diff/18782/Doc/library/statistics.rst#newcode241
Doc/library/statistics.rst:241: >>> median_grouped([F(55, 7), F(52, 3), F(53, 3), F(5, 3)])
I dislike this example as it is misleading.

median_grouped() will work with arbitrary numbers, as in this example,
but the intention is that the numbers are the midpoints of data classes,
often called "class marks". So the data should be equally spaced values
(median_grouped() assumes a class interval of 1 by default, but that can
be changed). Examples should use equally spaced values (possibly with
gaps -- it is okay if a class has no values).
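
To make that concrete, here is a rough sketch of the kind of example I
have in mind. The data are made up for illustration: ages recorded as
class marks of 10-year classes, with one empty class.

```python
from statistics import median_grouped

# Hypothetical grouped data: each value is the class mark (midpoint)
# of a 10-year class: 15-25, 25-35, 35-45, 45-55, 55-65.
# The 45-55 class (mark 50) has no values, which is okay.
ages = [20, 30, 30, 30, 30, 40, 60]

# The class interval is 10 here, not the default of 1.
result = median_grouped(ages, interval=10)
print(result)  # 31.25
```

The median falls in the 25-35 class, and the result is interpolated
within that class rather than being one of the class marks.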

Although median_grouped() will work with Fractions, I think it is
unlikely that anyone would give it actual data in Fractions. I'd prefer
to see more realistic examples: ints, floats or Decimals.
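
Something along these lines, perhaps. This is only a sketch, and the
data values are invented; the point is that each list is equally spaced
with the default class interval of 1.

```python
from decimal import Decimal
from statistics import median_grouped

# Invented data, equally spaced class marks with the default interval of 1.
int_marks = [52, 52, 53, 54]
float_marks = [1.5, 2.5, 2.5, 3.5]
decimal_marks = [Decimal("1.5"), Decimal("2.5"), Decimal("2.5"), Decimal("3.5")]

int_result = median_grouped(int_marks)
float_result = median_grouped(float_marks)
decimal_result = median_grouped(decimal_marks)

print(int_result)      # 52.5
print(float_result)    # 2.5
print(decimal_result)  # 2.5
```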

Perhaps the documentation needs to define the relevant statistical
terminology: "class interval", "class mark" and "class limits"? See:

http://mathworld.wolfram.com/ClassMark.html
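
As a rough illustration of how the three terms relate, as I understand
the usual definitions: the class mark is the midpoint of a class, the
class interval is its width, and the class limits are its two ends. The
helper class_limits() below is hypothetical and not part of the
statistics module.

```python
def class_limits(mark, interval=1):
    """Return the (lower, upper) limits of the class centred on mark.

    Hypothetical helper for illustration only.
    """
    half = interval / 2
    return (mark - half, mark + half)

# Class marks for 10-unit-wide classes.
interval = 10
marks = [20, 30, 40, 50, 60]
limits = [class_limits(m, interval) for m in marks]

print(limits[0])  # (15.0, 25.0)
print(limits[1])  # (25.0, 35.0)
```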

http://bugs.python.org/review/27825/

