[Python-ideas] Pre-PEP: adding a statistics module to Python

Oscar Benjamin oscar.j.benjamin at gmail.com
Wed Aug 7 13:14:31 CEST 2013


On Aug 7, 2013 2:35 AM, "Steven D'Aprano" <steve at pearwood.info> wrote:
>
> On 07/08/13 01:49, Oscar Benjamin wrote:
>
>> Taking the example from the PEP:
>>
>> >>> from statistics import *
>> >>> data = [1, 2, 4, 5, 8]
>> >>> data = [x+1e12 for x in data]
>> >>> variance(data)
>> 7.5
>>
>> However:
>>
>> >>> variance(iter(data))
>> 7.4999542236328125
>>
>> Okay so that's a small difference and it's unlikely to upset many
>> people. But being something of a numerical obsessive I do often get
>> upset about things like this. It's not that I mind the size of the
>> error but rather that I dislike having the calculation implicitly
>> changed. I want to think that it doesn't matter whether I pass an
>> iterator or a list because either I get an error or I get the same
>> result.
>
>
> That's fantastic feedback and exactly the sort of thing I want to hear :-)
>
> This is mentioned under "Design Decisions" in the PEP, and treated as a
> feature, but I'm open to revising that behaviour. 3.4 feature-freeze is
> quite close, and I don't want to hold up acceptance of the PEP (which
> doesn't even have a number yet!) for one-pass stats calculations. So I'm
> going to take this approach:
>
> - The difference between variance(list(data)) and variance(iter(data)) is
>   an artifact of implementation, not a feature, so is subject to change.
>
> - I doubt I will reject iterators, but I may internally convert them to
>   lists (median already does this).
>
> - For the time being, all documentation examples will only show lists
>   being used.
>
> - I will defer for 3.5 a set of one-pass functions that return running
>   statistics (I already have code for coroutines to do this, but they're not
>   ready for the std lib).
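
To make the "convert iterators to lists" option concrete, here is a minimal sketch of what I have in mind. It is not the PEP's reference implementation; the name two_pass_variance is made up for illustration. The only point is that once the input is copied to a list, a list and an iterator go through exactly the same two-pass calculation.

```python
def two_pass_variance(data):
    """Sample variance via a two-pass calculation (hypothetical helper).

    Any iterable is first copied to a list, so passing an iterator
    cannot silently switch to a different algorithm.
    """
    values = list(data)
    n = len(values)
    if n < 2:
        raise ValueError("variance requires at least two data points")
    mean = sum(values) / n                      # first pass: the mean
    ss = sum((x - mean) ** 2 for x in values)   # second pass: squared deviations
    return ss / (n - 1)


data = [x + 1e12 for x in [1, 2, 4, 5, 8]]
from_list = two_pass_variance(data)
from_iter = two_pass_variance(iter(data))
```

With this arrangement from_list and from_iter are the same float by construction. The cost is the memory needed to hold the copy.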

Sounds good to me!
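
For the deferred running-statistics functions, I imagine something along these lines. This is only a sketch under my own assumptions: it uses the well-known Welford update inside a generator-based coroutine, and the name running_variance is hypothetical. I have not seen your coroutine code, so it may look quite different.

```python
def running_variance():
    """Coroutine sketch: send() a value, receive the sample variance so far.

    Yields None until at least two values have been sent.
    """
    n = 0
    mean = 0.0
    m2 = 0.0          # running sum of squared deviations from the mean
    variance = None
    while True:
        x = yield variance
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)   # Welford's one-pass update
        if n > 1:
            variance = m2 / (n - 1)


rv = running_variance()
next(rv)                            # prime the coroutine
latest = None
for x in [1, 2, 4, 5, 8]:
    latest = rv.send(x)
```

Keeping this as a separate, explicitly one-pass API means the caller chooses the algorithm instead of having it chosen by the type of the argument.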

Oscar