[Python-ideas] Pre-PEP: adding a statistics module to Python

Andres Osinski andres.osinski at gmail.com
Sun Aug 4 07:48:57 CEST 2013


Could not care less so long as there is consistency.


On Sun, Aug 4, 2013 at 12:25 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

>
> On 4 Aug 2013 12:44, "Joshua Landau" <joshua at landau.ws> wrote:
> >
> > On 4 August 2013 03:00, Eli Bendersky <eliben at gmail.com> wrote:
> >>
> >> On Sat, Aug 3, 2013 at 12:47 PM, Alexander Belopolsky
> >> <alexander.belopolsky at gmail.com> wrote:
> >> >
> >> > On Fri, Aug 2, 2013 at 1:45 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> >> >>
> >> >> I have raised an issue on the tracker to add a statistics module to
> >> >> Python's standard library:
> >> >>
> >> >> http://bugs.python.org/issue18606
> >> >>
> >> >> and have been asked to write a PEP. Attached is my draft PEP.
> >> >> Feedback is requested, thanks in advance.
> >> >
> >> >
> >> > The PEP does not mention statistics.sum(), but the reference
> >> > implementation includes it.  I am not sure stdlib needs a third sum
> >> > function after builtins.sum and math.fsum.  I think it will be better
> >> > to improve builtins.sum instead.
> >>
> >> While I'm somewhat -0.5 on the general idea of the statistics module
> >> (competing with well-established, super-optimized and
> >> by-themselves-famous numeric libraries Python has does not sound like
> >> a worthy goal),
> >
> >
> > I don't believe it is, in the general case. This is for those cases
> > where you might reach for numpy only with reluctance, or even be forced
> > to roll your own. Numpy is a beast that some people, me included, haven't
> > needed to learn, yet statistics often come in useful in a lot of
> > algorithms. Not to mention the full third-of-a-second lag to import
> > numpy ;).
> >
> >>
> >> I have to agree with Alexander w.r.t. "sum". Strongly
> >> -1 from me on having functions with the same name as existing stdlib
> >> functions but different functionality. This is very much unpythonic.
> >
> >
> > I don't agree that this is a segregation that has to happen, but I agree
> > that it's not something that stdlib does AFAIK. I think that's a tradition
> > worth keeping. Additionally it's not immediately obvious to any newcomer
> > why statistics.sum is implemented differently to builtins.sum - this should
> > be made evident from the name (akin to fsum).
> >
> > statistics.sum is a statistical sum of numeric data optimised to be
> > correct. builtins.sum is, as far as the user can tell, just iterated
> > addition. They both have their place but they're different places and it
> > should be more immediately obvious where.
> >
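
A minimal sketch of the distinction drawn above, using math.fsum as a
stand-in for an accuracy-focused sum (the reference implementation of
statistics.sum is not shown in this thread, so this is not its code). The
helper iterated_sum is a hypothetical name; it spells out plain
left-to-right addition so the result does not depend on how builtins.sum
happens to be implemented.

```python
import math


def iterated_sum(values):
    """Plain left-to-right float addition (hypothetical helper)."""
    total = 0.0
    for value in values:
        total += value
    return total


# Rounding error accumulates with repeated addition of 0.1.
tenths = [0.1] * 10
naive_tenths = iterated_sum(tenths)
accurate_tenths = math.fsum(tenths)

# The small term is swallowed by the huge one before it cancels out.
cancelling = [1e100, 1.0, -1e100]
naive_cancel = iterated_sum(cancelling)
accurate_cancel = math.fsum(cancelling)

print(naive_tenths, accurate_tenths)
print(naive_cancel, accurate_cancel)
```

Both inputs have an obvious "right" answer (1.0 in each case), which only
the fsum call returns here.
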
> > Finally -- do we need math.fsum¹ if we have statistics.sum?
>
> Right, statistics.sum should be seen as a more obvious replacement for
> math.fsum, rather than replacing the builtin sum. (However, it may make
> sense for statistics.sum to use math.fsum internally).
>
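
One way the "use math.fsum internally" idea could look, purely as a sketch:
the name stats_sum and the dispatch rule below are assumptions made for
illustration, not taken from the PEP draft or its reference implementation.
Floats go through math.fsum; anything else is added exactly as Fractions.

```python
import math
from fractions import Fraction


def stats_sum(data):
    """Hypothetical accuracy-first sum (not the PEP's implementation)."""
    values = list(data)
    if all(isinstance(x, float) for x in values):
        # All floats: delegate to math.fsum for an accurately rounded result.
        return math.fsum(values)
    # Mixed or non-float input: exact rational arithmetic, returns a Fraction.
    return sum((Fraction(x) for x in values), Fraction(0))


float_total = stats_sum([0.1] * 10)
exact_total = stats_sum([1, Fraction(1, 3), Fraction(2, 3)])
```

The return type differs between the two branches, which is exactly the
kind of behaviour a FAQ entry would need to spell out.
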
> A pre-emptive FAQ answer may also be appropriate.
>
> Cheers,
> Nick.
>
> >
> > ¹ I just noticed fsum says "a float is required" when given invalid data
> > despite accepting generic numerics.
> >
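
A small check of the footnote's observation, as a sketch: math.fsum accepts
values that can be converted to float (Fraction and Decimal here) and
raises TypeError for data it cannot convert. The exact wording of the error
message is not asserted, since it may differ between Python versions.

```python
import math
from decimal import Decimal
from fractions import Fraction

# Non-float numerics are accepted; each item is converted to float first.
mixed = [Fraction(1, 2), Decimal("0.25"), 1]
mixed_total = math.fsum(mixed)

# Non-numeric data is rejected with a TypeError.
try:
    math.fsum(["not a number"])
except TypeError as exc:
    rejected = True
    message = str(exc)
else:
    rejected = False
    message = ""

print(mixed_total, rejected, message)
```
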
> > _______________________________________________
> > Python-ideas mailing list
> > Python-ideas at python.org
> > http://mail.python.org/mailman/listinfo/python-ideas
> >
>
>


-- 
Andrés Osinski
http://www.andresosinski.com.ar/