[Python-ideas] statistics module in Python3.4

Wolfgang Maier wolfgang.maier at biologie.uni-freiburg.de
Sat Feb 1 22:24:58 CET 2014


Oscar Benjamin <oscar.j.benjamin at ...> writes:
> 
> On 1 February 2014 20:10, Wolfgang Maier
> <wolfgang.maier at ...> wrote:
> > Greg Ewing <greg.ewing <at> ...> writes:
> >
> >>
> >> Wolfgang Maier wrote:
> >> > Mappings may be an excellent way of specifying frequencies and
weights in an
> >> > elegant way.
> >>
> >> That may be, but I think I'd rather have a separate
> >> function, or a mode argument, explicitly indicating
> >> that this is what you want. Detecting whether something
> >> is a mapping in a duck-typed way is dodgy in general.
> >>
> >
> > There should be nothing dodgy about this with the abstract base class
Mapping.
> 
> I agree with Greg about this. I dislike APIs that try to be too clever
> about accepting different types of input. The mode() function is
> clearly intended to accept an iterable not a Counter and I consider it
> a bug that it does. It can be fixed to treat a Counter as an iterable
> by changing the _counts function to do
> 
>     collections.Counter(data)
> 
> to
> 
>     collections.Counter(iter(data))
> 
> If you want the option for doing statistics with Counter style data
> formats then they should be invoked explicitly as Greg says.
> 

I would accept this as a bug-fix, but I do not agree with you and Greg about
an API 
trying to be too clever here. A Mapping is a clearly defined term and if the
module 
doc stated that for Mappings passed to functions in statistics their values
will be 
interpreted as weights/frequencies of the corresponding keys that should be 
clear enough. If what you really want is to do statistics just on the keys,
you can 
easily pass just these (e.g., mean(mydict.keys()).
On the other hand, separate functions would complicate the API because you 
would end up with a weighted counterpart for almost every function in the 
module. The alternative of passing weights in the form of a second sequence 
sounds more attractive, but I guess both specifications could coexist
peacefully 
(just like there are several ways of constructing a dictionary). Then if you
have 
data and weights in a Mapping already, you can just go ahead and use it, and

likewise, if you have them in two separate sequences.

Best,
Wolfgang



More information about the Python-ideas mailing list