[issue20479] Efficiently support weight/frequency mappings in the statistics module

Oscar Benjamin report at bugs.python.org
Mon Feb 3 11:33:54 CET 2014


Oscar Benjamin added the comment:

> in my previous message. To support weights (float or Rational) this would have to be more sophisticated.

I guess you'd do:

     for x, w in data.items():
         T = _coerce_types(T, type(x))
         xn, xd = exact_ratio(x)
         wn, wd = exact_ratio(w)
         d = xd * wd  # denominator of the product x * w
         partials[d] = partials_get(d, 0) + xn * wn
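
For reference, here is a self-contained sketch of the same accumulation that can be run on its own. It is an illustration only: the name weighted_sum is hypothetical, and fractions.Fraction stands in for the exact_ratio and _coerce_types helpers used in the loop above, so the result is always a Fraction (or 0 for an empty mapping) rather than a coerced type T.

```python
from fractions import Fraction


def weighted_sum(data):
    """Exact sum of x * w over a {value: weight} mapping (hypothetical helper)."""
    partials = {}  # maps denominator -> accumulated numerator
    for x, w in data.items():
        fx, fw = Fraction(x), Fraction(w)  # exact for int, float and Fraction
        d = fx.denominator * fw.denominator  # denominator of the product x * w
        partials[d] = partials.get(d, 0) + fx.numerator * fw.numerator
    # Combine the per-denominator partial sums exactly.
    return sum(Fraction(n, d) for d, n in partials.items())


total = weighted_sum({1: 2, 0.5: 4, Fraction(1, 3): 3})
```

Grouping the numerators by denominator keeps the inner loop to integer arithmetic and defers the Fraction additions to the end.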

Variance is only slightly trickier. Median would be more complicated.
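
To make that concrete, a rough sketch of both follows. It assumes the mapping holds integer frequencies (value -> count), so the sample size is the sum of the counts. With float or Rational weights the right denominator for the sample variance is a separate question that this sketch does not address. The name median_low_map is hypothetical, and Fraction is used for exactness instead of the module's internal helpers.

```python
from fractions import Fraction


def variance_map(freq):
    """Sample variance of a {value: count} mapping, computed exactly."""
    n = sum(freq.values())  # total number of observations
    if n < 2:
        raise ValueError("variance_map requires at least two data points")
    mean = sum(Fraction(x) * c for x, c in freq.items()) / n
    ss = sum((Fraction(x) - mean) ** 2 * c for x, c in freq.items())
    return ss / (n - 1)


def median_low_map(freq):
    """Low median of a {value: count} mapping (hypothetical helper)."""
    n = sum(freq.values())
    if n < 1:
        raise ValueError("median_low_map requires at least one data point")
    target = (n + 1) // 2  # 1-based rank of the low median
    seen = 0
    for x in sorted(freq):  # the median needs the keys in sorted order
        seen += freq[x]
        if seen >= target:
            return x
```

The variance only needs two passes over the items, whereas the median has to sort the keys and walk the cumulative counts, which is where the extra complexity comes from.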

I just prefer to know, when I look at code, whether something is being
treated as a mapping or as an iterable. So when I look at

    d = f(x, y, z)
    v = variance_map(d)

It's immediately obvious what d is and how the function variance_map is using
it.

As well as the benefit of readability there's also the fact that accepting
different kinds of input puts strain on any attempt to modify your code in the
future. Auditing the code requires understanding at all times that the name
"data" is bound to a quantum superposition of different types of object.

Either every function would have to have the same "iterable or mapping"
interface or there would have to be some other convention for making it clear
which ones accept a mapping. Perhaps the functions that don't make sense for a
mapping could explicitly reject them rather than treating them as an iterable.

I just think it's simpler to have a different function name for each type of
input. Then it's clear what functions are available for working with mappings.

If you were going for something completely different then you could have an
object-oriented interface where there are classes for the different types of
data and methods that do the right thing in each case.

Then you would do

    v = WeightedData(d).variance()

The ordinary variance() function could just become a shortcut for

    def variance(data):
        return SequenceData(data).variance()
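
A very rough sketch of how such classes might fit together is below. The class names come from the examples above, but everything inside them is an assumption: the mapping is taken to hold integer frequencies, results are returned as Fraction, and SequenceData simply delegates to WeightedData via collections.Counter to keep the sketch short.

```python
from collections import Counter
from fractions import Fraction


class WeightedData:
    """Wraps a {value: count} mapping (integer frequencies assumed)."""

    def __init__(self, mapping):
        self._freq = dict(mapping)

    def variance(self):
        n = sum(self._freq.values())  # total number of observations
        if n < 2:
            raise ValueError("variance requires at least two data points")
        mean = sum(Fraction(x) * c for x, c in self._freq.items()) / n
        ss = sum((Fraction(x) - mean) ** 2 * c for x, c in self._freq.items())
        return ss / (n - 1)


class SequenceData:
    """Wraps a plain iterable of values."""

    def __init__(self, iterable):
        self._values = list(iterable)

    def variance(self):
        # Counting the values turns the sequence into a frequency mapping.
        return WeightedData(Counter(self._values)).variance()


def variance(data):
    """Shortcut for the common case of a plain iterable."""
    return SequenceData(data).variance()
```

With this layout the type of input is fixed by the class name at the call site, so each method only ever has to handle one kind of data.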

----------

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue20479>
_______________________________________

