[Python-Dev] Accumulation module
Raymond Hettinger
python at rcn.com
Wed Jan 14 03:24:41 EST 2004
> > * What to call the module
[Aahz]
> stats
There is already a stat module. Any chance of confusion?
The other naming issue is that some of the functions have
non-statistical uses: product() is general purpose; nlargest() and
nsmallest() will accept any datatype (though most of the use cases are
with numbers). Are there other general purpose (non-statistical)
accumulation/reduction formulas that go here?
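To make the "general purpose" point concrete, here is a minimal sketch of what a product() could look like. The signature and the start argument are assumptions for illustration, not a settled API; the point is only that nothing in it is specific to statistics or even to floats.

```python
from fractions import Fraction
from functools import reduce
import operator


def product(iterable, start=1):
    """Multiply the items together, left to right, in a single pass.

    Hypothetical signature: `start` is an assumed parameter that is
    returned unchanged when the iterable is empty.
    """
    return reduce(operator.mul, iterable, start)


# Works for any type that supports `*`, not just statistical data.
print(product([1, 2, 3, 4]))
print(product([Fraction(1, 2), Fraction(2, 3)]))
```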
> > * What else should be in it?
[Matthias Klose]
> you may want to have a look at
> http://www.nmr.mgh.harvard.edu/Neural_Systems_Group/gary/python.html
Ages ago, when the idea for this module first arose, a certain bot
recommended strongly against including any but the most basic
statistical functions (muttering something about the near impossibility
of doing it well in either Python or portable C and something about not
wanting to maintain anything that wasn't dirt simple). His words would
have of course fallen on deaf ears, but a certain dictatorial type had
just finished teaching advanced programming skills to people who
couldn't operate a high school calculator. Sooooo, no Kurtosis for you,
no gamma function for me!
It's possible that chi-square or regression could slip in, but it would
require considerable cheerleading and a rare planetary alignment.
> > * What else should be in it?
[Jeremy]
> median()
> And a function like bins() or histogram() that accumulates
> the values in buckets of some size.
That sounds beginner-simple and reasonably useful, though it would have
been nice if all the reduction formulas could work in one pass and
never need to hold the whole dataset in memory.
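For what it's worth, the bucket-counting half of that request does fit the one-pass model; it is the exact median that generally needs the data kept around. A rough sketch, assuming fixed-width buckets keyed by their lower bound (the name histogram() and the width parameter are placeholders, not a proposed API):

```python
from collections import defaultdict


def histogram(iterable, width):
    """Count values into fixed-width buckets in a single pass.

    Hypothetical helper: each key is the lower bound of a bucket that
    covers the half-open range [key, key + width).
    """
    counts = defaultdict(int)
    for x in iterable:
        # Floor division maps x to the bucket it falls in, including
        # for negative values.
        counts[x // width * width] += 1
    return dict(counts)


# A plain iterator is enough; the data is never stored, only the counts.
print(histogram(iter([1, 2, 11, 25, 29]), 10))
```

Memory use grows with the number of occupied buckets rather than with the number of data points.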
> > Note, heapq is used for both (I use
> > operator.neg to swap between largest and smallest).
[Bernhard Herzog]
> Does that mean nlargest/nsmallest only work for numbers? I think it
> might be useful for e.g. strings too.
The plan was to make them work with anything defining __lt__; however,
if it is coded in Python and uses heapq, I don't see a straightforward
way to avoid operator.neg other than wrapping every item in some kind
of sense-reversing object.
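A sketch of what that wrapper approach might look like, so the cost is visible. The class name ReverseOrder and both function signatures are made up for illustration; the only assumption about the items is that they define __lt__. One direction runs directly on heapq's min-heap and the other wraps each item so that the comparison is flipped, which is the per-item overhead that operator.neg avoids for numbers.

```python
import heapq


class ReverseOrder:
    """Hypothetical wrapper that inverts the sense of `<` for its value."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # Flipped on purpose: a wrapper is "less" when its value is greater.
        return other.value < self.value


def nlargest(n, iterable):
    """Return the n largest items, largest first, using only `<`."""
    if n <= 0:
        return []
    heap = []  # min-heap holding the n largest items seen so far
    for item in iterable:
        if len(heap) < n:
            heapq.heappush(heap, item)
        elif heap[0] < item:
            # The new item beats the smallest kept item, so swap it in.
            heapq.heapreplace(heap, item)
    return sorted(heap, reverse=True)


def nsmallest(n, iterable):
    """Return the n smallest items, smallest first, via the wrapper."""
    wrapped = nlargest(n, (ReverseOrder(x) for x in iterable))
    return [w.value for w in wrapped]


fruit = ["pear", "apple", "fig", "kiwi"]
print(nlargest(2, fruit))
print(nsmallest(2, fruit))
```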
Raymond Hettinger