[Python-Dev] Accumulation module
Raymond Hettinger
raymond.hettinger at verizon.net
Tue Jan 13 14:51:56 EST 2004
I'm working on a module with some accumulation/reduction/statistical
formulas:
average(iterable)
stddev(iterable, sample=False)
product(iterable)
nlargest(iterable, n=1)
nsmallest(iterable, n=1)
The questions that have arisen so far are:
* What to call the module?
* What else should be in it?
* Is "sample" a good keyword to distinguish from population stddev?
* There seems to be a speed/space tradeoff in how to implement
nlargest/nsmallest. The faster way lists out the entire iterable,
heapifies it, and pops off the top n elements. The slower way is less
memory intensive: build only an n-length heap and then do a
heapreplace when necessary. Note that heapq is used for both (I use
operator.neg to swap between largest and smallest).
* Is there a way to compute the standard deviation without multiple
passes over the data (one to compute the mean and a second to sum the
squares of the differences from the mean)?
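
The two nlargest strategies described above can be sketched roughly as
follows. This is only an illustration, not the module's actual code: the
names nlargest_full and nlargest_bounded are hypothetical, and numeric
items are assumed so that operator.neg can flip the ordering.

```python
import heapq
import operator
from itertools import islice


def nlargest_full(iterable, n=1):
    """Hypothetical name: list everything out, heapify, pop the top n."""
    # heapq is a min-heap, so negate each value to pop the largest first.
    heap = list(map(operator.neg, iterable))
    heapq.heapify(heap)
    # A non-positive n yields an empty range and therefore an empty list.
    return [-heapq.heappop(heap) for _ in range(min(n, len(heap)))]


def nlargest_bounded(iterable, n=1):
    """Hypothetical name: keep only an n-item heap of the largest values."""
    if n <= 0:
        return []
    it = iter(iterable)
    # Seed the heap with the first n items; heap[0] is the smallest of them.
    heap = list(islice(it, n))
    heapq.heapify(heap)
    for x in it:
        # Replace the smallest kept value only when a larger one arrives.
        if x > heap[0]:
            heapq.heapreplace(heap, x)
    return sorted(heap, reverse=True)
```

The first version holds every item in memory at once; the second never
holds more than n items. An nsmallest counterpart would swap where the
negation is applied.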
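
On the last question, one known single-pass approach is a running-mean
update (commonly attributed to Welford). The sketch below is an
assumption about how it could look with the stddev(iterable,
sample=False) signature listed above; it is not taken from the module
itself. A simpler alternative is to accumulate the sum and the sum of
squares, but that form can lose precision when the mean is large
relative to the spread.

```python
import math


def stddev(iterable, sample=False):
    """One-pass standard deviation using a running mean update."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared differences from the current mean
    for x in iterable:
        count += 1
        delta = x - mean
        mean += delta / count
        # Uses the old and new mean together, so no second pass is needed.
        m2 += delta * (x - mean)
    # Sample stddev divides by n - 1, population stddev by n.
    divisor = count - 1 if sample else count
    if divisor <= 0:
        raise ValueError("not enough data points")
    return math.sqrt(m2 / divisor)
```

Because the data is consumed once, this works on generators and other
one-shot iterables as well as on lists.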
Raymond Hettinger