[Python-Dev] Accumulation module

Tue Jan 13 14:51:56 EST 2004

I'm working on a module with some accumulation/reduction/statistical
formulas:

average(iterable)
stddev(iterable, sample=False)
product(iterable)
nlargest(iterable, n=1)
nsmallest(iterable, n=1)
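
To make the proposed signatures concrete, here is a minimal sketch of how the first three might behave. Only the names and parameters come from the list above; the bodies, the error handling, and the choice of returning 1 for an empty product are assumptions for illustration. This stddev is the straightforward two-pass version, with "sample" switching the divisor from n to n - 1.

```python
import math
import operator
from functools import reduce


def average(iterable):
    """Arithmetic mean, computed in a single pass."""
    total = 0.0
    count = 0
    for x in iterable:
        total += x
        count += 1
    if count == 0:
        raise ValueError("average() of an empty iterable")
    return total / count


def stddev(iterable, sample=False):
    """Two-pass standard deviation (assumed behavior, not a final design).

    sample=False divides by n (population); sample=True divides by n - 1.
    """
    data = list(iterable)  # the second pass needs the data kept in memory
    n = len(data)
    divisor = n - 1 if sample else n
    if divisor <= 0:
        raise ValueError("not enough data points")
    mean = average(data)
    sum_sq = sum((x - mean) ** 2 for x in data)
    return math.sqrt(sum_sq / divisor)


def product(iterable):
    """Product of all items; the empty product is taken to be 1 (assumption)."""
    return reduce(operator.mul, iterable, 1)
```

With these definitions, stddev(data) and stddev(data, sample=True) differ only in the divisor, which is the distinction the keyword is meant to express.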

The questions that have arisen so far are:

* What to call the module?

* What else should be in it?

* Is "sample" a good keyword to distinguish from population stddev?

* There seems to be a speed/space choice on how to implement
nlargest/nsmallest.  The faster way lists out the entire iterable,
heapifies it, and pops off the top n elements.  The slower way is less
memory intensive:  build only an n-length heap and then just do a
heapreplace when necessary.  Note, heapq is used for both (I use
operator.neg to swap between largest and smallest).

* Is there a way to compute the standard deviation without multiple
passes over the data (one to compute the mean and a second to sum the
squares of the differences from the mean)?
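
The last two questions can be sketched in code. The first part below is one possible shape for the memory-light nlargest/nsmallest: keep a heap of at most n items and call heapq.heapreplace only when a new item beats the smallest one kept so far. The second part is one known answer to the single-pass question, Welford's running update of the mean and the sum of squared differences. Welford's method is not mentioned above and is offered here only as a candidate; the name stddev_one_pass is hypothetical, and negating values for nsmallest assumes numeric input.

```python
import heapq
import math


def nlargest(iterable, n=1):
    """Bounded-heap variant: never holds more than n items at once."""
    if n <= 0:
        return []
    heap = []  # min-heap; heap[0] is the smallest of the current top n
    for x in iterable:
        if len(heap) < n:
            heapq.heappush(heap, x)
        elif x > heap[0]:
            heapq.heapreplace(heap, x)  # pop the smallest, push x
    return sorted(heap, reverse=True)


def nsmallest(iterable, n=1):
    """Reuse nlargest on negated values (numeric input assumed)."""
    return [-x for x in nlargest((-x for x in iterable), n)]


def stddev_one_pass(iterable, sample=False):
    """Hypothetical single-pass stddev using Welford's update."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared differences from the current mean
    for x in iterable:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses the updated mean
    divisor = count - 1 if sample else count
    if divisor <= 0:
        raise ValueError("not enough data points")
    return math.sqrt(m2 / divisor)
```

Because stddev_one_pass never stores the data, it works on generators and other one-shot iterables. The bounded heap costs O(len * log n) comparisons in the worst case but only O(n) memory, which is the trade-off described above.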

Raymond Hettinger