[Python-ideas] Pre-PEP: adding a statistics module to Python

Tue Aug 6 18:15:15 CEST 2013

On 6 August 2013 16:49, Oscar Benjamin <oscar.j.benjamin at gmail.com> wrote:
> Really I think that the use-cases are basically like this:
>
> 1) You can just put the data in a collection in memory (the common case).
> 2) Your data is too large to go in memory but you can iterate over it
> from the disk, or network, or a computational generator or whatever.
> Since the iteration is expensive or unrepeatable you want to compute
> everything in one pass (happens sometimes but certainly a lot less
> common than case 1)).
> 3) Your data/computation is distributed and you want to compute
> statistics in a distributed/parallel framework and merge them later (a
> very specialised setup that possibly warrants having its own
> implementation of the statistical routines anyway).

4) You want to be able to save/reload state midway through computing
statistics and get intermediate results. For example, a script that
runs periodically and collates data from log files.
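
As a rough sketch of what case 4 might look like (it also covers the
one-pass and merge cases above): the class below keeps a running count,
mean and sum of squared deviations using Welford's method, and can dump
and reload that state as JSON. The names RunningStats, add, merge, dumps
and loads are hypothetical, made up for illustration, and are not part
of any proposed module API.

```python
import json


class RunningStats:
    """One-pass mean/variance accumulator (Welford's method).

    Hypothetical illustration only; not an API from the proposal.
    """

    def __init__(self, n=0, mean=0.0, m2=0.0):
        self.n = n        # number of data points seen so far
        self.mean = mean  # running mean
        self.m2 = m2      # sum of squared deviations from the running mean

    def add(self, x):
        """Fold one data point into the running state."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def merge(self, other):
        """Return a new accumulator combining two partial results."""
        n = self.n + other.n
        if n == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return RunningStats(n, mean, m2)

    def variance(self):
        """Sample variance of the data seen so far (an intermediate result)."""
        if self.n < 2:
            raise ValueError("variance requires at least two data points")
        return self.m2 / (self.n - 1)

    def dumps(self):
        """Serialise the state so a later run can pick up where this one stopped."""
        return json.dumps({"n": self.n, "mean": self.mean, "m2": self.m2})

    @classmethod
    def loads(cls, text):
        """Rebuild an accumulator from the output of dumps()."""
        state = json.loads(text)
        return cls(state["n"], state["mean"], state["m2"])


# First run of the script: process some data, then save the state.
stats = RunningStats()
for x in [2.0, 4.0, 4.0, 4.0]:
    stats.add(x)
saved = stats.dumps()

# Later run: reload the state and continue with new data.
resumed = RunningStats.loads(saved)
for x in [5.0, 5.0, 7.0, 9.0]:
    resumed.add(x)
print(resumed.n, resumed.mean, resumed.variance())
```

The saved state is only three numbers however much data has been seen,
which is what makes the periodic-script case cheap. The same three
numbers are also what merge() needs for the distributed case.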

Oscar