[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Mon Apr 9 20:15:35 EDT 2018

Woo hoo!  Another coincidence.  I just happened to be playing with
this problem today:

You have a large list - xs - of N numbers.  It's necessary to compute slice sums

    sum(xs[i:j])

for a great many slices, 0 <= i <= j <= N.

For concreteness, say xs is a time series representing a toll booth
receipt's by hour across years.  "Management" may ask for all sorts of
sums - by 24-hour period, by week, by month, by year, by season, ...

A little thought showed that sum(xs[i:j]) = sum(xs[:j]) - sum(xs[:i]),
so if we precomputed just the prefix sums, the sum across an arbitrary
slice could be computed thereafter in constant time.  Hard to beat
that ;-)

But computing the prefix sums is exactly what accumulate() does!  With
one twist:  while we have N numbers, there are N+1 slice indices.  So
accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
as the first prefix sum (the empty prefix sum(xs[:0]).

Which is exactly what a this_is_the_initial_value=0 argument would do
for us.  As is, using the chain trick:

    class SliceSummer:
        def __init__(self, xs):
            from itertools import accumulate, chain
            self.N = N = len(xs)
            if not N:
                raise ValueError("need a non-empty sequence")
            self.prefixsum = list(accumulate(chain([0], xs)))
            assert len(self.prefixsum) == N+1

        def slicesum(self, i, j):
            N = self.N
            if not 0 <= i <= j <= N:
                raise ValueError(f"need 0 <= {i} <= {j} <= {N}")
            return self.prefixsum[j] - self.prefixsum[i]

    def test(N):
        from random import randrange
        xs = [randrange(-10, 11) for _ in range(N)]
        ntried = 0
        ss = SliceSummer(xs)
        NP1 = N + 1
        for i in range(NP1):
            for j in range(i, NP1):
                ntried += 1
                assert ss.slicesum(i, j) == sum(xs[i:j])
        assert ntried == N * NP1 // 2 + NP1, ntried