[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
Tim Peters
tim.peters at gmail.com
Mon Apr 9 20:15:35 EDT 2018
Woo hoo! Another coincidence. I just happened to be playing with
this problem today:
You have a large list - xs - of N numbers. It's necessary to compute slice sums
sum(xs[i:j])
for a great many slices, 0 <= i <= j <= N.
For concreteness, say xs is a time series representing a toll booth
receipt's by hour across years. "Management" may ask for all sorts of
sums - by 24-hour period, by week, by month, by year, by season, ...
A little thought showed that sum(xs[i:j]) = sum(xs[:j]) - sum(xs[:i]),
so if we precomputed just the prefix sums, the sum across an arbitrary
slice could be computed thereafter in constant time. Hard to beat
that ;-)
But computing the prefix sums is exactly what accumulate() does! With
one twist: while we have N numbers, there are N+1 slice indices. So
accumulate(xs) doesn't quite work. It needs to also have a 0 inserted
as the first prefix sum (the empty prefix sum(xs[:0]).
Which is exactly what a this_is_the_initial_value=0 argument would do
for us. As is, using the chain trick:
class SliceSummer:
def __init__(self, xs):
from itertools import accumulate, chain
self.N = N = len(xs)
if not N:
raise ValueError("need a non-empty sequence")
self.prefixsum = list(accumulate(chain([0], xs)))
assert len(self.prefixsum) == N+1
def slicesum(self, i, j):
N = self.N
if not 0 <= i <= j <= N:
raise ValueError(f"need 0 <= {i} <= {j} <= {N}")
return self.prefixsum[j] - self.prefixsum[i]
def test(N):
from random import randrange
xs = [randrange(-10, 11) for _ in range(N)]
ntried = 0
ss = SliceSummer(xs)
NP1 = N + 1
for i in range(NP1):
for j in range(i, NP1):
ntried += 1
assert ss.slicesum(i, j) == sum(xs[i:j])
assert ntried == N * NP1 // 2 + NP1, ntried
More information about the Python-ideas
mailing list