[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
Tim Peters
tim.peters at gmail.com
Sat Apr 7 23:17:07 EDT 2018
[Nick Coghlan <ncoghlan at gmail.com>]
> I didn't have a strong opinion either way until Tim mentioned sum()
> and then I went and checked the docs for both that and for accumulate.
>
> First sentence of the sum() docs:
>
> Sums *start* and the items of an *iterable* from left to right and
> returns the total.
>
> First sentence of the accumulate docs:
>
> Make an iterator that returns accumulated sums, ...
>
> So I now think that having "start" as a parameter to one but not the
> other, counts as a genuine API discrepancy.
Genuine but minor ;-)
> Providing start to accumulate would then mean the same thing as
> providing it to sum(): it would change the basis point for the first
> addition operation, but it wouldn't change the *number* of cumulative
> sums produced.
That makes no sense to me. `sum()` with a `start` argument always
returns a single result, even if the iterable is empty.
    >>> sum([], 42)
    42
As the example shows, it's possible that `sum()` does no additions
whatsoever. It would be exceedingly bizarre if the same stuff passed
to `accumulate()` returned an empty iterator instead:
    >>> list(accumulate([], start=42))
    []
It should return [42].
It seems obvious to me that a sane implementation would maintain the invariant:
    sum(xs, s) == list(accumulate(xs, start=s))[-1]
and there's nothing inherently special about `xs` being empty.
It seems also obviously desirable that
    accumulate(xs, start=s)
generate the same results as
    accumulate(chain([s], xs))
That's obviously desirable because it's _so_ obvious that Raymond
implicitly assumed that's how it would work in his first message ;-)
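A minimal sketch of that behavior, assuming a hypothetical helper `accumulate_from` (not part of itertools) that spells the proposed `start=` via the chain idiom. It checks the invariant above, including for an empty `xs`:

```python
from itertools import accumulate, chain

def accumulate_from(xs, start):
    """Hypothetical stand-in for the proposed accumulate(xs, start=...)."""
    # Prepend the starting value so that it is yielded first.
    return accumulate(chain([start], xs))

for xs in ([], [1], [1, 2, 3]):
    partial = list(accumulate_from(xs, 42))
    # The last partial sum must agree with sum(xs, start),
    # and nothing special happens when xs is empty.
    assert partial[-1] == sum(xs, 42)
    print(xs, partial)
```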
Or think of it this way: if you're adding N numbers, there are N-1
additions, and N partial sums. Whether it's `sum(xs)` or
`accumulate(xs)`, if len(xs)==K then specifying `start` too changes
the number of addends from K to K+1.
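The counting argument can be checked directly. This is only a sketch: the values are made up, and `start` is again spelled with `chain()` rather than with a real `start=` argument:

```python
from itertools import accumulate, chain

xs = [10, 20, 30]          # K == 3 addends
start = 5                  # illustrative starting value

without_start = list(accumulate(xs))
with_start = list(accumulate(chain([start], xs)))

# K addends give K partial sums; K+1 addends give K+1 partial sums.
assert len(without_start) == len(xs)
assert len(with_start) == len(xs) + 1
print(without_start)
print(with_start)
```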
> By contrast, using the prepend() approach with accumulate() not only
> changes the starting value, it also changes the number of cumulative
> sums produced.
As it should :-)
Note that in the "real life" example code I gave, it was
essential that `accumulate()` with `start` yield the starting value
first. There were three uses in Will Ness's wheel sieve code, two of
which wanted the starting value on its own, and the last of which
didn't. In that last case, it was just a matter of doing
    next(wheel)
on its own to discard the (in that specific case) unwanted starting
value. If you have to paste the starting value _in_ instead (when it
is wanted), then we're reintroducing a need for the "chain a singleton
list with the iterator" hack that introducing `start=` is trying to
eliminate.
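A sketch of the two cases, with made-up gap values (this is not the wheel from Will Ness's code), and with `start` spelled via `chain()` since the argument is only proposed here:

```python
from itertools import accumulate, chain, cycle, islice

gaps = [2, 4]   # illustrative gaps only
start = 5       # illustrative starting value

# Case 1: the starting value is wanted, so just consume the iterator.
wheel = accumulate(chain([start], cycle(gaps)))
wanted = list(islice(wheel, 5))

# Case 2: the starting value is not wanted, so discard it with one next().
wheel = accumulate(chain([start], cycle(gaps)))
next(wheel)
unwanted = list(islice(wheel, 4))

print(wanted)
print(unwanted)
```

Dropping a value that was yielded takes one `next()` call; putting back a value that was never yielded needs the `chain()` idiom again.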