Given that two respected members of the community so strongly disagree whether accumulate([], start=0) should behave like accumulate([]) or like accumulate([0]), maybe in the end it's better not to add a start argument. (The disagreement suggests that we can't trust users' intuition here.)

On Sat, Apr 7, 2018 at 9:14 PM, Nick Coghlan wrote:
On 8 April 2018 at 13:17, Tim Peters <tim.peters@gmail.com> wrote:
> [Nick Coghlan <ncoghlan@gmail.com>]
>> So I now think that having "start" as a parameter to one but not the
>> other, counts as a genuine API discrepancy.
>
> Genuine but minor ;-)

Agreed :)

>> Providing start to accumulate would then mean the same thing as
>> providing it to sum(): it would change the basis point for the first
>> addition operation, but it wouldn't change the *number* of cumulative
>> sums produced.
>
> That makes no sense to me.  `sum()` with a `start` argument always
> returns a single result, even if the iterable is empty.
>
> >>> sum([], 42)
> 42

Right, but if itertools.accumulate() had the semantics of starting
with a sum() over an empty iterable, then it would always start with
an initial zero.

It doesn't - it starts with "0+first_item", so the length of the
output iterator matches the number of items in the input iterable:

>>> list(accumulate([]))
[]
>>> list(accumulate([1, 2, 3, 4]))
[1, 3, 6, 10]

That matches the output you'd get from a naive O(n^2) implementation
of cumulative sums:

    data = list(iterable)
    for stop in range(1, len(data) + 1):
        yield sum(data[:stop])
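
A quick sanity check of that equivalence, as a sketch only. The helper name `naive_cumulative_sums` is made up for illustration, and the loop runs up to and including `len(data)` so that the final total is produced:

```python
from itertools import accumulate


def naive_cumulative_sums(iterable):
    # Hypothetical helper: O(n^2) reference version of cumulative sums.
    data = list(iterable)
    # stop runs from 1 to len(data) inclusive, one slice per input item.
    for stop in range(1, len(data) + 1):
        yield sum(data[:stop])


# The naive version and accumulate() agree, including on empty input.
for sample in ([], [5], [1, 2, 3, 4]):
    assert list(naive_cumulative_sums(sample)) == list(accumulate(sample))
```

In both versions the output has exactly as many items as the input.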

So if the new parameter were to be called start, then I'd expect the
semantics to be equivalent to:

    data = list(iterable)
    for stop in range(1, len(data) + 1):
        yield sum(data[:stop], start=start)

rather than the version Raymond posted at the top of the thread (where
setting start explicitly also implicitly increases the number of items
produced).
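
To make the two readings concrete, here is a minimal sketch. Both function names are hypothetical, and the "prepended" variant is only an approximation of the behaviour described above (built by chaining the start value onto the input), not the code from the top of the thread:

```python
from itertools import accumulate, chain


def accumulate_start_as_basis(iterable, start=0):
    # Hypothetical: start shifts every partial sum but adds no extra item.
    data = list(iterable)
    for stop in range(1, len(data) + 1):
        yield sum(data[:stop]) + start


def accumulate_start_prepended(iterable, start=0):
    # Hypothetical: start behaves like an extra first element of the input.
    return accumulate(chain([start], iterable))


print(list(accumulate_start_as_basis([1, 2, 3], start=10)))   # [11, 13, 16]
print(list(accumulate_start_prepended([1, 2, 3], start=10)))  # [10, 11, 13, 16]
print(list(accumulate_start_as_basis([], start=10)))          # []
print(list(accumulate_start_prepended([], start=10)))         # [10]
```

The empty-input case is where the two readings differ most visibly: one yields nothing, the other yields the start value alone.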

That concern mostly goes away if the new parameter is deliberately
called something *other than* "start" (e.g. "prepend=value", or
"first=value"), but it could also be addressed by offering a dedicated
"yield_start" toggle, such that the revised semantics were:

    it = iter(iterable)
    total = start
    if yield_start:
        yield total
    for element in it:
        total = func(total, element)
        yield total

That approach would have the advantage of making the default value of
"start" much easier to document (since it would just be zero, the same
as it is for sum()), and only the length of the input iterable and
"yield_start" would affect how many partial sums were produced.
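
A runnable sketch of that idea, for illustration only. The name `accumulate_sketch`, the keyword-only signature, and the `operator.add` default are assumptions, not an agreed API:

```python
import operator


def accumulate_sketch(iterable, func=operator.add, *, start=0, yield_start=False):
    # Hypothetical signature: start sets the basis, yield_start controls
    # whether that basis is itself emitted as the first item.
    total = start
    if yield_start:
        yield total
    for element in iterable:
        total = func(total, element)
        yield total


print(list(accumulate_sketch([1, 2, 3, 4])))                              # [1, 3, 6, 10]
print(list(accumulate_sketch([1, 2, 3, 4], start=10)))                    # [11, 13, 16, 20]
print(list(accumulate_sketch([1, 2, 3, 4], start=10, yield_start=True)))  # [10, 11, 13, 16, 20]
print(list(accumulate_sketch([], yield_start=True)))                      # [0]
```

With this shape the output length is `len(input)` by default and `len(input) + 1` only when `yield_start` is true, whatever value `start` has.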

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan@gmail.com   |   Brisbane, Australia
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

--
--Guido van Rossum (python.org/~guido)