[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Guido van Rossum guido at python.org
Sun Apr 8 00:31:09 EDT 2018


Given that two respected members of the community so strongly disagree about
whether accumulate([], start=0) should behave like accumulate([]) or like
accumulate([0]), maybe in the end it's better not to add a start argument.
(The disagreement suggests that we can't trust users' intuition here.)
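
(For concreteness, the two behaviours in question could be sketched
roughly as follows; accumulate_a and accumulate_b are made-up names,
purely for illustration.)

    import operator

    def accumulate_a(iterable, func=operator.add, start=0):
        # 'start' only seeds the running total; an empty iterable
        # still yields nothing, like accumulate([]).
        total = start
        for element in iterable:
            total = func(total, element)
            yield total

    def accumulate_b(iterable, func=operator.add, start=0):
        # 'start' is also yielded as the first value; an empty
        # iterable yields exactly one item, like accumulate([0]).
        total = start
        yield total
        for element in iterable:
            total = func(total, element)
            yield total

    # list(accumulate_a([], start=0)) == []
    # list(accumulate_b([], start=0)) == [0]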

On Sat, Apr 7, 2018 at 9:14 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On 8 April 2018 at 13:17, Tim Peters <tim.peters at gmail.com> wrote:
> > [Nick Coghlan <ncoghlan at gmail.com>]
> >> So I now think that having "start" as a parameter to one but not the
> >> other, counts as a genuine API discrepancy.
> >
> > Genuine but minor ;-)
>
> Agreed :)
>
> >> Providing start to accumulate would then mean the same thing as
> >> providing it to sum(): it would change the basis point for the first
> >> addition operation, but it wouldn't change the *number* of cumulative
> >> sums produced.
> >
> > That makes no sense to me.  `sum()` with a `start` argument always
> > returns a single result, even if the iterable is empty.
> >
> > >>> sum([], 42)
> > 42
>
> Right, but if itertools.accumulate() had the semantics of starting
> with a sum() over an empty iterable, then it would always start with
> an initial zero.
>
> It doesn't - it starts with "0+first_item", so the length of the
> output iterator matches the number of items in the input iterable:
>
>     >>> list(accumulate([]))
>     []
>     >>> list(accumulate([1, 2, 3, 4]))
>     [1, 3, 6, 10]
>
> That matches the output you'd get from a naive O(n^2) implementation
> of cumulative sums:
>
>     data = list(iterable)
>     for stop in range(1, len(data) + 1):
>         yield sum(data[:stop])
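>
> For example, wrapping that snippet in a generator function (the name
> "naive_cumsum" is made up here, just for illustration) and comparing
> it against accumulate:
>
>     >>> from itertools import accumulate
>     >>> def naive_cumsum(iterable):
>     ...     data = list(iterable)
>     ...     for stop in range(1, len(data) + 1):
>     ...         yield sum(data[:stop])
>     ...
>     >>> list(naive_cumsum([1, 2, 3, 4])) == list(accumulate([1, 2, 3, 4]))
>     True
>     >>> list(naive_cumsum([])) == list(accumulate([]))
>     True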
>
> So if the new parameter were to be called start, then I'd expect the
> semantics to be equivalent to:
>
>     data = list(iterable)
>     for stop in range(1, len(data) + 1):
>         yield sum(data[:stop], start)
>
> rather than the version Raymond posted at the top of the thread (where
> setting start explicitly also implicitly increases the number of items
> produced).
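>
> For instance, under that reading a hypothetical start value of 10
> would shift the partial sums without changing how many of them are
> produced:
>
>     >>> data = [1, 2, 3, 4]
>     >>> [sum(data[:stop], 10) for stop in range(1, len(data) + 1)]
>     [11, 13, 16, 20]
>     >>> data = []
>     >>> [sum(data[:stop], 10) for stop in range(1, len(data) + 1)]
>     []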
>
> That concern mostly goes away if the new parameter is deliberately
> called something *other than* "start" (e.g. "prepend=value", or
> "first=value"), but it could also be addressed by offering a dedicated
> "yield_start" toggle, such that the revised semantics were:
>
>     import operator
>
>     def accumulate(iterable, func=operator.add, start=0, yield_start=False):
>         it = iter(iterable)
>         total = start
>         if yield_start:
>             yield total
>         for element in it:
>             total = func(total, element)
>             yield total
>
> That approach would have the advantage of making the default value of
> "start" much easier to document (since it would just be zero, the same
> as it is for sum()), and only the length of the input iterable and
> "yield_start" would affect how many partial sums were produced.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
> _______________________________________________
> Python-ideas mailing list
> Python-ideas at python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
>



-- 
--Guido van Rossum (python.org/~guido)