Ok, so it seems everyone's happy with adding an initial_value argument.  

Now, I claim that while it should be an option, the initial value should NOT be returned by default.  (i.e. the returned generator should by default yield N elements, not N+1).  

Example: suppose we're doing the toll booth thing, and we want to yield a cumulative sum of tolls so far.  Suppose someone already made a reasonable-looking generator yielding the cumulative sum of tolls for today:

def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
    return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far))

And now we want to make a way to get all tolls from the month.  One might reasonably expect this to work:

def iter_cumsum_tolls_from_month(month, toll_amount_so_far):
    for day in month:
        for cumsum_tolls in iter_cumsum_tolls_from_day(day, toll_amount_so_fartoll_amount_so_far):
            yield cumsum_tolls
        toll_amount_so_far = cumsum_tolls

But this would actually DUPLICATE the last toll of every day - it appears both as the last element of the day's generator and as the first element of the next day's generator.  

This is why I think that there should be an additional "include_initial_in_return=False" argument.  I do agree that it should be an option to include the initial value (your "find tolls over time-span" example shows why), but that if you want that you should have to show that you thought about that by specifying "include_initial_in_return=True"





On Mon, Apr 9, 2018 at 10:30 PM, Tim Peters <tim.peters@gmail.com> wrote:
[Tim]
>> while we have N numbers, there are N+1 slice indices.  So
>> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
>> as the first prefix sum (the empty prefix sum(xs[:0]).
>>
>> Which is exactly what a this_is_the_initial_value=0 argument would do
>> for us.

[Greg Ewing <greg.ewing@canterbury.ac.nz>]
> In this case, yes. But that still doesn't mean it makes
> sense to require the initial value to be passed *in* as
> part of the input sequence.
>
> Maybe the best idea is for the initial value to be a
> separate argument, but be returned as the first item in
> the list.

I'm not sure you've read all the messages in this thread, but that's
exactly what's being proposed.  That. e.g., a new optional argument:

    accumulate(xs, func, initial=S)

act like the current

     accumulate(chain([S], xs), func)

Note that in neither case is the original `xs` modified in any way,
and in both cases the first value generated is S.

Note too that the proposal is exactly the way Haskell's `scanl` works
(although `scanl` always requires specifying an initial value - while
the related `scanl1` doesn't allow specifying one).

And that's all been so since the thread's first message, in which
Raymond gave a proposed implementation:

        _sentinel = object()

        def accumulate(iterable, func=operator.add, start=_sentinel):
            it = iter(iterable)
            if start is _sentinel:
                try:
                    total = next(it)
                except StopIteration:
                    return
            else:
                total = start
            yield total
            for element in it:
                total = func(total, element)
                yield total

> I can think of another example where this would make
> sense. Suppose you have an initial bank balance and a
> list of transactions, and you want to produce a statement
> with a list of running balances.
>
> The initial balance and the list of transactions are
> coming from different places, so the most natural way
> to call it would be
>
>    result = accumulate(transactions, initial = initial_balance)
>
> If the initial value is returned as item 0, then the
> result has the following properties:
>
>    result[0] is the balance brought forward
>    result[-1] is the current balance
>
> and this remains true in the corner case where there are
> no transactions.

Indeed, something quite similar often applies when parallelizing
search loops of the form:

     for candidate in accumulate(chain([starting_value], cycle(deltas))):

For a sequence that eventually becomes periodic in the sequence of
deltas it cycles through, multiple processes can run independent
searches starting at carefully chosen different starting values "far"
apart.  In effect, they're each a "balance brought forward" pretending
that previous chunks have already been done.

Funny:  it's been weeks now since I wrote an accumulate() that
_didn't_ want to specify a starting value - LOL ;-)
_______________________________________________
Python-ideas mailing list
Python-ideas@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/