[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Peter O'Connor peter.ed.oconnor at gmail.com
Tue Apr 10 00:11:52 EDT 2018


* correction to brackets from first example:

def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
    return accumulate(get_tolls_from_day(day), initial=toll_amount_so_far)


On Mon, Apr 9, 2018 at 11:55 PM, Peter O'Connor <peter.ed.oconnor at gmail.com>
wrote:

> Ok, so it seems everyone's happy with adding an initial_value argument.
>
> Now, I claim that while it should be an option, the initial value should
> NOT be returned by default.  (i.e. the returned generator should by default
> yield N elements, not N+1).
>
> Example: suppose we're doing the toll booth thing, and we want to yield a
> cumulative sum of tolls so far.  Suppose someone already made a
> reasonable-looking generator yielding the cumulative sum of tolls for today:
>
> def iter_cumsum_tolls_from_day(day, toll_amount_so_far):
>     return accumulate(get_tolls_from_day(day, initial=toll_amount_so_far))
>
> And now we want to make a way to get all tolls from the month.  One might
> reasonably expect this to work:
>
> def iter_cumsum_tolls_from_month(month, toll_amount_so_far):
>     for day in month:
>         for cumsum_tolls in iter_cumsum_tolls_from_day(day,
> toll_amount_so_far = toll_amount_so_far):
>             yield cumsum_tolls
>         toll_amount_so_far = cumsum_tolls
>
> But this would actually DUPLICATE the last toll of every day - it appears
> both as the last element of the day's generator and as the first element of
> the next day's generator.
>
> This is why I think that there should be an additional "
> include_initial_in_return=False" argument.  I do agree that it should be
> an option to include the initial value (your "find tolls over time-span"
> example shows why), but that if you want that you should have to show that
> you thought about that by specifying "include_initial_in_return=True"
>
>
>
>
>
> On Mon, Apr 9, 2018 at 10:30 PM, Tim Peters <tim.peters at gmail.com> wrote:
>
>> [Tim]
>> >> while we have N numbers, there are N+1 slice indices.  So
>> >> accumulate(xs) doesn't quite work.  It needs to also have a 0 inserted
>> >> as the first prefix sum (the empty prefix sum(xs[:0]).
>> >>
>> >> Which is exactly what a this_is_the_initial_value=0 argument would do
>> >> for us.
>>
>> [Greg Ewing <greg.ewing at canterbury.ac.nz>]
>> > In this case, yes. But that still doesn't mean it makes
>> > sense to require the initial value to be passed *in* as
>> > part of the input sequence.
>> >
>> > Maybe the best idea is for the initial value to be a
>> > separate argument, but be returned as the first item in
>> > the list.
>>
>> I'm not sure you've read all the messages in this thread, but that's
>> exactly what's being proposed.  That. e.g., a new optional argument:
>>
>>     accumulate(xs, func, initial=S)
>>
>> act like the current
>>
>>      accumulate(chain([S], xs), func)
>>
>> Note that in neither case is the original `xs` modified in any way,
>> and in both cases the first value generated is S.
>>
>> Note too that the proposal is exactly the way Haskell's `scanl` works
>> (although `scanl` always requires specifying an initial value - while
>> the related `scanl1` doesn't allow specifying one).
>>
>> And that's all been so since the thread's first message, in which
>> Raymond gave a proposed implementation:
>>
>>         _sentinel = object()
>>
>>         def accumulate(iterable, func=operator.add, start=_sentinel):
>>             it = iter(iterable)
>>             if start is _sentinel:
>>                 try:
>>                     total = next(it)
>>                 except StopIteration:
>>                     return
>>             else:
>>                 total = start
>>             yield total
>>             for element in it:
>>                 total = func(total, element)
>>                 yield total
>>
>> > I can think of another example where this would make
>> > sense. Suppose you have an initial bank balance and a
>> > list of transactions, and you want to produce a statement
>> > with a list of running balances.
>> >
>> > The initial balance and the list of transactions are
>> > coming from different places, so the most natural way
>> > to call it would be
>> >
>> >    result = accumulate(transactions, initial = initial_balance)
>> >
>> > If the initial value is returned as item 0, then the
>> > result has the following properties:
>> >
>> >    result[0] is the balance brought forward
>> >    result[-1] is the current balance
>> >
>> > and this remains true in the corner case where there are
>> > no transactions.
>>
>> Indeed, something quite similar often applies when parallelizing
>> search loops of the form:
>>
>>      for candidate in accumulate(chain([starting_value], cycle(deltas))):
>>
>> For a sequence that eventually becomes periodic in the sequence of
>> deltas it cycles through, multiple processes can run independent
>> searches starting at carefully chosen different starting values "far"
>> apart.  In effect, they're each a "balance brought forward" pretending
>> that previous chunks have already been done.
>>
>> Funny:  it's been weeks now since I wrote an accumulate() that
>> _didn't_ want to specify a starting value - LOL ;-)
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20180410/21dbbc25/attachment.html>


More information about the Python-ideas mailing list