[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]

Raymond Hettinger raymond.hettinger at gmail.com
Sun Apr 8 17:35:47 EDT 2018


> On Apr 8, 2018, at 12:22 PM, Tim Peters <tim.peters at gmail.com> wrote:
> 
> [Guido]
>> Well if you can get Raymond to agree on that too I suppose you can go ahead.
>> Personally I'm -0 but I don't really write this kind of algorithmic code
>> enough to know what's useful.
> 
> Actually, you do - but you don't _think_ of problems in these terms.
> Neither do I.  For those who do:  consider any program that has state
> and responds to inputs.  When you get a new input, the new state is a
> function of the existing state and the input.

The Bayesian world view isn't much different except they would prefer "prior" instead of "initial" or "start" ;-)

    my_changing_beliefs = accumulate(stream_of_new_evidence, bayes_rule, prior=what_i_used_to_think)

Though the two analogies are cute, I'm not sure they tell us much.  In running programs or Bayesian analysis, we care more about the final result than about the accumulation of intermediate results.

My own experience with actually using accumulations in algorithmic code falls neatly into two groups.  Many years ago, I used APL extensively in accounting work, and my recollection is that part of the convenience of "+\" was that the sequence length didn't change (so that the various data arrays could interoperate with one another).
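A minimal sketch of that length-preserving property in Python, using invented ledger figures (the names `amounts`, `balances`, and `rows` are illustrative only): the running total has one entry per input, so it lines up with the original column.

```python
from itertools import accumulate

# Hypothetical ledger entries; the figures are invented for illustration.
amounts = [100, -30, 45, -15]

# One running total per input, so the result has the same length.
balances = list(accumulate(amounts))

# Because the lengths match, the two columns can be paired row by row.
rows = list(zip(amounts, balances))
```

Prepending a separate starting value would make `balances` one item longer than `amounts`, and the row-by-row pairing would no longer line up.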

My other common case for accumulate() is building cumulative probability distributions from probability mass functions (see the code for random.choices() for example, or typical code for a K-S test).
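As a rough sketch of that second use, assuming a small made-up distribution (`outcomes`, `pmf`, and `sample` are hypothetical names, not taken from any library): the cumulative distribution is just the accumulated mass function, and a bisect lookup turns a uniform random number into an outcome.

```python
from bisect import bisect
from itertools import accumulate
import random

# Hypothetical probability mass function over four outcomes.
outcomes = ["a", "b", "c", "d"]
pmf = [0.1, 0.2, 0.3, 0.4]

# Cumulative distribution: same length as the pmf, last entry is the total mass.
cdf = list(accumulate(pmf))

def sample(rng=random):
    """Draw one outcome by inverting the cumulative distribution."""
    u = rng.random() * cdf[-1]
    # Limit the search to the last valid index so rounding cannot overshoot.
    return outcomes[bisect(cdf, u, 0, len(cdf) - 1)]
```

No initial value is involved: the first entry of `cdf` is simply the first probability.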

For neither of those use case categories did I ever want an initial value, and it would have been distracting to even have had the option. For example, when doing a discounted cash flow analysis, I was taught to model the various flows as a single sequence of up and down arrows rather than thinking of the initial balance as a distinct concept.¹

Because of this background, I was surprised to have the question ever come up at all (other than the symmetry argument that sum() has "start" so accumulate() must as well).

When writing itertools.accumulate(), I started by looking to see what other languages had done.  Since accumulate() is primarily a numerical tool, I expected that the experience of numeric-centric languages would have something to teach us.  My reasoning was that if the need hadn't arisen for APL, R, Numpy, Matlab², or Mathematica, perhaps it really was just noise.

My views may be dated though.  Looking at the wheel sieve and Collatz glide record finder, I see something new: a desire to work with lazy, potentially infinite accumulations (something that iterators do well but that almost never arises in the world of fixed-length sequences or cumulative probability distributions).
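For illustration only, here is one way to get a lazy, seeded accumulation without any change to accumulate() itself. The helper `accumulate_from` is a hypothetical name, and prepending the start value with chain() is just one possible emulation, not necessarily what the proposal would do.

```python
from itertools import accumulate, chain, count, islice

def accumulate_from(start, iterable, func=None):
    """Hypothetical helper: running results seeded with `start`.

    Emulates a start value by prepending it to the input, so the output
    has one more item than the input and begins with `start` itself.
    """
    seeded = chain([start], iterable)
    return accumulate(seeded) if func is None else accumulate(seeded, func)

# count(1) never ends, but nothing is computed until items are requested.
running = accumulate_from(10, count(1))
first_five = list(islice(running, 5))
```

Here `first_five` is `[10, 11, 13, 16, 20]`: the seed comes out first, followed by the running sums of 1, 2, 3, 4.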

So I had been warming up to the idea, but got concerned that Nick could have had such a profoundly different idea about what the code should do.  That cooled my interest a bit, especially when thinking about two key questions, "Will it create more problems than it solves?" and "Will anyone actually use it?".



Raymond







¹ http://www.chegg.com/homework-help/questions-and-answers/solve-present-worth-cash-flow-shown-using-three-interest-factors-10-interest-compounded-an-q878034

² https://www.mathworks.com/help/matlab/ref/accumarray.html

