[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
Tim Peters
tim.peters at gmail.com
Sat Apr 7 18:09:36 EDT 2018
...
[Tim]
>> Later:
>>
>> def coll(SHIFT=24):
>>     ...
>>     from itertools import accumulate, chain, cycle
>>     ...
>>     LIMIT = 1 << SHIFT
>>     ...
>>     abc, first, deltas = buildtab(SHIFT, LIMIT)
>>     ...
>>     for num in accumulate(chain([first], cycle(deltas))):
>>         assert num % 3 != 2
>>
>> As in Will's code, it would be more readable as:
>>
>> for num in accumulate(cycle(deltas), start=first):
[Raymond]
> That does read better. I am curious how you would have
> written it as a plain for-loop before accumulate() was added.
The original loop was quite different, a nested loop pair reflecting
directly that candidates are of the form i * LIMIT + j for i >= 1 and
j in goodix:
    for base in itertools.count(LIMIT, LIMIT):
        for ix in goodix:
            num = base + ix
            if num % 3 == 2:
                continue
It was later I noticed that, across every 3 full iterations of the
outer loop, exactly one third of the "num % 3 == 2" tests were true.
It took some thought & a bit of proof to show that all and only the
num % 3 != 2 candidates could be generated directly by the shorter
code. BTW,
count(LIMIT, LIMIT)
is a bit of a head-scratcher itself ;-)
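For what it's worth, the head-scratcher just enumerates the positive
multiples of LIMIT - count(start, step) with the same value for both
arguments. A sketch with a toy LIMIT (the real one is 1 << SHIFT):

```python
from itertools import count, islice

# Illustrative only: a small LIMIT stands in for 1 << SHIFT.
LIMIT = 8

# count(LIMIT, LIMIT) starts at LIMIT and steps by LIMIT, so it yields
# LIMIT, 2*LIMIT, 3*LIMIT, ... -- every positive multiple of LIMIT.
bases = list(islice(count(LIMIT, LIMIT), 5))
print(bases)  # [8, 16, 24, 32, 40]
```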
Without `accumulate()`, I suppose I would have done this instead:
    num = first
    for delta in chain([0], cycle(deltas)):
        num += delta
That's worse to my eyes! The `chain()` trick is still needed, but in
this case to inject a 0 delta at the start so that `num` remains
`first` across the first iteration.
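A toy comparison (hypothetical small values for `first` and `deltas`;
the real ones come from buildtab()) shows the two spellings walk in
lockstep:

```python
from itertools import accumulate, chain, cycle, islice

# Hypothetical stand-ins for buildtab()'s outputs.
first, deltas = 5, [4, 2, 4]
N = 7

# The accumulate() spelling: splice first in front of the delta stream.
via_accumulate = list(islice(accumulate(chain([first], cycle(deltas))), N))

# The manual loop: a 0 delta keeps num == first on the first iteration.
via_loop = []
num = first
for delta in islice(chain([0], cycle(deltas)), N):
    num += delta
    via_loop.append(num)

print(via_accumulate == via_loop)  # True
```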
I should note that this is "a search loop" that rarely finds what it's
looking for. There are several places in the body that give up on the
current `num` and want to move on to the next candidate. So it's of
great pragmatic value that it be written in a way such that a plain
`continue` in the body does the right thing. For that reason, I would
_not_ have written it as, e.g.,
    num = first
    for delta in cycle(deltas):
        # masses of tests that may want to give up
        # early, excruciatingly nested so that "give up"
        # falls through to the end
        ...
        num += delta
> (part of the argument against reduce() was that a plain
> for-loop would be clearer 99% of the time).
Except that isn't true: 99% of `reduce()` instances were replaced by
`sum()` when the latter was introduced :-) "Sum reduction" and
"running-sum accumulation" are primitives in many peoples' brains. In
generalizing those to other dyadic operations, it's the abstraction
itself that's responsible for the loss of clarity - now you're
building a higher-order functional that's not a primitive in anyone's
brain except for Ken Iverson and Kirby Urner ;-)
The rest of us are better off seeing the moving pieces in a loop body.
But that's simply not so for addition, which is why introducing
`sum()` was a great idea. BTW, note that `sum()` also supports an
optional `start=` argument. I expect (but don't know) that
`accumulate()` is overwhelmingly used to do running sums (the only use
I've had for it), so it's a bit odd on that count that it doesn't.
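A minimal sketch of the parallel, using chain() to splice in the start
value that accumulate() lacks a keyword for:

```python
from itertools import accumulate, chain

# sum() takes an optional start value directly:
total = sum([10, 20, 30], 100)
print(total)  # 160

# accumulate() has no such argument, so the starting value has to be
# spliced into the stream itself:
running = list(accumulate(chain([100], [10, 20, 30])))
print(running)  # [100, 110, 130, 160]
```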
> ...
> Agreed that the "chain([x], it)" step is obscure. That's a bit of a bummer --
> one of the goals for the itertools module was to be a generic toolkit for
> chopping-up, modifying, and splicing iterator streams (sort of a CRISPR
> for iterators).
I'd say it was overwhelmingly successful at that goal. The rub here
appears to be that `x` on its own is not a stream - it has to be
wrapped inside an iterable first to play nice with stream-based tools.
In a stream-based language like Haskell, there's usually a "cons"
operation built in to prepend a scalar to a stream; Haskell's `x : it`
is pretty much the same as `chain([x], it)`.
> The docs probably need another recipe to show this pattern:
>
> def prepend(value, iterator):
>     "prepend(1, [2, 3, 4]) -> 1 2 3 4"
>     return chain([value], iterator)
+1. Whether `accumulate()` should grow a `start=` argument still
seems a distinct (albeit related) issue to me, though.
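For concreteness, here's the proposed recipe dropped into the
running-sum pattern from the loop above (toy values for `first` and
`deltas` again):

```python
from itertools import accumulate, chain, cycle, islice

def prepend(value, iterator):
    "prepend(1, [2, 3, 4]) -> 1 2 3 4"
    return chain([value], iterator)

# Hypothetical stand-ins for buildtab()'s outputs.
first, deltas = 5, [4, 2, 4]

# The recipe reads like cons and makes the starting value explicit:
nums = accumulate(prepend(first, cycle(deltas)))
result = list(islice(nums, 5))
print(result)  # [5, 9, 11, 15, 19]
```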