[Python-ideas] Start argument for itertools.accumulate() [Was: Proposal: A Reduce-Map Comprehension and a "last" builtin]
Tim Peters
tim.peters at gmail.com
Sun Apr 8 01:00:22 EDT 2018
Nick, sorry, but your arguments still make little sense to me. I
think you're pushing an analogy between `sum()` details and
`accumulate()` waaaaay too far, changing a simple idea into a
needlessly complicated one.
`accumulate()` can do anything at all it wants to do with a `start`
argument (if it grows one), and a "default" of start=0 makes no sense:
unlike `sum()`, `accumulate()` is not

    specifically for use with numeric values and may
    reject non-numeric types [from the `sum()` docs]

`accumulate()` accepts any two-argument function.
>>> itertools.accumulate([1, 2, 3], lambda x, y: str(x) + str(y))
<itertools.accumulate object at 0x0000028AB1B3B448>
>>> list(_)
[1, '12', '123']
Arguing that it "has to do" something exactly the way `sum()` happens
to be implemented just doesn't follow - not even if they happen to
give the same name to an optional argument. If the function were
named `accumulate_sum()`, and restricted to numeric types, maybe - but
it's not.
[Nick Coghlan <ncoghlan at gmail.com>]
> ...
> That concern mostly goes away if the new parameter is deliberately
> called something *other than* "start" (e.g. "prepend=value", or
> "first=value"), but it could also be addressed by offering a dedicated
> "yield_start" toggle, such that the revised semantics were:
>
> def accumulate(iterable, func=operator.add, start=0, yield_start=False):
>     it = iter(iterable)
>     total = start
>     if yield_start:
>         yield total
>     for element in it:
>         total = func(total, element)
>         yield total
>
> That approach would have the advantage of making the default value of
> "start" much easier to document (since it would just be zero, the same
> as it is for sum()), and only the length of the input iterable and
> "yield_start" would affect how many partial sums were produced.
As above, start=0 is senseless for `accumulate` (even though it makes
sense for `sum`). Raymond gave the obvious implementation in his
original message.
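
To make the objection concrete, here is a small sketch that copies the quoted proposal under the hypothetical name `accumulate_proposed` and feeds it the same string-building function used earlier. The name and the `concat` helper are illustrative assumptions only; nothing here is part of `itertools`.

```python
import itertools
import operator

def accumulate_proposed(iterable, func=operator.add, start=0, yield_start=False):
    # Transcription of the quoted proposal; the name is hypothetical.
    it = iter(iterable)
    total = start
    if yield_start:
        yield total
    for element in it:
        total = func(total, element)
        yield total

def concat(x, y):
    # Same two-argument function as in the interactive example above.
    return str(x) + str(y)

# Today's behavior: the first element is passed through untouched.
current = list(itertools.accumulate([1, 2, 3], concat))

# With an implicit start=0, the zero leaks into every result.
proposed = list(accumulate_proposed([1, 2, 3], concat))

print(current)   # [1, '12', '123']
print(proposed)  # ['01', '012', '0123']
```

An implicit 0 silently changes the output for any function where 0 is not an identity element, which is the sense in which the default is senseless here.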
If you reworked your implementation to accommodate that NO sensible
default for `start` exists except for the one Raymond used (a unique
object private to the implementation, so he knows for sure whether or
not `start` was passed), you'd end up with his implementation ;-)
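
For readers without the original message at hand, the sentinel approach might look roughly like the sketch below. It is a reconstruction under assumptions, not a copy of Raymond's code: the names `accumulate_with_start` and `_MISSING` are made up for illustration.

```python
import operator

# Unique object private to this module. No caller can pass it by accident,
# so "start is _MISSING" reliably means "no start was given".
_MISSING = object()

def accumulate_with_start(iterable, func=operator.add, start=_MISSING):
    it = iter(iterable)
    if start is _MISSING:
        # No start: seed the running total from the first element,
        # exactly as accumulate() does today.
        try:
            total = next(it)
        except StopIteration:
            return
    else:
        # Start given: it becomes the first value yielded.
        total = start
    yield total
    for element in it:
        total = func(total, element)
        yield total

print(list(accumulate_with_start([1, 2, 3])))            # [1, 3, 6]
print(list(accumulate_with_start([1, 2, 3], start=10)))  # [10, 11, 13, 16]
print(list(accumulate_with_start([], start=10)))         # [10]
```

Because the sentinel is compared by identity, even `start=None` or `start=0` counts as an explicitly supplied value.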
`yield_start` looks like a nuisance in any case. As already
explained, most uses want the `start` value if it's given, and in
cases where it isn't it's trivial to discard by doing `next()` once on
the result. Of course it could be added - but why bother?
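
A short sketch of the "discard it with `next()`" idiom. Since `accumulate()` has no `start` argument at the time of writing, the start value is emulated here by chaining it in front of the data; that emulation is an assumption made for the example, not a proposed API.

```python
import itertools
import operator

start = 10
data = [1, 2, 3]

# Emulate a start value: accumulate() yields its first input unchanged,
# so prepending `start` makes it the first value produced.
running = itertools.accumulate(itertools.chain([start], data), operator.add)

# A caller who does not want the start value drops it with one next() call.
next(running)

rest = list(running)
print(rest)  # [11, 13, 16]
```

One extra line at the call site covers the minority case, which is the argument against a dedicated `yield_start` flag.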