[Python-ideas] collections.Counter should implement __mul__, __rmul__
Tim Peters
tim.peters at gmail.com
Wed Apr 18 15:34:15 EDT 2018
[Serhiy Storchaka <storchaka at gmail.com>]
> There are methods update() and subtract() which are similar to operators "+"
> and "-", but don't discard non-positive values.
Yup.
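A minimal sketch of that difference, using only the existing Counter API (the variable names are just for illustration):

```python
from collections import Counter

a = Counter(x=1, y=2)
b = Counter(x=3, y=1)

# Binary "-" builds a new Counter and drops results that are <= 0.
diff = a - b                 # only 'y' survives, with count 1

# subtract() mutates in place and keeps non-positive results.
kept = Counter(a)
kept.subtract(b)             # 'x' is kept with count -2

# Same story for "+" versus update().
neg = Counter(x=-3)
summed = neg + Counter(x=1)  # -2 is discarded, so the result is empty
neg.update(Counter(x=1))     # 'x' is kept with count -2
```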
> I expect that "*" and "/" discard non-positive values for consistency with
> "+" and "-". And a new method should be added which does multiplication
> without discarding non-positive values.
Counter supports a wonderfully weird mix of methods driven by use
cases, not by ideology.
+ (binary)
- (binary)
|
&
have semantics driven by viewing a Counter as a multiset
implementation. That's why they discard values <= 0. They
correspond, respectively, to "the standard" multiset operations of sum
(disjoint union), difference, union, and intersection.
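For concreteness, here is a small sketch of the four operators on two "legit" multisets (the sample counts are made up for illustration):

```python
from collections import Counter

p = Counter(a=3, b=1)
q = Counter(a=1, b=2)

msum = p + q     # multiset sum: counts add
mdiff = p - q    # multiset difference: 'b' would be -1, so it is dropped
munion = p | q   # multiset union: max of the counts
minter = p & q   # multiset intersection: min of the counts
```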
That the unary versions of '+' and '-' also discard values <= 0 is
justified by saying "because they're shorthand for what the binary
operator does when given an empty Counter as the left argument", but
they're not standard multiset operations on their own.
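A quick sketch of that shorthand, with a Counter that deliberately holds a negative and a zero count:

```python
from collections import Counter

c = Counter(a=2, b=-1, z=0)

pos = +c   # keeps only the strictly positive counts
neg = -c   # negates, then keeps only what is now strictly positive

# Both agree with the binary operator given an empty left argument.
same_pos = Counter() + c
same_neg = Counter() - c
```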
Nothing else in Counter is trying to cater to the multiset view; the
rest serves other use cases. And that's why "*" and "/" should do what
everyone _expects_ them to do ;-) There are no analogous multiset
operations to justify them caring at all what the values are.
If Raymond had it to do over again, I'd suggest that only "-" discard
values <= 0. The other operators deliver legit(*) multisets _given_
that their arguments are legit multisets - only "-" has to care about
creating an illegitimate (for a multiset) value from legit multiset
arguments.
But there's no good reason for "*" or "/" to care at all. They
don't make sense for multisets. After, e.g.,
c /= sum(c.values())
it's sane to expect that the new sum(c.values()) is close to 1
regardless of the numeric types or signs of the original values.
Indeed, normalizing values so that their sum _is_ close to 1 is a
primary use case motivating the current change.
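Counter has no "/=" today, so the following is only a sketch of the expected behavior. The helper name normalized() is hypothetical, not part of collections:

```python
from collections import Counter

def normalized(c):
    """Hypothetical helper: divide every value by the total.

    Nothing is discarded, whatever the sign of the value.
    """
    total = sum(c.values())
    return Counter({k: v / total for k, v in c.items()})

c = Counter(a=3, b=-1, z=2)   # mixed signs on purpose
n = normalized(c)
new_total = sum(n.values())   # expected to be close to 1
```

If the division threw away the negative entry for 'b', the remaining values would no longer sum to 1, which is the point being made above.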
Note I suggested before rearranging the docs to make clear that the
multiset view is just a part of what Counter is intended to be used
for, and that only a handful of specific operations are intended to
support it.
(*) "legit" meaning that all values are integers > 0