[Python-ideas] PEP: Dict addition and subtraction
Guido van Rossum
guido at python.org
Fri Mar 8 11:55:43 EST 2019
On Thu, Mar 7, 2019 at 9:12 PM Stephen J. Turnbull <
turnbull.stephen.fw at u.tsukuba.ac.jp> wrote:
> Ka-Ping Yee writes:
> > On Wed, Mar 6, 2019 at 4:01 PM Chris Angelico <rosuav at gmail.com> wrote:
> > > But adding dictionaries is fundamentally *useful*. It is expressive.
> > It is useful. It's just that + is the wrong name.
> First, let me say that I prefer ?!'s position here, so my bias is made
> apparent. I'm also aware that I have biases so I'm sympathetic to
> those who take a different position.
TBH, I am warming up to "|" as well.
> Rather than say it's "wrong", let me instead point out that I think
> it's pragmatically troublesome to use "+". I can think of at least
> four interpretations of "d1 + d2"
> 1. update
> 2. multiset (~= collections.Counter addition)
I guess this explains the behavior of removing results <= 0; it makes sense
as multiset subtraction, since in a multiset a negative count makes little
sense. (Though the name Counter certainly doesn't seem to imply multiset.)
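To make the multiset behavior concrete, here is a small sketch (the counter
contents are made up for illustration). The binary operators drop any count
that ends up <= 0, while the in-place subtract() method keeps it:

```python
from collections import Counter

a = Counter(x=3, y=1)
b = Counter(x=1, y=2)

# Binary "-" is multiset subtraction: y would be 1 - 2 = -1, so it is dropped.
diff = a - b
print(diff)        # Counter({'x': 2})

# Binary "+" adds counts per key (and would also drop non-positive results).
total = a + b
print(total)       # Counter({'x': 4, 'y': 3})

# By contrast, the subtract() method keeps zero and negative counts.
kept = Counter(a)
kept.subtract(b)
print(kept["y"])   # -1
```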
> 3. addition of functions into the same vector space (actually, a
> semigroup will do ;-), and this is the implementation of
> 4. "fiberwise" set addition (ie, of functions into relations)
> and I'm very jet-lagged so I may be missing some.
> There's also the fact that the operations denoted by "|" and "||" are
> often implemented as "short-circuiting", and therefore not
> commutative, while "+" usually is (and that's reinforced for
> mathematicians who are trained to think of "+" as the operator for
> Abelian groups, while "*" is a (possibly) non-commutative operator). I
> know commutativity of "+" has been mentioned before, but the
> non-commutativity of "|" -- and so unsuitability for many kinds of
> dict combination -- hasn't been emphasized before IIRC.
I've never heard of single "|" being short-circuiting. ("||" of course is
infamous for being that in C and most languages derived from it.)
And "+" is of course used for many non-commutative operations in Python
(e.g. adding two lists/strings/tuples together). It is only *associative*,
a weaker requirement that just says (A + B) + C == A + (B + C). (This is
why we write A + B + C, since the grouping doesn't matter for the result.)
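A quick sanity check of this with lists and strings (the values are arbitrary
examples, nothing special about them):

```python
a, b, c = [1], [2], [3]

# Concatenation is not commutative: the operand order shows up in the result.
list_commutes = (a + b == b + a)                 # [1, 2] vs [2, 1]

# But it is associative: the grouping does not matter.
list_associates = ((a + b) + c == a + (b + c))   # both are [1, 2, 3]

s, t, u = "ab", "cd", "ef"
str_commutes = (s + t == t + s)                  # "abcd" vs "cdab"
str_associates = ((s + t) + u == s + (t + u))    # both are "abcdef"

print(list_commutes, list_associates, str_commutes, str_associates)
```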
Anyway, while we're discussing mathematical properties, and since SETL was
briefly mentioned, I found an interesting thing in math. For sets, union
and intersection are distributive over each other. I can't type the
operators we learned in high school, so I'll use Python's set operations.
We find that A | (B & C) == (A | B) & (A | C). We also find that A & (B |
C) == (A & B) | (A & C).
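As an illustration (not a proof), both identities can be checked by brute
force over every subset of a small universe; the three-element universe here
is an arbitrary choice:

```python
from itertools import combinations, product

universe = [1, 2, 3]

# All 8 subsets of the universe.
subsets = [set(combo)
           for r in range(len(universe) + 1)
           for combo in combinations(universe, r)]

# Union distributes over intersection.
union_over_intersection = all(
    A | (B & C) == (A | B) & (A | C)
    for A, B, C in product(subsets, repeat=3)
)

# Intersection distributes over union.
intersection_over_union = all(
    A & (B | C) == (A & B) | (A & C)
    for A, B, C in product(subsets, repeat=3)
)

print(union_over_intersection, intersection_over_union)
```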
Note that this is *not* the case for + and * when used with (mathematical)
numbers: * distributes over +: a * (b + c) == (a * b) + (a * c), but + does
not distribute over *: a + (b * c) != (a + b) * (a + c). So in a sense,
SETL (which uses + and * for union and intersection) got the operators wrong.
Note that in Python, + and * for sequences are not distributive this way,
since (A + B) * n is not the same as (A * n) + (B * n). OTOH A * (n + m) ==
A * n + A * m. (Assuming A and B are sequences of the same type, and n and
m are positive integers.)
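A small example of both observations, with arbitrarily chosen numbers and
lists:

```python
# Numbers: * distributes over +, but + does not distribute over *.
a, b, c = 2, 3, 4
mul_over_add = (a * (b + c) == (a * b) + (a * c))    # 14 == 6 + 8
add_over_mul = (a + (b * c) == (a + b) * (a + c))    # 14 vs 5 * 6 == 30

# Sequences: repetition does not distribute over concatenation...
A, B = [1, 2], [3]
n, m = 2, 3
grouped = (A + B) * n        # [1, 2, 3, 1, 2, 3]
separate = A * n + B * n     # [1, 2, 1, 2, 3, 3]

# ...but repeating by a sum is the same as concatenating the repetitions.
repeat_sum = (A * (n + m) == A * n + A * m)

print(mul_over_add, add_over_mul, grouped == separate, repeat_sum)
```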
If we were to use "|" and "&" for dict "union" and "intersection", the
mutual distributive properties will hold.
> Since "|" (especially "|=") *is* suitable for "update", I think we
> should reserve "+" for some future commutative extension.
One argument is that sets have an update() method aliased to "|=", so this
makes it more reasonable to do the same for dicts, which also have an
update() method, with similar behavior (not surprising, since sets were
modeled after dicts).
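For comparison, a sketch of the existing behavior being referred to (the
sample values are arbitrary). It uses only set "|=", set.update() and
dict.update(), all of which exist today:

```python
# Sets: "|=" and update() both merge the right operand in place.
s1 = {1, 2}
s1 |= {2, 3}

s2 = {1, 2}
s2.update({2, 3})

# Dicts: update() also merges in place; for keys present on both sides,
# the value from the argument wins.
d = {"a": 1, "b": 2}
d.update({"b": 20, "c": 30})

print(s1 == s2)   # True
print(d)          # {'a': 1, 'b': 20, 'c': 30}
```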
> In the spirit of full disclosure:
> Of these, 2 is already implemented and widely used, so we don't need
> to use dict.__add__ for that. I've never seen 4 in the mathematical
> literature (union of relations is not the same thing). 3, however, is
> very common both for mappings with small domain and sparse
> representation of mappings with a default value (possibly computed
> then cached), and "|" is not suitable for expressing that sort of
> addition (I'm willing to say it's "wrong" :-).
--Guido van Rossum (python.org/~guido)