[Python-ideas] PEP: Dict addition and subtraction

Wed Mar 6 18:51:13 EST 2019

```
On Wed, Mar 6, 2019 at 10:31 PM Greg Ewing <greg.ewing at canterbury.ac.nz>
wrote:

>
> You might as well say that using the + operator on vectors is
> nonsense, because len(v1 + v2) is not in general equal to
> len(v1) + len(v2).
>
> [...] of vectors.
>
>
[...] loosening the definition of + as relates to [...], to make it make sense for
dicts that you've forgotten that + is, first and foremost, about addition
in the mathematical sense, where vector addition is just one type of
addition. Concatenation is already a minor abuse of +, but one commonly
accepted by programmers, thanks to it having some similarities to addition
and a single, unambiguous set of semantics to avoid confusion.

You're defending + on dicts because vector addition isn't concatenation
already, which only shows how muddled things get when you try to use + to
mean multiple concepts that are at best loosely related.

The closest I can come to a thorough definition of what + does in Python
(and most languages) right now is that:

1. Returns a new thing of the same type (or a shared coerced type for
number weirdness)
2. That combines the information of the input operands
3. Is associative ((a + b) + c produces the same thing as a + (b + c))
(modulo floating point weirdness)
4. Is "reversible": Knowing the end result and *one* of the inputs is
sufficient to determine the value of the other input; that is, for c = a +
b, knowing any two of a, b and c allows you to determine a single
unambiguous value for the remaining value (numeric coercion and floating
point weirdness make this not 100%, but you can at least know a value equal
to the other value; e.g. for c = a + b, knowing c is 5.0 and a is 1.0 is
sufficient to say that b is equal to 4, even if it's not necessarily an int
or float). For numbers, reversal is done with -; for sequences, it's done
by slicing c using the length of a or b to "subtract" the elements that
came from a/b.
5. (Actual addition only) Is commutative (modulo floating point weirdness);
a + b == b + a
6. (Concatenation only) Is order preserving (really a natural consequence
of #4, but a property that people expect)

Note that these rules are consistent across most major languages that allow
+ to mean combine collections (the few that disagree, like Pascal, don't
support | as a union operator).

Concatenation is missing element #5, but otherwise aligns with actual
addition. dict merges (and set unions for that matter) violate #4 and #6;
for c = a + b, knowing c and either a or b still leaves a literally
infinite set of possible inputs for the other input (it's not infinite for
sets, where the options would be a subset of the result, but for dicts,
there would be no such limitation; keys from b could exist with any
possible value in a). dicts' order-preserving aspect *almost* satisfies #6,
but not quite (if 'x' comes after 'y' in b, there is no guarantee that it
will do so in c, because a gets first say on ordering, and b gets the final
word on value).

Allowing dicts to get involved in + means:

1. Fewer consistent rules apply to +;
2. The particular idiosyncrasies of Python dict ordering and "which value
wins" rules are now tied to + (for concatenation, there is only one set of
possible rules AFAICT so every language naturally agrees on behavior, but
dict merging obviously has many possible rules that would be unlikely to
match the exact rules of any other language except by coincidence). a
winning on order and b winning on value is a historical artifact of how
Python's dict developed; I doubt any other language would intentionally
choose to split responsibility like that if they weren't handcuffed by
history.

Again, there's nothing wrong with making dict merges easier. But it
shouldn't be done by (further) abusing +.

-Josh Rosenberg
```
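
The "reversible" and "order preserving" points in the message above (rules 4 and 6) can be checked with a small sketch. It is an illustration only: `merge` is a hypothetical stand-in for the proposed `a + b` on dicts, assumed here to behave like `{**a, **b}` (a wins on key order, b wins on value), and all the variable names are made up for the example.

```python
# Hypothetical stand-in for the proposed "a + b" on dicts. Assumption:
# it behaves like {**a, **b}, so a decides key order and b decides values.
def merge(a, b):
    return {**a, **b}

# Rule 4 holds for sequences: knowing c and a, slicing recovers b exactly.
a_list, b_list = [1, 2], [3, 4, 5]
c_list = a_list + b_list
recovered = c_list[len(a_list):]

# Rule 4 fails for dict merges: two different left operands produce the
# same result, so knowing c and b does not pin down a.
b = {"x": 1, "y": 2}
a1 = {"x": 100, "z": 0}
a2 = {"x": -7, "z": 0}
c1 = merge(a1, b)
c2 = merge(a2, b)

# Rule 6 almost holds: 'x' comes after 'y' in b2, but before 'y' in the
# merged result, because a3 already fixed the position of 'x'.
a3 = {"x": 0}
b2 = {"y": 1, "x": 2}
c3 = merge(a3, b2)

print(recovered)           # the elements that came from b_list
print(c1 == c2, a1 != a2)  # same result from different inputs
print(list(b2), list(c3))  # key order in b2 versus in the merge
```

The key order shown for `c3` relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onward.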