On Wed, Mar 6, 2019 at 10:31 PM Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:

> You might as well say that using the + operator on vectors is
> nonsense, because len(v1 + v2) is not in general equal to
> len(v1) + len(v2).
>
> Yet mathematicians are quite happy to talk about "addition"
> of vectors.

Vector addition is *actual* addition, not concatenation. You're so busy loosening the definition of + to make it make sense for dicts that you've forgotten that + is, first and foremost, about addition in the mathematical sense, where vector addition is just one type of addition. Concatenation is already a minor abuse of +, but one commonly accepted by programmers, thanks to it having some similarities to addition and a single, unambiguous set of semantics to avoid confusion.

You're defending + on dicts on the grounds that vector addition already isn't concatenation, which only shows how muddled things get when you try to use + to mean multiple concepts that are at best loosely related.

The closest I can come to a thorough definition of what + does in Python (and most languages) right now is that it:

1. Returns a new thing of the same type (or a shared coerced type for number weirdness)

2. That combines the information of the input operands

3. Is associative ((a + b) + c produces the same thing as a + (b + c)) (modulo floating point weirdness)

4. Is "reversible": Knowing the end result and *one* of the inputs is sufficient to determine the value of the other input; that is, for c = a + b, knowing any two of a, b and c allows you to determine a single unambiguous value for the remaining one (numeric coercion and floating point weirdness make this not 100%, but you can at least know a value equal to the other value; e.g. for c = a + b, knowing c is 5.0 and a is 1.0 is sufficient to say that b is equal to 4, even if you can't tell whether it's an int or a float). For numbers, reversal is done with -; for sequences, it's done by slicing c using the length of a or b to "subtract" the elements that came from a/b.

5. (Actual addition only) Is commutative (modulo floating point weirdness); a + b == b + a

6. (Concatenation only) Is order preserving (really a natural consequence of #4, but a property that people expect)
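
As a rough sketch of rules #3 through #6 for numbers and lists (the helper names recover_right and recover_left are made up for illustration, not anything in the stdlib):

```python
def recover_right(c, a):
    """Given c == a + b and a (both sequences), return b by slicing off a's elements."""
    return c[len(a):]


def recover_left(c, b):
    """Given c == a + b and b (both sequences), return a by slicing off b's elements."""
    return c[:len(c) - len(b)]


# Actual addition: reversible with -, and commutative (#5).
a, b = 1.0, 4
c = a + b
assert c - a == b        # b is recovered (as a value equal to 4)
assert a + b == b + a

# Concatenation: associative (#3), reversible by slicing (#4),
# order preserving (#6), but not commutative.
x, y, z = [1, 2], [3], [4, 5]
s = x + y
assert (x + y) + z == x + (y + z)
assert recover_right(s, x) == y
assert recover_left(s, y) == x
assert x + y != y + x
```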

Note that these rules are consistent across most major languages that allow + to mean combine collections (the few that disagree, like Pascal, don't support | as a union operator).

Concatenation is missing element #5, but otherwise aligns with actual addition. dict merges (and set unions for that matter) violate #4 and #6; for c = a + b, knowing c and either a or b still leaves a literally infinite set of possible values for the other input (it's not infinite for sets, where the options would be a subset of the result, but for dicts, there would be no such limitation; keys from b could exist with any possible value in a). The order-preserving aspect of dicts *almost* satisfies #6, but not quite (if 'x' comes after 'y' in b, there is no guarantee that it will do so in c, because a gets first say on ordering, and b gets the final word on value).
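
A small illustration of both violations, assuming the proposed a + b would behave like {**a, **b} (merge is a hypothetical stand-in for the operator, and the ordering part relies on the insertion-order guarantee of Python 3.7+):

```python
def merge(a, b):
    """Hypothetical stand-in for the proposed dict a + b: b's values win."""
    return {**a, **b}


# Violates #4: knowing c and b does not pin down a.
b = {'x': 1}
c = {'x': 1, 'y': 2}
candidates = [
    {'y': 2},
    {'x': 0, 'y': 2},
    {'x': 'anything at all', 'y': 2},
]
assert all(merge(a, b) == c for a in candidates)

# Violates #6: a decides where 'x' sits, b decides what it holds.
a2 = {'x': 0}
b2 = {'y': 1, 'x': 2}
c2 = merge(a2, b2)
assert list(b2) == ['y', 'x']   # 'x' comes after 'y' in b2
assert list(c2) == ['x', 'y']   # but before 'y' in the result
assert c2['x'] == 2             # while its value comes from b2
```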

Allowing dicts to get involved in + means:

1. Fewer consistent rules apply to +;

2. The particular idiosyncrasies of Python dict ordering and "which value wins" rules are now tied to +. (For concatenation, there is only one set of possible rules AFAICT, so every language naturally agrees on behavior, but dict merging obviously has many possible rules that would be unlikely to match the exact rules of any other language except by coincidence.) a winning on order and b winning on value is a historical artifact of how Python's dict developed; I doubt any other language would intentionally choose to split responsibility like that if they weren't handcuffed by history.

Again, there's nothing wrong with making dict merges easier. But it shouldn't be done by (further) abusing +.

-Josh Rosenberg