On Tue, Mar 5, 2019 at 3:50 PM Josh Rosenberg <shadowranger+pythonideas@gmail.com> wrote:

On Tue, Mar 5, 2019 at 11:16 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote:

> I propose that the + sign merge two python dictionaries such that if
> there are conflicting keys, a KeyError is thrown.

This proposal is for a simple, operator-based equivalent to
dict.update() which returns a new dict. dict.update has existed since
Python 1.5 (something like a quarter of a century!) and never grown a
"unique keys" version.

I don't recall even seeing a request for such a feature. If such a
unique keys version is useful, I don't expect it will be useful often.

I have one argument in favor of such a feature: It preserves concatenation semantics. + means one of two things in all code I've ever seen (Python or otherwise):

1. Numeric addition (including element-wise numeric addition as in Counter and numpy arrays)
2. Concatenation (where the result preserves all elements, in order, with the guarantee, among others, that len(seq1) + len(seq2) == len(seq1 + seq2))

dict addition that didn't reject non-unique keys wouldn't fit *either* pattern; under the main proposal (making it equivalent to left.copy(), followed by .update(right)), the left hand side would win on ordering and the right hand side on values, and the length invariant of concatenation wouldn't be preserved. At least when repeated keys are rejected, most concatenation invariants are preserved; order is all of the left elements followed by all of the right, and no elements are lost.
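
A minimal sketch of that contrast, assuming Python 3.7+ (insertion-ordered dicts). The sample dicts are made up for illustration, and strict_merge is a hypothetical helper name, not an existing dict method or the proposal's actual implementation:

```python
left = {"a": 1, "b": 2}
right = {"b": 20, "c": 3}

# The main proposal: copy the left operand, then update it with the right.
merged = left.copy()
merged.update(right)

# Key order follows the left operand; the shared key takes the right's value.
print(list(merged.items()))  # [('a', 1), ('b', 20), ('c', 3)]

# The concatenation length invariant holds for lists but not for this merge.
seq1, seq2 = [1, 2], [2, 3]
print(len(seq1) + len(seq2) == len(seq1 + seq2))  # True
print(len(left) + len(right) == len(merged))      # False


def strict_merge(d1, d2):
    """Hypothetical helper: merge two dicts, raising KeyError on shared keys."""
    conflicts = d1.keys() & d2.keys()
    if conflicts:
        raise KeyError(f"conflicting keys: {sorted(map(repr, conflicts))}")
    result = d1.copy()
    result.update(d2)
    return result
```

With disjoint keys, strict_merge returns all of the left items followed by all of the right, so the length invariant holds; with a shared key such as "b" above, it raises instead of picking a winner.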

I must by now have seen dozens of posts complaining about this aspect of the proposal. I think this is just making up rules (e.g. "+ never loses information") to deal with an aspect of the design where a *choice* must be made. This may reflect the Zen of Python's "In the face of ambiguity, refuse the temptation to guess." But really, that's a pretty silly rule (truly, they aren't all winners). Good interface design constantly makes choices in ambiguous situations, because the alternative is constantly asking, and that's just annoying.

We have a plethora of examples (in fact, almost all alternatives considered) of situations related to dict merging where a choice is made between conflicting values for a key, and it's always the value further to the right that wins: from d[k] = v (which overrides the value when k is already in the dict) to d1.update(d2) (which lets the values in d2 win), including the much lauded {**d1, **d2}. Even plain {'a': 1, 'a': 2} has a well-defined meaning where the latter value wins.
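
The cases listed above can be checked directly. This is only a sketch with made-up sample dicts (the variable names are illustrative), assuming Python 3.5+ for the unpacking form:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

# Item assignment overrides the value of a key that is already present.
assigned = dict(d1)
assigned["b"] = 20

# update() lets the values from its argument win.
updated = dict(d1)
updated.update(d2)

# Unpacking into a dict display: the later mapping wins.
unpacked = {**d1, **d2}

# A dict display with a duplicate key keeps the last value given.
literal = {"a": 1, "a": 2}

print(assigned)  # {'a': 1, 'b': 20}
print(updated)   # {'a': 1, 'b': 20, 'c': 3}
print(unpacked)  # {'a': 1, 'b': 20, 'c': 3}
print(literal)   # {'a': 2}
```

None of these forms raises on the conflicting key; each one quietly keeps the rightmost value.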

As to why raising is worse: First, none of the other situations I listed above raises for conflicts. Second, there's the experience of str+unicode in Python 2, which raises if the str argument contains any non-ASCII bytes. In fact, we disliked it so much that we changed the language incompatibly to deal with it.

--Guido van Rossum (python.org/~guido)