<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Mar 5, 2019 at 11:16 PM Steven D'Aprano <<a href="mailto:steve@pearwood.info">steve@pearwood.info</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote:<br>

<br>

> I propose that the + sign merge two python dictionaries such that if <br>

> there are conflicting keys, a KeyError is thrown.<br>

<br>

This proposal is for a simple, operator-based equivalent to <br>

dict.update() which returns a new dict. dict.update has existed since <br>

Python 1.5 (something like a quarter of a century!) and has never grown a <br>

"unique keys" version.<br>

<br>

I don't recall even seeing a request for such a feature. If such a <br>

unique keys version is useful, I don't expect it will be useful often.<br>

<br></blockquote><div><br></div><div>I have one argument in favor of such a feature: It preserves concatenation semantics. + means one of two things in all code I've ever seen (Python or otherwise):</div><div><br></div><div>1. Numeric addition (including element-wise numeric addition as in Counter and numpy arrays)</div><div>2. Concatenation (where the result preserves all elements, in order, including, among other guarantees, that len(seq1) + len(seq2) == len(seq1 + seq2))</div><div><br></div><div>dict addition that didn't reject non-unique keys wouldn't fit *either* pattern; under the main proposal (making it equivalent to left.copy(), followed by .update(right)), the left hand side would win on ordering, the right hand side would win on values, and the length invariant of concatenation wouldn't be preserved. At least when repeated keys are rejected, most concatenation invariants are preserved; order is all of the left elements followed by all of the right, and no elements are lost.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> This way, d1 + d2 isn’t just another obvious way to do {**d1, **d2}.<br>

<br>

One of the reasons for preferring + is that it is an obvious way to do <br>

something very common, while {**d1, **d2} is as far from obvious as you <br>

can get without becoming APL or Perl :-)<br>

<br></blockquote><div><br></div><div>From the moment PEP 448 was published, I've been using unpacking as a more composable/efficient form of concatenation, merging, etc. I'm sorry you don't find it obvious, but a couple of e-mails back you said:</div><div><br>

"The Zen's prohibition against guessing in the face of ambiguity does not <br>

mean that we must not add a feature to the language that requires the <br>

user to learn what it does first.<span class="gmail-im">"</span></div><div><span class="gmail-im"><br></span></div><div>Learning to use the unpacking syntax in the case of function calls is necessary for tons of stuff (writing general function decorators, handling initialization in class hierarchies, etc.), and as PEP 448 is titled, this is just a generalization combining the features of unpacking arguments with collection literals.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

> The second syntax makes it clear that a new dictionary is being <br>

> constructed and that d2 overrides keys from d1.<br>

<br>

Only because you have learned the rule that {**d, **e} means to <br>

construct a new dict by merging, with the rule that in the event of <br>

duplicate keys, the last key seen wins. If you hadn't learned that rule, <br>

there is nothing in the syntax which would tell you the behaviour. We <br>

could have chosen any rule we liked:<br>

<br></blockquote><div><br></div><div>No, because we learned the general rule for dict literals that {'a': 1, 'a': 2} produces {'a': 2}; the unpacking generalizations were very good about adhering to the existing rules, so there was basically zero learning curve if you already knew the dict literal rules and the less general unpacking rules. The only part to "learn" is that when there is a conflict between dict literal rules and function call rules, dict literal rules win.<br></div></div><div class="gmail_quote"><br></div><div class="gmail_quote">To be clear: I'm not in favor of + raising an error on non-unique keys. Even if it makes dict + dict adhere to the rules of concatenation, I don't think it's common or useful functionality. My order of preferences is roughly:</div><div class="gmail_quote"><br></div><div class="gmail_quote">1. Do nothing (even if you don't like {**d1, **d2}, .copy() followed by .update() is obvious, and we don't need more than one way to do it)</div><div class="gmail_quote">2. Add a new method to dict, e.g. dict.merge (whether it's a class method or an instance method is irrelevant to me)</div><div class="gmail_quote">3. Use | (because dicts are *far* more like sets than they are like sequences, and the semi-lossy rules of unioning make more sense there); it would also make - make sense, since + is only matched by - in numeric contexts; on collections, | and - are paired. And I consider the - functionality the most useful part of this whole proposal (because I *have* wanted to drop a collection of known blacklisted keys from a dict, and while it's obvious you can do it by looping, I have always wanted to be able to do something like d1.keys() -= badkeys, and I remain disappointed that nothing like it is available)</div><div class="gmail_quote"><br></div><div class="gmail_quote">-Josh Rosenberg<br></div></div>
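<div dir="ltr">For anyone who wants to check the behaviors discussed in this message, here is a minimal Python sketch. The names d1, d2, seq1, seq2 and badkeys and their values are made up for illustration, and the ordering checks assume a Python version where dicts preserve insertion order (3.7 or later).</div>

```python
# Illustrative values only; these names are assumptions for this sketch.
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 3}

# Sequence concatenation keeps every element, so the lengths add up.
seq1, seq2 = [1, 2], [2, 3]
assert len(seq1) + len(seq2) == len(seq1 + seq2)

# copy() followed by update(): the left side wins on ordering,
# the right side wins on values, and one entry for 'b' is lost.
merged = d1.copy()
merged.update(d2)
assert list(merged.items()) == [('a', 1), ('b', 20), ('c', 3)]
assert len(merged) != len(d1) + len(d2)

# Dict literals already resolve duplicate keys as "last one wins",
# and unpacking inside a dict display follows the same rule.
assert {'a': 1, 'a': 2} == {'a': 2}
assert {**d1, **d2} == merged

# Dropping a collection of unwanted keys by looping (here, a comprehension).
badkeys = {'b', 'c'}
kept = {k: v for k, v in d1.items() if k not in badkeys}
assert kept == {'a': 1}

# The keys view supports set difference, but the result is a plain set,
# not a dict, and it cannot be assigned back into the dict in place.
assert d1.keys() - badkeys == {'a'}
```

<div dir="ltr">The last assertion shows how far the existing tools go: d1.keys() - badkeys tells you which keys survive, but building the filtered dict still takes an explicit loop or comprehension.</div>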