Steven D'Aprano wrote:
On Sun, Oct 20, 2019 at 11:48:10PM -0000, Dominik Vilsmeier wrote:
Regarding the "|" operator, I think a drawback is the resemblance to "or" (after all it's associated with "__or__"), so people might assume behavior similar to x or y, where x takes precedence (for truthy values of x). So when reading d1 | d2 one could falsely assume that values in d1 take precedence over the ones in d2 for conflicting keys. And this is also the existing set behavior (though it's not really relevant in this case):

There's a much easier way to demonstrate what you did:
    >>> {1} | {1.0}
    {1}
In any case, dict.update already has this behaviour:
    >>> d = {1: 'a'}
    >>> d.update({1.0: 'A'})
    >>> d
    {1: 'A'}
The existing key is kept, only the value is changed. The PEP gives a proposed implementation, which if I remember correctly is:

    # d1 | d2
    d = d1.copy()
    d.update(d2)
so it will keep the current dict behaviour:
- keys are stable (first key seen wins)
- values are updated (last value seen wins)
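To make the two rules concrete, here is a minimal sketch that wraps the copy-then-update recipe quoted above in a function. The name `merge` and the sample dicts are made up for illustration and are not taken from the PEP; the key-retention part is what current CPython does, not necessarily a language guarantee.

```python
def merge(d1, d2):
    """Hypothetical helper: the copy-then-update recipe as a function."""
    d = d1.copy()   # shallow copy, so d1 itself is not modified
    d.update(d2)    # values from d2 overwrite values for equal keys
    return d

d1 = {1: 'a', 2: 'b'}
d2 = {1.0: 'A', 3: 'c'}
merged = merge(d1, d2)

# 1 == 1.0 and hash(1) == hash(1.0), so both keys address the same entry.
first_key = next(iter(merged))

print(merged)           # {1: 'A', 2: 'b', 3: 'c'}
print(type(first_key))  # <class 'int'>: the key object from d1 is kept
print(d1)               # {1: 'a', 2: 'b'}: the left operand is untouched
```

The key comes from the first dict, the value from the second.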
Exactly, so the dict "+" behavior would match the set "|" behavior, preserving the keys. But how many users will be concerned about whether the keys are going to be preserved? I guess almost everybody will want to know what happens with the values, and that question remains unanswered by just looking at the "+" or "|" syntax. It's reasonable to assume that values are preserved as well, i.e. that `d1 + d2` only adds the missing keys from `d2` to `d1`.

Of course, once you know that "+" is actually similar to "update" you can infer that the last value wins. But "+" simply doesn't read as "update". So in order to know, you'll have to look it up, and by that argument you could settle on basically any operator symbol for the update operation. A drawback of "+" is that different interpretations are plausible, which is evident from the ongoing discussion. Of course one can blame the programmer for not checking the documentation carefully enough, especially since "in the face of ambiguity, refuse the temptation to guess". But in the end the language should assist the programmer, and it's better not to introduce ambiguity in the first place.
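The two plausible readings can be written out with existing dict methods. This is only an illustrative sketch; the variable names `last_wins` and `first_wins` and the sample data are invented here, and neither spelling is claimed to be what an operator would do.

```python
d1 = {'a': 1, 'b': 2}
d2 = {'b': 20, 'c': 30}

# Reading 1: "update" semantics, the value from d2 wins on conflict.
last_wins = d1.copy()
last_wins.update(d2)

# Reading 2: "add the missing keys" semantics, the value from d1 is kept.
first_wins = d1.copy()
for key, value in d2.items():
    first_wins.setdefault(key, value)  # only inserts if key is absent

print(last_wins)   # {'a': 1, 'b': 20, 'c': 30}
print(first_wins)  # {'a': 1, 'b': 2, 'c': 30}
```

Both results have the same keys; they differ only in the value stored under the conflicting key 'b', which is exactly the part the operator symbol alone doesn't tell you.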
I think that, strictly speaking, this "keys are stable" behaviour is not guaranteed by the language reference. But it's probably so deeply built into the implementation of dicts that it is unlikely to ever change. (I think Guido mentioned something about it being a side-effect of the way dict __setitem__ works?)
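For what it's worth, plain item assignment shows the same thing without going through update. A quick sketch of what current CPython does; as said above, this is observed behaviour rather than something the language reference is known to promise.

```python
d = {1: 'a'}
d[1.0] = 'A'          # item assignment through an equal (but distinct) key

key = next(iter(d))   # the key object actually stored in the dict

print(d)          # {1: 'A'}
print(type(key))  # <class 'int'>: the original key object is still there
```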