On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote:
I propose that the + sign merge two python dictionaries such that if there are conflicting keys, a KeyError is thrown.
This proposal is for a simple, operator-based equivalent to dict.update() which returns a new dict. dict.update has existed since Python 1.5 (something like a quarter of a century!) and never grown a "unique keys" version. I don't recall even seeing a request for such a feature. If such a unique keys version is useful, I don't expect it will be useful often.
This way, d1 + d2 isn’t just another obvious way to do {**d1, **d2}.
One of the reasons for preferring + is that it is an obvious way to do something very common, while {**d1, **d2} is as far from obvious as you can get without becoming APL or Perl :-)

If I needed such a unique key version of update, I'd use a subclass:

    class StrictDict(dict):
        def __add__(self, other):
            if isinstance(other, dict) and (self.keys() & other.keys()):
                raise KeyError('non-unique keys')
            return super().__add__(other)

        # and similar for __radd__.

rather than burden the entire language, and every user of it, with having to learn the subtle difference between the obvious + operator and the error-prone and unobvious trick of {*d1, *d2}.

(Did you see what I did there? *wink*)
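The subclass above assumes the proposed dict.__add__ already exists. As a rough sketch of how the same idea could be written without it, the version below builds the result with a dict display instead of calling super(). The error message wording and the choice to return a StrictDict are my assumptions, not part of any proposal.

```python
class StrictDict(dict):
    """Hypothetical dict subclass whose + refuses to merge overlapping keys."""

    def _check_unique(self, other):
        # Keys present in both operands are treated as a conflict.
        duplicates = self.keys() & other.keys()
        if duplicates:
            raise KeyError(f"non-unique keys: {duplicates!r}")

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        self._check_unique(other)
        # No conflicts, so the order of unpacking only affects key order.
        return type(self)({**self, **other})

    def __radd__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        self._check_unique(other)
        return type(self)({**other, **self})


merged = StrictDict(a=1) + {'b': 2}
print(merged)  # {'a': 1, 'b': 2}
```

Only code that opts in to StrictDict pays for the extra check; plain dicts are unaffected.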
The second syntax makes it clear that a new dictionary is being constructed and that d2 overrides keys from d1.
Only because you have learned the rule that {**d, **e} means to construct a new dict by merging, with the rule that in the event of duplicate keys, the last key seen wins. If you hadn't learned that rule, there is nothing in the syntax which would tell you the behaviour. We could have chosen any rule we liked:

- raise an exception, like the SyntaxError you get if you pass the same keyword argument to a function twice: spam(foo=1, foo=2);

- first value seen wins;

- last value seen wins;

- random value wins;

- anything else we liked!

There is nothing "clear" about the syntax which makes it obvious which behaviour is implemented. We have to learn it.
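To illustrate that the ** syntax by itself does not settle the rule, here is a small sketch showing the same unpacking behaving two different ways: last value wins inside a dict display, but a duplicate key raises TypeError in a function call. The names d, e and spam are throwaway examples.

```python
d = {'foo': 1, 'bar': 2}
e = {'foo': 3}

# Dict display: duplicate keys are allowed and the last value seen wins.
merged = {**d, **e}
print(merged)  # {'foo': 3, 'bar': 2}


def spam(**kwargs):
    return kwargs


# Function call: the same ** unpacking rejects the duplicate key instead.
call_error = None
try:
    spam(**d, **e)
except TypeError as exc:
    call_error = exc

print(type(call_error).__name__)  # TypeError
```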
One can reasonably expect or imagine a situation where a section of code that expects to merge two dictionaries with non-conflicting keys commits a semantic error if it merges two dictionaries with conflicting keys.
I can imagine it, but I don't think I've ever needed it, and I can't imagine wanting it often enough to wish it was not just a built-in function or method, but actual syntax. Do you have some real examples of wanting an error when trying to update a dict if keys match?
To better explain, imagine a program where options is a global variable storing parsed values from the command line.
    def verbose_options():
        if options.quiet:
            return {'verbose': True}

    def quiet_options():
        if options.quiet:
            return {'verbose': False}
That seems very artificial to me. Why not use a single function:

    def verbose_options():  # There's more than one?
        return {'verbose': not options.quiet}

The way you have written those functions seems weird to me. You already have a nice options object, with named fields like "options.quiet", why are you turning it into not one but *two* different dicts, both reporting the same field?

And it's buggy: if options.quiet is True, then the key 'quiet' should be True, not the 'verbose' key.

Do you have *two* functions for every preference setting that takes a true/false flag? What do you do for preference settings that take multiple values? Create a vast number of specialised functions, one for each possible value?

    def A4_page_options():
        if options.page_size == 'A4':
            return {'page_size': 'A4'}

    def US_Letter_page_options():
        if options.page_size == 'US Letter':
            return {'page_size': 'US Letter'}

    page_size = (
        A4_page_options() + A3_page_options() + A5_page_options()
        + Foolscap_page_options() + Tabloid_page_options()
        + US_Letter_page_options() + US_Legal_page_options()
        # and about a dozen more...
        )

The point is, although I might be wrong, I don't think that this example is a practical, realistic use-case for a unique keys version of update. To me, your approach seems so complicated and artificial that it seems like it was invented specifically to justify this "unique key" operator, not something that we would want to write in real life.

But even if it is real code, the question is not whether it is EVER useful for a dict update to raise an exception on matching keys. The question is whether this is so often useful that this is the behaviour we want to make the default for dicts.

[...]
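For concreteness, here is a minimal runnable sketch of the single-function version. It assumes options is an argparse.Namespace with a --quiet flag; the original example does not say how options is built, so the parser setup is purely illustrative.

```python
import argparse

# Assumed setup: a store_true flag, parsed from a fixed argument list
# so the example does not depend on the real command line.
parser = argparse.ArgumentParser()
parser.add_argument('--quiet', action='store_true')
options = parser.parse_args(['--quiet'])


def verbose_options():
    # One function covers both cases; there is nothing to merge.
    return {'verbose': not options.quiet}


print(verbose_options())  # {'verbose': False}
```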
Again, I propose that the + sign merge two python dictionaries such that if there are conflicting keys, a KeyError is thrown, because such “non-conflicting merge” behavior would be useful in Python.
I don't think it would be, at least not often. If it were common enough to justify a built-in operator to do this, we would have had many requests for a dict.unique_update or similar by now, and I don't think we have.
It gives clarifying power to the + sign. The + and the {**, **} should serve different roles.
In other words, explicit + is better than implicit {**, **}, unless explicitly suppressed. Here + is explicit whereas {**, **} is implicitly allowing inclusive keys,
If I had a cent for every time people misused "explicit" to mean "the proposal that I like", I'd be rich.

In what way is the "+" operator *explicit* about raising an exception on duplicate keys? These are both explicit:

    merge_but_raise_exception_if_any_duplicates(d1, d2)

    merge(d1, d2, raise_if_duplicates=True)

and these are both equally implicit:

    d1 + d2

    {**d1, **d2}

since the behaviour on duplicates is not explicitly stated in clear and obvious language, but implied by the rules of the language.

[...]
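As a sketch of what the explicit spelling might look like as an ordinary function: merge and raise_if_duplicates are the hypothetical names used above, not an existing API, and defaulting the flag to False is my assumption.

```python
def merge(d1, d2, raise_if_duplicates=False):
    """Return a new dict combining d1 and d2 (hypothetical helper).

    With raise_if_duplicates=True, keys present in both inputs raise
    KeyError; otherwise values from d2 override those from d1.
    """
    if raise_if_duplicates:
        duplicates = d1.keys() & d2.keys()
        if duplicates:
            raise KeyError(f"duplicate keys: {duplicates!r}")
    return {**d1, **d2}


print(merge({'a': 1}, {'a': 2}))                            # {'a': 2}
print(merge({'a': 1}, {'b': 2}, raise_if_duplicates=True))  # {'a': 1, 'b': 2}
```

Here the behaviour on duplicates is stated at the call site, which is what makes it explicit; neither operator spelling can say that.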
People expect the + operator to be commutative
They are wrong to expect that, because the + operator is already not commutative for:

    str
    bytes
    bytearray
    list
    tuple
    array.array
    collections.deque
    collections.Counter

and possibly others.

--
Steven