Steven D'Aprano wrote:
On Sat, Oct 19, 2019 at 02:02:43PM -0400, David Mertz wrote:
The plus operation on two dictionaries feels far more natural as a vectorised merge, were it to mean anything. E.g., I'd expect

    >>> {'a': 5, 'b': 4} + {'a': 3, 'b': 1}
    {'a': 8, 'b': 5}

Outside of Counter when would this behaviour be useful?
For example, one could use dicts to represent data tables, with the keys being either indices or column names and the values being lists (rows or columns). Then for joining two such tables it would be desirable if values are added, because then you could simply do `joint_table = table1 + table2`. Or having a list of records from different sources:

    purchases_online = {'item1': [datetime1, datetime2, ...], 'item2': ...}
    purchases_store = {'item1': [datetime3, datetime4, ...], ...}
    purchases_overall = purchases_online + purchases_store  # Records should be concatenated.
    # Then doing some analysis on the overall purchases.

`pandas.Series` also behaves dict-like (almost) and does add the values on "+".
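To make the concatenation case concrete, here is a minimal sketch of such a merge in current Python. `add_merge` is a hypothetical helper name, not an existing or proposed API, and the date strings are made-up stand-ins for the elided datetime values above.

```python
from operator import add


def add_merge(left, right, combine=add):
    """Return a new dict; values under shared keys are combined with `combine`.

    Hypothetical helper for illustration only. Keys present in just one
    operand keep their value unchanged (the value object is shared, not copied).
    """
    merged = dict(left)
    for key, value in right.items():
        merged[key] = combine(merged[key], value) if key in merged else value
    return merged


# Placeholder records; the strings stand in for datetime objects.
purchases_online = {'item1': ['2019-10-01', '2019-10-05'], 'item2': ['2019-10-02']}
purchases_store = {'item1': ['2019-10-03'], 'item3': ['2019-10-04']}

# Lists under the shared key 'item1' are concatenated.
purchases_overall = add_merge(purchases_online, purchases_store)

# The same helper adds numbers, as in the quoted example.
totals = add_merge({'a': 5, 'b': 4}, {'a': 3, 'b': 1})
```

Because `list + list` builds a new list, the inputs are not modified for shared keys.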
I expect that this feels natural to you because you're thinking about simple (dare I say "toy"?) examples like the above, rather than practical use-cases like "merging multiple preferences":

    prefs = defaults + system_prefs + user_prefs

    # or if you prefer the alternative syntax
    prefs = defaults | system_prefs | user_prefs
(Note that in this case, the lack of commutativity is a good thing: we want the last seen value to win.)
In this case you'd have to infer the order of precedence from the variable names, not the "+" syntax itself. I.e. if you had spelled it `a + b + c` I would have no idea whether `a` or `c` has highest precedence. Compare that with a "directed" operator symbol (again, I'm not particularly arguing for "<<"):

    prefs = defaults << system_prefs << user_prefs

Here it becomes immediately clear that `system_prefs` supersedes `defaults` and `user_prefs` supersedes the other two. A drawback of "+" here is that you can't infer this information from the syntax itself.

Also, I'm not sure this is a good example: if something in `system_prefs` changes, you'd have to recompute the whole thing (`prefs`), since you can't tell whether that setting was overwritten by `user_prefs`. I think in such a case it would be better to use `collections.ChainMap` for providing a hierarchy of preferences, which lets you easily update each level.
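A small sketch of the `ChainMap` approach, compared with an eagerly merged dict. The preference keys and values are invented for illustration.

```python
from collections import ChainMap

# Made-up preference levels, lowest to highest priority.
defaults = {'theme': 'light', 'font_size': 10}
system_prefs = {'font_size': 12}
user_prefs = {'theme': 'dark'}

# ChainMap searches its mappings left to right, so the
# highest-priority level goes first.
prefs = ChainMap(user_prefs, system_prefs, defaults)

# An eager "last seen value wins" merge, for comparison.
snapshot = {**defaults, **system_prefs, **user_prefs}

# Change one level after the fact.
system_prefs['font_size'] = 14

live_size = prefs['font_size']       # reflects the change, no recomputation
stale_size = snapshot['font_size']   # still the value from merge time
```

The `ChainMap` only holds references to the underlying dicts, so each level can be updated independently, whereas the merged dict has to be rebuilt.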
Dicts are a key:value store, not a multiset, and outside of specialised subclasses like Counter, we can't expect that adding the values is meaningful or even possible. "Adding the values" is too specialised and not general enough for dicts, as a slightly less toy example might show:

    d = ({'customerID': 12932063,
          'purchaseHistory': <Purchases object at 0xb7ce14d0>,
          'name': 'Joe Consumer',
          'rewardsID': 391187}
         +
         {'name': 'Jane Consumer', 'rewardsID': 445137}
         )
Having d['name'] be 'Joe ConsumerJane Consumer' and d['rewardsID'] be 836324 would be the very opposite of useful behaviour.
I agree that adding the values doesn't make sense for that example but neither does updating the values. Why would you want to take a record corresponding to "Joe Consumer" and partially update it with data from another consumer ("Jane Consumer")? Actually I couldn't tell what the result of that example should be.
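For reference, this is roughly what the two interpretations give for that record, sketched with existing syntax. `None` is a placeholder for the `Purchases` object, and the dict comprehension is just an ad hoc stand-in for "add the values".

```python
joe = {
    'customerID': 12932063,
    'purchaseHistory': None,  # placeholder for the Purchases object
    'name': 'Joe Consumer',
    'rewardsID': 391187,
}
jane = {'name': 'Jane Consumer', 'rewardsID': 445137}

# "Update" semantics: the right-hand values win for shared keys.
updated = {**joe, **jane}

# "Add the values" semantics: shared keys are combined with +.
added = {k: joe[k] + jane[k] if k in jane else joe[k] for k in joe}
```

With update semantics the result keeps Joe's `customerID` and purchase history but carries Jane's name and rewards ID; with addition the name and rewards ID are combined. Either way the record mixes two customers.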