
On Mon, Mar 04, 2019 at 03:33:36PM -0500, Neil Girdhar wrote:
Maybe, but reading through the various replies, it seems that if you are adding "-" to be analogous to set difference, then the combination operator should be analogous to set union "|".
That's the purpose of this discussion, to decide whether dict merging is more like addition/concatenation or union :-)
And it also opens an opportunity to add set intersection "&".
What should intersection do in the case of matching keys?

I see the merge + operator as a kind of update, whether it makes a copy or does it in place, so to me it is obvious that "last seen wins" should apply just as it does for the update method.

But dict *intersection* is a more abstract operation than merge/update. And that leads to the problem, what do you do with the values?

    {key: "spam"} & {key: "eggs"}

    # could result in any of:

    {key: "spam"}
    {key: "eggs"}
    {key: ("spam", "eggs")}
    {key: "spameggs"}
    an exception
    something else?

Unlike "update", I don't have any good use-cases to prefer any one of those over the others.
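To make the ambiguity concrete, here is a minimal sketch in which the choice of value is pushed onto the caller. The function name `intersect` and its `combine` callback are hypothetical, invented for this illustration; nothing like them is part of dict or of the proposal.

```python
def intersect(left, right, combine):
    """Return a new dict holding only the keys present in both inputs.

    `combine` is a hypothetical callback that decides which value survives.
    """
    return {k: combine(left[k], right[k]) for k in left if k in right}


a = {"key": "spam", "only_a": 1}
b = {"key": "eggs", "only_b": 2}

first_wins = intersect(a, b, lambda x, y: x)        # keep the left value
last_wins = intersect(a, b, lambda x, y: y)         # keep the right value
as_tuple = intersect(a, b, lambda x, y: (x, y))     # keep both values
concatenated = intersect(a, b, lambda x, y: x + y)  # merge the values
```

Each call is a defensible reading of "intersection", which is exactly why it is hard to pick one as the meaning of "&".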
After all, how do you filter a dictionary to a set of keys?
    d = {'some': 5, 'extra': 10, 'things': 55}
    d &= {'some', 'allowed', 'options'}
    d
    {'some': 5}
    new = d - (d - allowed)

    {k: v for (k, v) in d.items() if k in allowed}
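For reference, filtering a dict to a set of keys already works today without any new operator. This is a small sketch using only existing dict features; the variable names are just for illustration.

```python
d = {"some": 5, "extra": 10, "things": 55}
allowed = {"some", "allowed", "options"}

# Dict comprehension over the items: keeps d's insertion order.
filtered = {k: v for k, v in d.items() if k in allowed}

# The keys view is already set-like and supports "&".
# The intersection is a set, so the resulting order is not guaranteed.
via_keys = {k: d[k] for k in d.keys() & allowed}
```

Both forms leave `d` untouched and build a new dict.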
* Regarding how to construct the new set in __add__, I now think this should be done like this:
    class dict:
        <other methods>
        def __add__(self, other):
            <checks that other makes sense, else return NotImplemented>
            new = self.copy()  # A subclass may or may not choose to override
            new.update(other)
            return new
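A runnable sketch of that copy-then-update pattern, written as a subclass so it can be tried today. The class name `AddableDict` is hypothetical, and the `isinstance` check is only one possible reading of "checks that other makes sense".

```python
class AddableDict(dict):
    """Hypothetical dict subclass illustrating a copy-then-update __add__."""

    def copy(self):
        # dict.copy() returns a plain dict, so a subclass that wants to
        # keep its own type has to override copy() itself.
        return type(self)(self)

    def __add__(self, other):
        # Assumption: "makes sense" is taken to mean "is a dict".
        if not isinstance(other, dict):
            return NotImplemented
        new = self.copy()
        new.update(other)  # "last seen wins" for duplicate keys
        return new


left = AddableDict(a=1, b=2)
right = {"b": 20, "c": 30}
merged = left + right
```

Because `__add__` goes through `self.copy()`, the result type follows whatever the subclass decides `copy()` should return.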
I like that, but it would be inefficient to do that for __sub__ since it would create elements that it might later delete.
    def __sub__(self, other):
        new = self.copy()
        for k in other:
            del new[k]
        return new
is less efficient than
    def __sub__(self, other):
        return type(self)({k: v for k, v in self.items() if k not in other})
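Which of the two is cheaper is something that can be measured rather than argued. The sketch below times both approaches as plain functions with `timeit`. The function names, the dict size, and the two key sets are arbitrary choices for illustration, and `pop(k, None)` replaces `del` so that keys missing from the dict do not raise. It measures time only, not memory.

```python
import timeit


def sub_copy_delete(d, other):
    # Copy everything first, then remove the unwanted keys.
    new = d.copy()
    for k in other:
        new.pop(k, None)  # tolerate keys that are not in d
    return new


def sub_comprehension(d, other):
    # Build a temporary dict of the survivors, then convert to d's type.
    return type(d)({k: v for k, v in d.items() if k not in other})


d = dict.fromkeys(range(10_000), 0)
few = set(range(10))       # almost everything is kept
many = set(range(9_990))   # almost everything is removed

for label, other in [("few keys removed", few), ("many keys removed", many)]:
    assert sub_copy_delete(d, other) == sub_comprehension(d, other)
    t_del = timeit.timeit(lambda: sub_copy_delete(d, other), number=100)
    t_comp = timeit.timeit(lambda: sub_comprehension(d, other), number=100)
    print(f"{label}: copy+delete {t_del:.4f}s, comprehension {t_comp:.4f}s")
```

The numbers will vary with the Python version, the dict size, and the fraction of keys removed, so it is worth running both cases before drawing a conclusion.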
I don't think you should be claiming what is more or less efficient unless you've actually profiled them for speed and memory use. Often, but not always, the two are in opposition: we make things faster by using more memory, and save memory at the cost of speed.

Your version of __sub__ creates a temporary dict, which then has to be copied in order to preserve the type. It's not obvious to me that that's faster or more memory efficient than building a dict then deleting keys.

(Remember that dicts aren't lists, and deleting keys is an O(1) operation.)

--
Steven