PEP 584 (dict merge operators), dict.update and dict.gapfill
Summary: dict.update(self, other) corresponds to 'merge-right'. Perhaps add dict.gapfill(self, other), which corresponds to 'merge-left'. This post defines dict.gapfill, and discusses update and gapfill in the context of PEP 584.

PEP 584 suggests adding a merge operator (to be denoted by '|' or '+') to dictionaries, and a corresponding augmented assignment operator (to be denoted by '|=' or '+='). The main purpose of this post is to remind ourselves that there are both merge-left and merge-right operators, and to begin to discuss the consequences.

Perhaps Off Topic: There is already a road sign for 'merge left'. See the following URL for a discussion of usual 'W4-2' sign for this operation, and how it can cause confusion. And how a "small but critical" change to the sign reduced confusion, in a backwards compatible way.

https://99percentinvisible.org/article/lane-ends-merge-left-redesigning-w4-2...

By the way, that URL writes: "The solution is elegant, simple and additive - it requires no fundamental reformatting, building instead on the existing sign. It is thus also recognizable to those familiar with its predecessor and visually similar to older signs that can still be found on the road."

To me, it seems that the author has goals similar to the Python design goals.

See also http://www.trafficsign.us/pdf/warn/w4-2mod.pdf, which uses 'Left Lane Ends' instead of 'Merge Right' (and similarly 'Right Lane Ends' for 'Merge Left').

Now Back On Topic. The dict.update(self, other) method gives priority to other. (In other words, it's 'merge-right'.) In March this year, it led me to realise that perhaps dict would benefit from there being a 'merge-left' operator, which gives priority to the left.

Here's an example of how merge-left (which I call dict.gapfill) could be coded. (I've done it in a way that emphasises similarities and differences between dict.update and dict.gapfill.)
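    class mydict(dict):

        def update(self, other):
            # Merge-right: every key in other overwrites the value in self.
            for key, value in other.items():
                if True:
                    self[key] = value

        def gapfill(self, other):
            # Merge-left: only keys missing from self are copied from other.
            for key, value in other.items():
                if key not in self:
                    self[key] = value

        def copy(self):
            # Return a mydict rather than a plain dict.
            return mydict(super().copy())

Here's some simple examples of the semantics.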
    >>> d1 = mydict(); d1[1] = 'int one'
    >>> d2 = mydict(); d2[1.0] = 'float one'

    >>> c1 = d1.copy(); c1.update(d2)
    >>> c2 = d1.copy(); c2.gapfill(d2)

    >>> c1
    {1: 'float one'}
    >>> c2
    {1: 'int one'}
For comparison with sets (similar to dict keys), we also have
    >>> {1} | {1.0}
    {1}
    >>> {1.0} | {1}
    {1.0}
Now for some consequences.

First, it's my opinion that dict.gapfill(self, other) will be useful when we're given say some command line values, and we wish to augment this with default values.

Second, it's my opinion that dict.gapfill(self, other) is in some cases a useful alternative for a dict merge operator (whether it uses '|' or '+' as its symbol). For evidence see https://github.com/jpadilla/pyjwt/blob/master/jwt/utils.py#L71-L81

Third, for sets the '|' operator is a merge-left operator. In other words it gives priority to the set-element (or key) in the first set, rather than the second. (This happens only when the two keys so-to-say are the same value represented by different types. In our case the types are (int, float).)

Fourth, as the set '|' operator is merge-left, and Python's

    value = A or B or C

gives priority to the first (left-most) true value, confusion might result if the dict '|' operator were to be merge-right.
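To make the third and fourth points concrete, here is a small sketch using only built-ins. It assumes nothing beyond the fact that 1 and 1.0 compare equal and hash alike, so they count as the same key or element. The variable names are illustrative only.

```python
# 1 == 1.0 and hash(1) == hash(1.0), so a set or dict treats them as one key.
left_set = {1} | {1.0}    # the element from the left operand is kept
right_set = {1.0} | {1}   # same union, operands swapped

d = {1: 'int one'}
d.update({1.0: 'float one'})
# dict.update takes the *value* from the right-hand dict, but the key
# object already stored in the left-hand dict is left in place.

# `or` returns the left-most operand that is true.
first_true = 0 or 'B' or 'C'

print(type(next(iter(left_set))).__name__)    # type of the surviving element
print(type(next(iter(right_set))).__name__)
print(d, type(next(iter(d))).__name__)
print(first_true)
```

So even the 'merge-right' dict.update is left-biased as far as the key object itself is concerned; only the value comes from the right.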
Regarding the fourth point, it is possible that purity and consistency point to the dict '|' operator being merge-left, while usefulness and habit point to the dict '|' operator being merge-right.

Fifth and finally. In our context, it is my opinion that merge-left and merge-right don't clearly communicate the differing semantics. However, in my opinion dict.update(self, other) and dict.gapfill(self, other) do clearly communicate their semantics.

Here's a link to my post to this list in March this year.
https://mail.python.org/archives/list/python-ideas@python.org/message/GRFYX3...

with best regards

Jonathan
Hi Jonathan,

On 12/27/19 3:16 PM, Jonathan Fine wrote:
Summary: dict.update(self, other) corresponds to 'merge-right'. Perhaps add dict.gapfill(self, other), which corresponds to 'merge-left'. This post defines dict.gapfill, and discusses update and gapfill in the context of PEP 584.

IMHO it's a clear/explicit name for what is going to happen. What do you think of using 'dict.fillgap' for this operation?

Regards,
francis
On Fri, Dec 27, 2019 at 02:16:43PM +0000, Jonathan Fine wrote:
Summary: dict.update(self, other) corresponds to 'merge-right'. Perhaps add dict.gapfill(self, other), which corresponds to 'merge-left'.
What does "merge-right" mean to you? When I am driving my car, and I see a sign "merge right", that means:

- If I am in the right-hand lane, I don't have to do anything (except watch out for other drivers merging into me) since I'm already in the right-hand lane.

- If I am in the left-hand lane, I have to merge into the right-hand lane: I move from the left into the right.

In the case of `d1.update(d2)` the update method moves (actually copies) keys:values from the right (d2) into the left (d1), which to me is "merge left".
This post defines dict.gapfill
I don't understand the connection between the name "gapfill" and any sort of merge or update operation. To me, "filling a gap" means spraying expanding foam into a hole, or using a spatula or similar tool to spread filler into a hole. In both cases, the critical thing is the gapfilling is unstructured. You just force in some goop, which spreads out into all the nooks and crannies in a uniform, undifferentiated, manner. That doesn't seem like a good analogy to a merge/update to me.
PEP 584 suggests adding a merge operator (to be denoted by '|' or '+') to dictionaries, and a corresponding augmented assignment operator (to be denoted by '|=' or '+='). The main purpose of this post is to remind ourselves that there are both merge-left and merge-right operators, and to begin to discuss the consequences.
The PEP covers merges in both directions: "last seen wins" and "first seen wins". I'm not really sure you're adding anything new here.
Perhaps Off Topic: There is already a road sign for 'merge left'. See the following URL for a discussion of usual 'W4-2' sign for this operation, and how it can cause confusion. And how a "small but critical" change to the sign reduced confusion, in a backwards compatible way.
I see no evidence given that it reduced confusion. I see only the *claim* that adding some dotted lines to the confusing graph was "elegant, simple and additive", but no evidence that it reduced confusion at all. The article goes on to explain why the W4 "merge" sign is confusing to so many people (it violates the conventions used by other signs) and suggests some alternatives.

But yes, this is off topic. Python is a text-based programming language, not an icon-based language. We don't have to use icons to hint at the direction of merging, we can use words or order of operands to do so. For example, in the PEP, I discuss how you can merge d2 into (a copy of) d1:

    new = d1 + d2  # last seen wins, so d2 merges into a copy of d1

or merge d1 into d2, by just changing the order of operands:

    new = d2 + d1
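Dict unpacking already has "last seen wins" behaviour, so it can stand in for the proposed operator to illustrate how swapping the operands changes the winner. This is only a sketch: the sample dicts and variable names are made up, and `{**a, **b}` is a substitute for, not the same thing as, the operator discussed in the PEP.

```python
# Made-up sample data with one overlapping key, 'eggs'.
d1 = {'spam': 1, 'eggs': 2}
d2 = {'eggs': 20, 'ham': 30}

# Last seen wins: the right-most dict supplies the value for 'eggs'.
d2_wins = {**d1, **d2}
d1_wins = {**d2, **d1}

print(d2_wins['eggs'], d1_wins['eggs'])
```

Neither d1 nor d2 is mutated; each expression builds a new dict.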
First, it's my opinion that dict.gapfill(self, other) will be useful when we're given say some command line values, and we wish to augment this with default values.
The usual way we do that is by starting with the default values, and updating them with the more specific values:

    settings = system_defaults.copy()
    settings.update(user_defaults)

or using the proposed `|` or `+` operator:

    settings = system_defaults | user_defaults

With your "gapfill" method it would be:

    settings = user_defaults.copy()
    settings.gapfill(system_defaults)  # updates only with new keys

This isn't really more expressive than the other way. It's just swapping the order of operands.
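The equivalence claimed above can be checked with a short sketch. The `gapfill` function here is a hypothetical standalone helper built on dict.setdefault (it is not an existing dict method), and the settings are made-up sample data.

```python
def gapfill(target, other):
    """Hypothetical helper: copy into target only the keys it lacks."""
    for key, value in other.items():
        target.setdefault(key, value)

# Made-up sample data.
system_defaults = {'colour': 'blue', 'verbose': False}
user_defaults = {'verbose': True}

# Start from the defaults, then overwrite with the specific values.
via_update = system_defaults.copy()
via_update.update(user_defaults)

# Start from the specific values, then fill in whatever is missing.
via_gapfill = user_defaults.copy()
gapfill(via_gapfill, system_defaults)

# Both routes give the same mapping; only the key order differs.
assert via_update == via_gapfill
print(list(via_update), list(via_gapfill))
```

The two results compare equal, but iteration order follows whichever dict was copied first, which may matter if the merged settings are later printed or serialised.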
Second, it's my opinion that dict.gapfill(self, other) is in some cases a useful alternative for a dict merge operator (whether it uses '|' or '+' as its symbol). For evidence see https://github.com/jpadilla/pyjwt/blob/master/jwt/utils.py#L71-L81
What am I looking at? The "merge_dict" function? I think you need to look at it a bit more closely, because it does the same "last value seen wins" semantics of dict.update and the proposed operator, *not* the "first seen value wins" of your gapfill method. (The function, as defined, also makes a dubious design choice in that sometimes it returns a copy of its arguments and sometimes it doesn't.) [...]
Fifth and finally. In our context, it is my opinion that merge-left and merge-right don't clearly communicate the differing semantics.
We can agree on that.

-- Steven
On Fri, Dec 27, 2019, 12:05 PM Steven D'Aprano <steve@pearwood.info> wrote:
On Fri, Dec 27, 2019 at 02:16:43PM +0000, Jonathan Fine wrote:
Summary: dict.update(self, other) corresponds to 'merge-right'. Perhaps add dict.gapfill(self, other), which corresponds to 'merge-left'.
I don't understand the connection between the name "gapfill" and any sort of merge or update operation. To me, "filling a gap" means spraying expanding foam into a hole, or using a spatula or similar tool to spread filler into a hole.
Regardless of the function or method name potentially chosen, I definitely see the difference between:

    dict1.update(dict2)
    dict1.add_stuff_no_replace(dict2)

I've been occasionally bitten by wanting the second rather than the first. So that's an argument in favor of adding something.

Arguments against:

1. People should use collections.ChainMap more. Probably half the time anyone uses dict.update(), ChainMap would have been a better choice (in my code especially :-))

    ChainMap(dict1, dict2)
    ChainMap(dict2, dict1)

Those two have a perfectly clear distinction. Neither mutates the original dicts, and I could add dict3 to the places to look with no more lines of code.

2. The "left wins" semantics is less common and a function is perfectly adequate. Just write your own dictutils.gapfill() and we're done. It's even worth having a module on PyPI if you can think of a half dozen more ways to usefully combine dictionaries.
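A minimal sketch of the ChainMap approach from point 1, using made-up sample dicts. ChainMap looks keys up in its maps from left to right, so the first mapping listed wins.

```python
from collections import ChainMap

# Made-up sample data with one overlapping key, 'b'.
dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 20, 'c': 30}
dict3 = {'d': 4}

left_wins = ChainMap(dict1, dict2)    # lookups try dict1 first
right_wins = ChainMap(dict2, dict1)   # lookups try dict2 first

# Adding another place to look is just one more argument.
three_way = ChainMap(dict1, dict2, dict3)

# A ChainMap is a view over the underlying dicts; dict() flattens it.
flattened = dict(left_wins)

print(left_wins['b'], right_wins['b'], three_way['d'], flattened)
```

Because a ChainMap only references its underlying dicts, later changes to dict1 or dict2 show through it, whereas dict.update and a gapfill-style helper produce a snapshot.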
participants (5)
- David Mertz
- francismb
- Jonathan Fine
- Marco Sulla
- Steven D'Aprano