[Python-ideas] PEP: Dict addition and subtraction

Michael Lee michael.lee.0x2a at gmail.com
Wed Mar 6 08:58:46 EST 2019


>
> I strongly agree with Ka-Ping. '+' is intuitively concatenation not
> merging. The behavior is overwhelmingly more similar to the '|' operator in
> sets (whether or not a user happens to know the historical implementation
> overlap).


I think the behavior proposed in the PEP makes sense whether you think of
"+" as meaning "concatenation" or "merging".

If your instinct is to assume "+" means "concatenation", then it would be
natural to assume that {"a": 1, "b": 2} + {"c": 3, "b": 4} would be
identical to {"a": 1, "b": 2, "c": 3, "b": 4} -- literally concat the
key-value pairs into a new dict.

But of course, you can't have duplicate keys in a Python dict. So, you would
either recall or look up how duplicate keys are handled when constructing a
dict and learn that the rule is that the right-most key wins. So the
natural conclusion is that "+" would follow this existing rule -- and you
end up with exactly the behavior described in the PEP.
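
A quick sketch of that existing rule, as I understand it. The proposed
"d1 + d2" is not assumed to exist here; {**d1, **d2} unpacking stands in for
it, and the variable names are only for illustration:

```python
# A dict display with a repeated key keeps the right-most value.
literal = {"a": 1, "b": 2, "c": 3, "b": 4}

d1 = {"a": 1, "b": 2}
d2 = {"c": 3, "b": 4}

# Stand-in for the proposed d1 + d2: unpack both dicts, left to right,
# so later pairs overwrite earlier ones with the same key.
merged = {**d1, **d2}

print(literal)            # {'a': 1, 'b': 4, 'c': 3}
print(merged == literal)  # True
```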

This also makes explaining the behavior of "d1 + d2" slightly easier than
explaining "d1 | d2". For the former, you can just say "d1 + d2 means we
concat the two dicts together" and stop there. You almost don't need to
explain the merging/right-most key wins behavior at all, since that
behavior is the only one consistent with the existing language rules.

In contrast, you *would* need to explain this with "d1 | d2": I would
mentally translate this expression to mean "take the union of these two
dicts" and there's no real way to deduce which key-value pair ends up in
the final dict given that framing. Why is it that key-value pairs in d2 win
over pairs in d1 here? That choice seems pretty arbitrary when you think of
this operation in terms of unions, rather than either concat or merge.

Using "|" would also violate an important existing property of unions: the
invariant "d1 | d2 == d2 | d1" is no longer true. As far as I'm aware, the
union operation is always taken to be commutative in math, and so I think
it's important that we preserve that property in Python. At the very least,
I think it's far more important to preserve commutativity of unions than it
is to preserve some of the invariants I've seen proposed above, like
"len(d1 + d2) == len(d1) + len(d2)".
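
To make the two invariants concrete, here is a small sketch. The merge()
helper is hypothetical and only models the "right-most key wins" behavior
discussed above; it is not an existing dict method:

```python
s1, s2 = {"a", "b"}, {"b", "c"}
d1, d2 = {"a": 1, "b": 2}, {"b": 3, "c": 4}


def merge(left, right):
    """Hypothetical helper: on a key collision the right operand wins."""
    return {**left, **right}


# Set union is commutative.
print(s1 | s2 == s2 | s1)                      # True

# A right-biased dict merge is not: the value for "b" depends on the order.
print(merge(d1, d2) == merge(d2, d1))          # False
print(merge(d1, d2)["b"], merge(d2, d1)["b"])  # 3 2

# The length invariant also fails whenever the operands share a key.
print(len(merge(d1, d2)), len(d1) + len(d2))   # 3 4
```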

Personally, I don't really have a strong opinion on this PEP, or the other
one I've seen proposed where we add a "d1.merge(d2, d3, ...)". But I do
know that I'm a strong -1 on adding set operations to dicts: it's not
possible to preserve the existing semantics of union (and intersection)
with dicts, and I think expressions like "d1 | d2" and "d1 & d2" would just be
confusing and misleading to encounter in the wild.
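
The same ambiguity shows up for intersection. In the sketch below, the set of
common keys is well defined (key views already support set operations), but
which values to keep is a choice; keep_left and keep_right are illustrative
names, not proposed API:

```python
d1 = {"a": 1, "b": 2}
d2 = {"b": 3, "c": 4}

# The keys of the "intersection" are unambiguous.
common = d1.keys() & d2.keys()

# The values are not: either operand could reasonably win.
keep_left = {k: d1[k] for k in common}
keep_right = {k: d2[k] for k in common}

print(common)                 # {'b'}
print(keep_left, keep_right)  # {'b': 2} {'b': 3}
```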

-- Michael



On Wed, Mar 6, 2019 at 4:53 AM David Mertz <mertz at gnosis.cx> wrote:

> I strongly agree with Ka-Ping. '+' is intuitively concatenation not
> merging. The behavior is overwhelmingly more similar to the '|' operator in
> sets (whether or not a user happens to know the historical implementation
> overlap).
>
> I think growing the full collection of set operations would be a pleasant
> addition to dicts. I think shoe-horning in plus would always be jarring to
> me.
>
> On Wed, Mar 6, 2019, 5:30 AM Ka-Ping Yee <zestyping at gmail.com> wrote:
>
>> len(dict1 + dict2) does not equal len(dict1) + len(dict2), so using the +
>> operator is nonsense.
>>
>> len(dict1 + dict2) cannot even be computed by any expression
>> involving +.  Using len() to test the semantics of the operation is not
>> arbitrary; the fact that the sizes do not add is a defining quality of a
>> merge.  This is a merge, not an addition.  The proper analogy is to sets,
>> not lists.
>>
>> The operators should be |, &, and -, exactly as for sets, and the
>> behaviour defined with just three rules:
>>
>> 1. The keys of dict1 [op] dict2 are the elements of dict1.keys() [op]
>> dict2.keys().
>>
>> 2. The values of dict2 take priority over the values of dict1.
>>
>> 3. When either operand is a set, it is treated as a dict whose values are
>> None.
>>
>> This yields many useful operations and, most importantly, is simple to
>> explain.  "sets and dicts can |, &, -" takes up less space in your brain
>> than "sets can |, &, - but dicts can only + and -, where dict + is like set
>> |".
>>
>> merge and update some items:
>>
>>     {'a': 1, 'b': 2} | {'b': 3, 'c': 4} => {'a': 1, 'b': 3, 'c': 4}
>>
>> pick some items:
>>
>>     {'a': 1, 'b': 2} & {'b': 3, 'c': 4} => {'b': 3}
>>
>> remove some items:
>>
>>     {'a': 1, 'b': 2} - {'b': 3, 'c': 4} => {'a': 1}
>>
>> reset values of some keys:
>>
>>     {'a': 1, 'b': 2} | {'b', 'c'} => {'a': 1, 'b': None, 'c': None}
>>
>> ensure certain keys are present:
>>
>>     {'b', 'c'} | {'a': 1, 'b': 2} => {'a': 1, 'b': 2, 'c': None}
>>
>> pick some items:
>>
>>     {'b', 'c'} & {'a': 1, 'b': 2} => {'b': 2}
>>
>> remove some items:
>>
>>     {'a': 1, 'b': 2} - {'b', 'c'} => {'a': 1}
>>
>> On Wed, Mar 6, 2019 at 1:51 AM Rémi Lapeyre <remi.lapeyre at henki.fr>
>> wrote:
>>
>>> On 6 March 2019 at 10:26:15, Brice Parent
>>> (contact at brice.xyz) wrote:
>>>
>>> >
>>> > On 05/03/2019 at 23:40, Greg Ewing wrote:
>>> > > Steven D'Aprano wrote:
>>> > >> The question is, is [recursive merge] behaviour useful enough and
>>> > >> common enough to be built into dict itself?
>>> > >
>>> > > I think not. It seems like just one possible way of merging
>>> > > values out of many. I think it would be better to provide
>>> > > a merge function or method that lets you specify a function
>>> > > for merging values.
>>> > >
>>> > That's what this conversation led me to. I'm not against the addition
>>> > for the most general usage (and the current PEP describes the behaviour I
>>> > would expect before reading the doc), but for all other more specific
>>> > usages, where we intend any special or not-so-common behaviour, I'd go
>>> > with modifying dict.update like this:
>>> >
>>> > foo.update(bar, on_collision=updator)  # Although I'm not a fan of the
>>> >                                        # keyword I used
>>>
>>> This won't be possible, because update() already takes keyword arguments:
>>>
>>> >>> foo = {}
>>> >>> bar = {'a': 1}
>>> >>> foo.update(bar, on_collision=lambda e: e)
>>> >>> foo
>>> {'a': 1, 'on_collision': <function <lambda> at 0x10b8df598>}
>>>
>>> > `updator` being a simple function like this one:
>>> >
>>> > from typing import Any
>>> >
>>> > def updator(updated, updator, key) -> Any:
>>> >     if key == "related":
>>> >         # dict.update() returns None, so build the merged dict instead
>>> >         return {**updated[key], **updator[key]}
>>> >
>>> >     if key == "tags":
>>> >         return updated[key] + updator[key]
>>> >
>>> >     if key in ["a", "b", "c"]: # Those
>>> >         return updated[key]
>>> >
>>> >     return updator[key]
>>> >
>>> > There's nothing here that couldn't be made today by using a custom
>>> > update function, but leaving the burden of checking for values that are
>>> > in both and actually inserting the new values to Python's language, and
>>> > keeping on our side only the parts that are specific to our use case,
>>> > makes in my opinion the code more readable, with fewer possible bugs and
>>> > possibly better optimization.
>>> >
>>> >
>>> > _______________________________________________
>>> > Python-ideas mailing list
>>> > Python-ideas at python.org
>>> > https://mail.python.org/mailman/listinfo/python-ideas
>>> > Code of Conduct: http://python.org/psf/codeofconduct/