[Python-ideas] PEP: Dict addition and subtraction

Tue Mar 5 19:07:58 EST 2019

On Tue, Mar 5, 2019 at 3:50 PM Josh Rosenberg <
shadowranger+pythonideas at gmail.com> wrote:

>
> On Tue, Mar 5, 2019 at 11:16 PM Steven D'Aprano <steve at pearwood.info>
> wrote:
>
>> On Sun, Mar 03, 2019 at 09:28:30PM -0500, James Lu wrote:
>>
>> > I propose that the + sign merge two python dictionaries such that if
>> > there are conflicting keys, a KeyError is thrown.
>>
>> This proposal is for a simple, operator-based equivalent to
>> dict.update() which returns a new dict. dict.update has existed since
>> Python 1.5 (something like a quarter of a century!) and never grown a
>> "unique keys" version.
>>
>> I don't recall even seeing a request for such a feature. If such a
>> unique keys version is useful, I don't expect it will be useful often.
>>
>
> I have one argument in favor of such a feature: It preserves concatenation
> semantics. + means one of two things in all code I've ever seen (Python or
> otherwise):
>
> 1. Numeric addition (including element-wise numeric addition as in Counter
> and numpy arrays)
> 2. Concatenation (where the result preserves all elements, in order,
> including, among other guarantees, that len(seq1) + len(seq2) == len(seq1 +
> seq2))
>
> dict addition that didn't reject non-unique keys wouldn't fit *either*
> pattern; the main proposal (making it equivalent to left.copy(), followed
> by .update(right)) would have the left hand side win on ordering, the
> right hand side on values, and wouldn't preserve the length invariant of
> concatenation. At least when repeated keys are rejected, most concatenation
> invariants are preserved; order is all of the left elements followed by all
> of the right, and no elements are lost.
>

I must by now have seen dozens of posts complaining about this aspect of the
proposal. I think this is just making up rules (e.g. "+ never loses
information") to deal with an aspect of the design where a *choice* must be
made. This may reflect the Zen of Python's "In the face of ambiguity,
refuse the temptation to guess." But really, that's a pretty silly rule
(truly, they aren't all winners). Good interface design constantly makes
choices in ambiguous situations, because the alternative is constantly
asking, and that's just annoying.

We have a plethora of examples (in fact, almost all alternatives
considered) of situations related to dict merging where a choice is made
between conflicting values for a key, and it's always the value further to
the right that wins: from d[k] = v (which overrides the value when k is
already in the dict) to d1.update(d2) (which lets the values in d2 win),
including the much lauded {**d1, **d2}. Even plain {'a': 1, 'a': 2} has
a well-defined meaning where the latter value wins.
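
For readers who want to check these cases, here is a minimal sketch that
runs each construct mentioned above. The variable names (assigned, updated,
unpacked, literal) are only illustrative and not part of the proposal.

```python
# Each construct below resolves a duplicate key by keeping the rightmost value.

d = {'a': 1}
d['a'] = 2                      # item assignment overrides the existing value
assigned = d

d1 = {'a': 1, 'b': 2}
d2 = {'a': 10, 'c': 3}

updated = d1.copy()
updated.update(d2)              # values from d2 win; d1 itself is untouched

unpacked = {**d1, **d2}         # same result as copy-then-update

literal = {'a': 1, 'a': 2}      # duplicate key in a dict display: last value wins

print(assigned)                 # {'a': 2}
print(updated)                  # {'a': 10, 'b': 2, 'c': 3}
print(unpacked)                 # {'a': 10, 'b': 2, 'c': 3}
print(literal)                  # {'a': 2}
```

Note that in the update and unpacking cases the key 'a' keeps its position
from d1 while taking its value from d2.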

As to why raising is worse: First, none of the other situations I listed
above raises for conflicts. Second, there's the experience of str+unicode
in Python 2, which raises if the str argument contains any non-ASCII bytes.
In fact, we disliked it so much that we changed the language incompatibly
to deal with it.
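
To make the two behaviors concrete, here is a rough sketch of both as plain
functions. The names merge_right_wins and merge_strict are hypothetical
helpers written for this illustration; neither is an existing dict method
or the PEP's implementation.

```python
def merge_right_wins(left, right):
    """Copy left, then update with right: the semantics of the main proposal."""
    result = left.copy()
    result.update(right)
    return result


def merge_strict(left, right):
    """Hypothetical raising variant: reject any key present in both dicts."""
    conflicts = left.keys() & right.keys()
    if conflicts:
        raise KeyError(f"conflicting keys: {sorted(conflicts, key=repr)}")
    return {**left, **right}


defaults = {'host': 'localhost', 'port': 8000}
overrides = {'port': 9000}

print(merge_right_wins(defaults, overrides))   # {'host': 'localhost', 'port': 9000}

try:
    merge_strict(defaults, overrides)
except KeyError as exc:
    print("strict merge failed:", exc)
```

With the raising variant, whether the merge succeeds depends on the data
rather than on the types of the operands, which is the same property that
made str+unicode painful in Python 2.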

-- 
--Guido van Rossum (python.org/~guido)