[Python-ideas] PEP: Dict addition and subtraction

Steven D'Aprano steve at pearwood.info
Mon Mar 4 11:25:43 EST 2019


On Mon, Mar 04, 2019 at 10:09:32AM -0500, James Lu wrote:

> How many situations would you need to make a copy of a dictionary and 
> then update that copy and override old keys from a new dictionary?

Very frequently.

That's why we have a dict.update method, which, if I remember correctly, 
was introduced in Python 1.5 because people were frequently re-inventing 
the same wheel:

    def update(d1, d2):
        for key in d2.keys():
            d1[key] = d2[key]


You should have a look at how many times it is used in the standard 
library:

[steve at ando cpython]$ cd Lib/
[steve at ando Lib]$ grep -U "\.update[(]" *.py */*.py | wc -l
373

Now some of those are false positives (docstrings, comments, non-dicts, 
etc) but that still leaves a lot of examples of wanting to override old 
keys. This is a very common need. Wanting an exception if the key 
already exists is, as far as I can tell, very rare.

It is true that many of the examples in the std lib involve updating an 
existing dict, not creating a new one. But that's only to be expected: 
since Python didn't provide an obvious functional version of update, 
only an in-place version, naturally people get used to writing 
in-place code.

(Think about how long we made do without sorted(). I don't know about 
other people, but I now find sorted indispensable, and probably use it 
ten or twenty times more often than the in-place version.)
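
For comparison, a functional update of the kind described above can be 
sketched with the existing tools. The helper name ``merged`` and the 
sample dicts are made up for illustration; this is only one way to 
spell it, not anything the PEP specifies.

```python
def merged(d1, d2):
    """Return a new dict: d1's items, overridden by d2's (hypothetical helper)."""
    result = d1.copy()   # leave the original untouched
    result.update(d2)    # last seen value wins
    return result

defaults = {"colour": "auto", "verbose": False}
overrides = {"colour": "never"}

combined = merged(defaults, overrides)
print(combined)   # {'colour': 'never', 'verbose': False}
print(defaults)   # {'colour': 'auto', 'verbose': False}
```

The two-step copy-then-update is exactly the boilerplate a functional 
spelling would remove.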


[...]
> The KeyError of my proposal is a feature, a sign that something is 
> wrong, a sign an invariant is being violated.

Why is "keys are unique" an invariant?

The PEP gives a good example of when this "invariant" would be 
unnecessarily restrictive:

    For example, updating default configuration values with 
    user-supplied values would most often fail under the 
    requirement that keys are unique::

    prefs = site_defaults + user_defaults + document_prefs
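
Since the ``+`` spelling is only a proposal, here is a rough equivalent 
using dict unpacking (available since Python 3.5). The contents of the 
three dicts are invented for the example; only their names come from 
the PEP's snippet.

```python
# Sample data, assumed for illustration only.
site_defaults = {"font": "serif", "size": 10, "colour": "black"}
user_defaults = {"font": "sans", "size": 12}
document_prefs = {"size": 14}

# Later dicts override earlier ones, so overlapping keys are expected.
prefs = {**site_defaults, **user_defaults, **document_prefs}
print(prefs)  # {'font': 'sans', 'size': 14, 'colour': 'black'}
```

A "keys must be unique" rule would reject this merge outright, because 
"font" and "size" both appear in more than one layer.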


Another example would be when reading command line options, where the 
most common convention is for "last option seen" to win:

[steve at ando Lib]$ grep --color=always --color=never "zero" f*.py
fileinput.py: numbers are zero; nextfile() has no effect.
fractions.py: # the same way for any finite a, so treat a as zero.
functools.py: # prevent their ref counts from going to zero during

and the output is printed without colour.

(I've slightly edited the above output so it will fit in the email 
without wrapping.)
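
The same convention shows up in Python's own argparse: with the default 
"store" action, a repeated option simply overwrites the earlier value. 
A small sketch, where the ``--color`` option is just a stand-in that 
mirrors the grep example:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--color", choices=["auto", "always", "never"],
                    default="auto")

# Same option given twice: the last one seen wins, no error is raised.
args = parser.parse_args(["--color=always", "--color=never"])
print(args.color)  # never
```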

The very name "update" should tell us that the most useful behaviour is 
the one the devs decided on back in 1.5: have the last seen value win. 
How can you update values if the operation raises an error if the key 
already exists? If this behaviour is ever useful, I would expect that it 
will be very rare.

An update or merge is effectively just running through a loop setting 
the value of a key. See the pre-Python 1.5 function above. Having update 
raise an exception if the key already exists would be about as useful as 
having ``d[key] = value`` raise an exception if the key already exists.

Unless someone can demonstrate that the design of dict.update() was a 
mistake, and the "require unique keys" behaviour is more common, then 
I maintain that for the very rare cases you want an exception, you can 
subclass dict and overload the __add__ method:

    # Intentionally simplified version.
    def __add__(self, other):
        if self.keys() & other.keys():
            raise KeyError
        return super().__add__(other)
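
On a Python without the proposed operator, dict has no ``__add__`` to 
delegate to, so a version you can run today has to build the result 
itself. A minimal sketch, with the class name ``StrictDict`` being my 
own invention:

```python
class StrictDict(dict):
    """Hypothetical dict subclass whose + refuses overlapping keys."""

    def __add__(self, other):
        if not isinstance(other, dict):
            return NotImplemented
        clashes = self.keys() & other.keys()
        if clashes:
            raise KeyError(f"duplicate keys: {clashes!r}")
        # No dict.__add__ to call here, so copy and update by hand.
        result = type(self)(self)
        result.update(other)
        return result


a = StrictDict(x=1)
print(a + {"y": 2})      # {'x': 1, 'y': 2}

try:
    a + {"x": 99}        # overlapping key
except KeyError as err:
    print("refused:", err)
```

That keeps the strict behaviour available to the few who want it, 
without imposing it on every merge.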


> The ugliness of the syntax makes one pause 
> and think and ask: “Why is it important that the keys from this 
> dictionary override the ones from another dictionary?”

Because that is the most common and useful behaviour. That's what it 
means to *update* a dict or database, and this proposal is for an update 
operator.

The ugliness of the existing syntax is not a feature, it is a barrier.


-- 
Steven

