On Sat, Mar 02, 2019 at 11:14:18AM -0800, Raymond Hettinger wrote:
If the existing code were in the form of "d=e.copy(); d.update(f); d.update(g); d.update(h)", converting it to "d = e + f + g + h" would be a tempting but algorithmically poor thing to do (because the behavior is quadratic).
I mention this in the PEP. As with lists and tuples, and unlike strings, I don't expect this to be a problem in practice:
- it's easy to put repeated string concatenation in a tight loop; it is harder to think of circumstances where one needs to concatenate lists or tuples, or merge dicts, in a tight loop;
- it's easy to have situations where one is concatenating thousands of strings; it's harder to imagine circumstances where one would be merging more than three or four dicts;
- concatenation s1 + s2 + ... for strings, lists or tuples results in a new object of length equal to the sum of the lengths of each of the inputs, so the output is constantly growing; but merging dicts d1 + d2 + ... typically results in a smaller object of length equal to the number of unique keys.
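A rough sketch of the last point, assuming the proposed "+" behaves like building a fresh dict at each step. Since dict "+" is only a proposal, the sketch emulates it with {**left, **right}; the helper names merge_pairwise and merge_in_place and the sample dicts are made up for illustration:

```python
from functools import reduce

def merge_pairwise(*dicts):
    """Emulate the proposed d1 + d2 + ...: every step builds a new dict."""
    return reduce(lambda left, right: {**left, **right}, dicts, {})

def merge_in_place(*dicts):
    """Copy-and-update: a single result dict, updated once per input."""
    result = {}
    for d in dicts:
        result.update(d)
    return result

e = {"a": 1, "b": 2}
f = {"b": 20, "c": 3}
g = {"c": 30, "d": 4}
h = {"a": 100}

merged = merge_pairwise(e, f, g, h)

# Both spellings agree on the result; they differ only in how many
# intermediate dicts get built along the way.
assert merged == merge_in_place(e, f, g, h)

# The merged dict has one entry per unique key, not one per input item.
unique_keys = len(merged)
total_items = sum(len(d) for d in (e, f, g, h))

# List concatenation, by contrast, always grows to the sum of the lengths.
concatenated = list(e) + list(f) + list(g) + list(h)
```

Here the four inputs hold seven items between them but only four distinct keys, so the intermediate results stay small, whereas the concatenated list has all seven entries.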
Most likely, the right thing to do would be "d = ChainMap(e, f, g, h)" for a zero-copy solution or "d = dict(ChainMap(e, f, g, h))" to flatten the result without incurring quadratic costs. Both of those are short and clear.
And both result in the opposite behaviour of what you probably intended if you were trying to match e + f + g + h. Dict merging/updating operates on "last seen wins", but ChainMap is "first seen wins". To get the same behaviour, we have to write the dicts in opposite order compared to update, from most to least specific:
    # least specific to most specific
    prefs = site_defaults + user_defaults + document_prefs

    # most specific to least
    prefs = dict(ChainMap(document_prefs, user_defaults, site_defaults))
To me, the latter feels backwards: I'm applying document prefs first, and then trusting that the ChainMap doesn't overwrite them with the defaults. I know that's guaranteed behaviour, but every time I read it I'll feel the need to check :-)
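For anyone who wants to check the ordering for themselves, here is a small sketch. The contents of the three dicts are invented for the example, and the update() chain stands in for the proposed "+" operator:

```python
from collections import ChainMap

# Hypothetical preference dicts, from least to most specific.
site_defaults = {"font": "serif", "size": 10, "colour": "black"}
user_defaults = {"size": 12, "colour": "blue"}
document_prefs = {"colour": "red"}

# update() is "last seen wins": later dicts override earlier ones.
prefs_update = site_defaults.copy()
prefs_update.update(user_defaults)
prefs_update.update(document_prefs)

# ChainMap is "first seen wins": the most specific dict must come first.
prefs_chain = dict(ChainMap(document_prefs, user_defaults, site_defaults))

# Passing the dicts in update order gives the defaults priority instead.
prefs_reversed = dict(ChainMap(site_defaults, user_defaults, document_prefs))

assert prefs_update == prefs_chain
assert prefs_reversed != prefs_chain
```

With the update-style ordering passed to ChainMap, the document's colour is silently shadowed by the site default, which is exactly the mistake I'd worry about making.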
Lastly, I'm still bugged by use of the + operator for replace-logic instead of additive-logic. With numbers and lists and Counters, the plus operator creates a new object where all the contents of each operand contribute to the result. With dicts, some of the contents of the left operand get thrown away. This doesn't seem like addition to me (IIRC that is also why sets have "|" instead of "+").
I'm on the fence here. Addition seems to be the most popular operator (it often gets requested) but you might be right that this is more like a union operation than a concatenation or addition operation. MRAB also suggested this earlier.
One point in its favour is that + goes nicely with - but on the other hand, sets have | and - with no + and that isn't a problem.
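To make the additive versus replace distinction concrete, here is a short sketch using only existing behaviour. The dict merge is again spelled {**left, **right} as a stand-in for the proposed operator, and the sample values are arbitrary:

```python
from collections import Counter

# Additive: every element of both operands contributes to the result.
counts = Counter(a=1, b=2) + Counter(b=3, c=4)
items = [1, 2] + [2, 3]

# Replace: on a key collision the left operand's value is discarded.
left = {"a": 1, "b": 2}
right = {"b": 3, "c": 4}
merged = {**left, **right}

# Sets already spell this kind of operation as union, with - for difference.
union = {1, 2} | {2, 3}
difference = {1, 2} - {2, 3}
```

The Counter keeps both contributions to "b" (2 + 3 gives 5), the list keeps both copies of 2, but the merged dict keeps only the right-hand value for "b", which is much closer to what the set union does with the duplicate 2.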