[Python-ideas] Dict joining using + and +=

Raymond Hettinger raymond.hettinger at gmail.com
Tue Mar 5 00:52:04 EST 2019

> On Mar 4, 2019, at 11:24 AM, Guido van Rossum <guido at python.org> wrote:
> * Regarding how often this is needed, we know that this is proposed and discussed at length every few years, so I think this will fill a real need.

I'm not sure that conclusion follows from the premise :-)  Some ideas get proposed routinely because they are obvious things to propose, not because people actually need them.  One hint is that the proposals always have generic variable names, "d = d1 + d2", and another is that they are almost never accompanied by actual use cases or real code that would be made better. I haven't seen anyone in this thread say they would use this more than once a year or that their existing code was unclear or inefficient in any way.  The lack of dict addition support in other languages (Java, for example) is another indicator that there isn't a real need -- afaict there is nothing about Python that would cause us to have a unique requirement that other languages don't have.

FWIW, there are some downsides to the proposal -- it diminishes some of the unifying ideas about Python that I typically present on the first day of class:

* One notion is that the APIs nudge users toward good code.  The "copy.copy()" function has to be imported -- that minor nuisance is a subtle hint that copying isn't good for you.  Likewise for dicts, writing "e=d.copy(); e.update(f)" is a minor nuisance that either serves to dissuade people from unnecessary copying or at least will make very clear what is happening.  The original motivating use case for ChainMap() was to make a copy-free replacement for excessively slow dict additions in ConfigParser.  Giving a plus-operator to mappings is an invitation to writing code that doesn't scale well.

* Another unifying notion is that the star-operator represents repeated addition across multiple data types.  It is a nice demo to show that "a * 5 == a + a + a + a + a" where "a" is an int, float, complex, str, bytes, tuple, or list.  Giving __add__() to dicts breaks this pattern.

* When teaching dunder methods, the usual advice regarding operators is to use them only when their meaning is unequivocal; otherwise, have a preference for named methods where the method name clarifies what is being done -- don't use train+car to mean train.shunt_to_middle(car). For dicts that would mean not having the plus-operator implement something that isn't inherently additive (it applies replace/overwrite logic instead), that isn't commutative, and that isn't linear when applied in succession (d1+d2+d3).

* In the advanced class where C extensions are covered, the organization of the slots is shown as a guide to which methods make sense together: tp_as_number, tp_as_sequence, and tp_as_mapping.  For dicts to gain the requisite methods, they will have to become numbers (in the sense of filling out the tp_as_number slots).  That will slow down the abstract methods that search the slot groups, skipping over groups marked as NULL.  It also exposes method groups that don't typically appear together, blurring their distinction.

* Lastly, there is a vague piece of zen-style advice, "if many things in the language have to change to implement idea X, it stops being worth it".  In this case, it means that every dict-like API and the related abstract methods and typing equivalents would need to grow support for addition in mappings (would it even make sense to add two shelve objects or os.environ objects together?).
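
To make the first three points concrete, here is a minimal sketch.  The settings and variable names are made up for illustration and are not taken from ConfigParser or any other real code; it only shows the existing copy-and-update spelling, the ChainMap alternative, the order dependence of the merge, and the repeated-addition demo.

```python
from collections import ChainMap

# Hypothetical settings, invented for this illustration.
defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}

# The spelling that exists today: an explicit copy followed by an update.
merged = defaults.copy()
merged.update(overrides)

# ChainMap answers the same lookups without copying either dict.
layered = ChainMap(overrides, defaults)
assert dict(layered) == merged

# The merge is not commutative: whichever dict is applied last wins
# on a key clash.
flipped = overrides.copy()
flipped.update(defaults)
assert merged["user"] == "admin"
assert flipped["user"] == "guest"

# For the existing types, "a * 5" is repeated addition.
for a in (2, 1.5, 2 + 3j, "ab", b"ab", (1, 2), [1, 2]):
    assert a * 5 == a + a + a + a + a
```

Note that the ChainMap version stays linked to the underlying dicts, so later changes to "defaults" or "overrides" show through, whereas the copy-and-update version is a snapshot.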

That's my two cents' worth.  I'm ducking out now (nothing more to offer on the subject). Guido's participation in the thread has given it an air of inevitability so this post will likely not make a difference.

