On Thu, Oct 17, 2019 at 07:48:00AM +0000, Josh Rosenberg wrote: [...]
> That's 100% wrong. You're mixing up the unpacking generalizations for dict literals with the limitations on keyword arguments to functions. {**d1, **d2} is guaranteed to accept dicts with any keys, on any implementation of Python.
Do you have a canonical reference for this? If so, we should update the PEP with that information.
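For reference, here is the distinction I understand Josh to be drawing, as it behaves in current CPython (a quick sketch, not a spec citation):

    d1 = {1: "one", (2, 3): "pair"}   # keys that aren't valid identifiers
    d2 = {"x": "ex"}

    # PEP 448 unpacking inside a dict display accepts any hashable keys.
    merged = {**d1, **d2}

    # Keyword-argument style merging is the restricted form: the keys must
    # be strings, so this raises TypeError ("keywords must be strings").
    try:
        dict(d2, **d1)
    except TypeError as err:
        print("keyword-style merge failed:", err)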
> Less critical, but still wrong, is the contention that "collections.Counter is a dict subclass that supports the + operator. There are no known examples of people having performance issues due to adding large numbers of Counters."
> A couple of examples of Counter merge issues: https://bugs.python.org/issue36380
Thanks for the link, but it doesn't seem to be relevant. There's no evidence in the bug report that this was an actual performance problem in real-life code, only an enhancement request to make Counter faster, based on a Big-O analysis suggesting that it was doing too much work. The analysis was also based on a different scenario: adding a small dict to a large dict, rather than adding lots and lots of dicts.
> https://stackoverflow.com/q/34407128/364696 Someone having the exact problem of "performance issues due to adding large numbers of Counters."
But that's precisely what they are *not* doing: at no point do they add Counters. They are manually merging them *in place* into a single dict. We don't see how they calculate their benchmarks, so we have to take their word that they are measuring what they say they are measuring. (I haven't tried to replicate their results.) At least one person in the comments questions whether the poster is actually using Counters as they claim, and their report that it freezes their GUI seems odd to me. In any case, the poster avoids the quadratic behaviour feared in this proposal by merging dicts in place rather than creating lots and lots of temporary dicts, so even if this is a genuine real-world performance problem, it is unrelated to Counter addition.
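For what it's worth, here is a sketch of the difference between the two approaches: repeated addition (which builds a new temporary Counter at every step) versus in-place merging (which is what the Stackoverflow code does):

    from collections import Counter

    counters = [Counter({"a": 1, "b": 2}) for _ in range(1000)]

    # Repeated addition: every + copies the running total into a new
    # Counter, which is the quadratic pattern this proposal worries about.
    total_by_addition = Counter()
    for c in counters:
        total_by_addition = total_by_addition + c

    # In-place merging: update() mutates a single Counter, creating no
    # temporaries; this is what the Stackoverflow poster is measuring.
    total_in_place = Counter()
    for c in counters:
        total_in_place.update(c)

    assert total_by_addition == total_in_place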
On "lossiness", it says "Integer addition and concatenation are also lossy, in the sense of not being reversable: you cannot get back the two addends given only the sum. Two numbers add to give 356; what are the two numbers?"
The argument in the original thread was that, for c = a + b, on all existing types in Python (modulo floating point imprecision issues), knowing any two of a, b, or c was enough to determine the value of the remaining variable;
*shrug* For the sake of the discussion, let's say that I accept that this was the argument. Why is this supposed "lossless" property of addition important outside of arithmetic?
> there were almost no cases (again, floating point terribleness excepted) in which there existed some value d != a for which d + b == c, where dict addition breaks that pattern, however arbitrary some people believe it to be. Only example I'm aware of where this is violated is collections.Counter
So the precedent is already set by Counter. I know that Raymond has come out in the past as being against this proposal, but ironically Counter's API is his design, including the use of plus. So far as I know, Raymond hasn't said that the Counter plus API is a mistake or that he regrets it.

It's too late for 3.8, but this proposal for time addition (another much-requested feature) is relevant: https://bugs.python.org/issue17267 Given 24-hour wrap-around semantics, time addition will also violate this "addition isn't lossy" principle, as will any modulo arithmetic, including fixed-size ints with wrap-around-on-overflow semantics.

It is also violated by float (and Decimal), which you dismiss by calling it "terribleness". Terribleness or not, it exists, and so it is not true that numeric addition is necessarily reversible in Python. Rather, it is true that numeric addition for builtins is already "lossy".
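To make that concrete, a small sketch under CPython's float and Counter semantics (the float case relies on 1e16 being exactly representable while 1e16 + 1 is not):

    from collections import Counter

    # float addition: two different left operands give the same sum, so
    # knowing the sum and the right operand does not determine the left.
    assert 1e16 + 1.0 == 1e16 + 0.0

    # Counter addition: + discards non-positive counts, so again two
    # different left operands give the same sum.
    assert Counter(a=-1) + Counter(b=1) == Counter() + Counter(b=1)

-- Steven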