adding dictionaries
Is there a good reason for not implementing the "+" operator for dict.update()?

    A = dict(a=1, b=1)
    B = dict(a=2, c=2)
    B += A
    B
    dict(a=1, b=1, c=2)

That is, B += A should be equivalent to B.update(A).

It would be even better if there were also a regular "addition" operator that is equivalent to creating a shallow copy and then calling update():

    C = A + B

should equal

    C = dict(A)
    C.update(B)

(obviously not the same as C = B + A, but the "+" operator is not commutative for most operations)

    class NewDict(dict):
        def __add__(self, other):
            x = dict(self)
            x.update(other)
            return x

        def __iadd__(self, other):
            self.update(other)
            return self  # in-place operators must return the result

My apologies if this has been posted before, but with a quick Google search I could not find it; if it was, could you please point me to the thread? I assume this must be a design decision that was made a long time ago, but it is not obvious to me why.
On Sun, Jul 27, 2014 at 09:34:16AM +1000, Alexander Heger wrote:
You're asking the wrong question. The burden is not on people to justify *not* adding new features, the burden is on somebody to justify adding them. Is there a good reason for implementing the + operator as dict.update? We can already write B.update(A), under what circumstances would you spell it B += A instead, and why?
That would be spelled C = dict(A, **B).

I'd be more inclined to enhance the dict constructor and update methods so you can provide multiple arguments:

    dict(A, B, C, D)   # Rather than A + B + C + D
    D.update(A, B, C)  # Rather than D += A + B + C

I'm not sure it's so much a deliberate decision not to implement dictionary addition, as uncertainty as to what dictionary addition ought to mean. Given two dicts:

    A = {'a': 1, 'b': 1}
    B = {'a': 2, 'c': 2}

I can think of at least four things that C = A + B could do:

    # add values, defaulting to 0 for missing keys
    C = {'a': 3, 'b': 1, 'c': 2}

    # add values, raising KeyError if there are missing keys

    # shallow copy of A, update with B
    C = {'a': 2, 'b': 1, 'c': 2}

    # shallow copy of A, insert keys from B only if not already in A
    C = {'a': 1, 'b': 1, 'c': 2}

Except for the second one, I've come across people suggesting that each of the other three is the one and only obvious thing for A+B to do.

-- Steven
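The four candidate meanings listed above can be written out as small functions. This is only an illustrative sketch: the function names are invented here, and none of them corresponds to an existing dict method.

```python
A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2}


def add_values_default_zero(x, y):
    """Add values key by key, treating a missing key as 0."""
    return {k: x.get(k, 0) + y.get(k, 0) for k in x.keys() | y.keys()}


def add_values_strict(x, y):
    """Add values key by key; a key missing from either side raises KeyError."""
    return {k: x[k] + y[k] for k in x.keys() | y.keys()}


def copy_then_update(x, y):
    """Shallow copy of x, updated with y (y wins on shared keys)."""
    result = dict(x)
    result.update(y)
    return result


def copy_then_fill(x, y):
    """Shallow copy of x; keys from y are inserted only if x lacks them."""
    result = dict(x)
    for key, value in y.items():
        result.setdefault(key, value)
    return result


print(add_values_default_zero(A, B))
print(copy_then_update(A, B))
print(copy_then_fill(A, B))
```

All four are defensible readings of "A + B", which is the point being made: the operator alone does not say which one is meant.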
On 27 July 2014 02:17, Steven D'Aprano <steve@pearwood.info> wrote:
One good reason is that people are still convinced "dict(A, **B)" makes some kind of sense. But really, we have collections.ChainMap, dict addition is confusing and there's already a PEP (python.org/dev/peps/pep-0448) that has a solution I prefer ({**A, **B}).
On Mon, Jul 28, 2014 at 06:26:13AM +0100, Joshua Landau wrote:
Explain please. dict(A, **B) makes perfect sense to me, and it works perfectly too. It's a normal constructor call, using the same syntax as any other function or method call. Are you suggesting that it does not make sense? -- Steven
On Tue, Jul 29, 2014 at 12:59:51AM +1000, Steven D'Aprano wrote:
It worked in Python 2, but Python 3 added code to explicitly prevent the kwargs mechanism from being abused by passing non-string keys. Effectively, the only reason it worked was due to a Python 2.x kwargs implementation detail. It took me a while to come to terms with this one too, it was really quite a nice hack. But that's all it ever was. The domain of valid keys accepted by **kwargs should never have exceeded the range supported by the language syntax for declaring keyword arguments. David
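A short sketch of the restriction described above, assuming Python 3 (and 3.5 or later for the last line, which uses the PEP 448 unpacking syntax mentioned earlier in the thread). The sample dicts are made up for illustration.

```python
A = {'a': 1, 1: 'one'}
B = {2: 'two', 'b': 2}  # contains a non-string key

# Keyword expansion only accepts string keys in Python 3.
try:
    merged = dict(A, **B)
except TypeError as exc:
    merged = None
    print("dict(A, **B) failed:", exc)

# Copy-and-update has no such restriction on key types.
general = dict(A)
general.update(B)

# Dict-display unpacking (PEP 448, Python 3.5+) also accepts any hashable key.
unpacked = {**A, **B}
```

So dict(A, **B) is only a merge when every key of B is a string, whereas the other two spellings work for arbitrary keys.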
On Mon, Jul 28, 2014 at 03:33:06PM +0000, dw+python-ideas@hmmz.org wrote:
/face-palm

Ah of course! You're right, using dict(A, **B) isn't general enough.

I'm still inclined to prefer allowing update() to accept multiple arguments:

    a.update(b, c, d)

rather than

    a += b + c + d

which suggests that maybe there ought to be an updated() built-in. Let the bike-shedding begin: how should such a thing be spelled?

    new_dict = a + b + c + d

Pros: + is short to type; subclasses can control the type of new_dict.
Cons: dict addition isn't obvious.

    new_dict = updated(a, b, c, d)

Pros: analogous to sort/sorted, reverse/reversed.
Cons: another built-in; isn't very general, only applies to Mappings.

    new_dict = a.updated(b, c, d)

Pros: only applies to mappings, so it should be a method; subclasses can control the type of the new dict returned.
Cons: easily confused with dict.update.

-- Steven
On 07/28/2014 11:04 AM, Steven D'Aprano wrote:
To me, the constructor and update method should be as near alike as possible. So I think if it's done in the update method, it should also work in the constructor. And other type constructors, such as list, should work in similar ways as well. I'm not sure that going in this direction would be good in the long term.
I think it's more obvious. It only needs __add__ and __iadd__ methods to make it consistent with the list type. The con is that somewhere someone could be catching TypeError to differentiate dict from other types while adding. But it's just as likely they are doing so in order to add them after a TypeError occurs. I think this added consistency between lists and dicts would be useful. But putting __add__ and __iadd__ methods on dicts seems like something that was probably discussed at length before, and I wonder what reasons were given for not doing it then. Cheers, Ron
On Mon, Jul 28, 2014 at 12:17:10PM -0500, Ron Adam wrote:
On 07/28/2014 11:04 AM, Steven D'Aprano wrote:
[...]
What I meant was that it wasn't obvious what dict1 + dict2 should do, not whether or not the __add__ method exists.
I think this added consistency between lists and dicts would be useful.
Lists and dicts aren't the same kind of object. I'm not sure it is helpful to force them to be consistent. Should list grow an update() method to make it consistent with dicts? How about setdefault()?

As for being useful, useful for what? Useful how often? I'm sure that one could take any piece of code, no matter how obscure, and say it is useful *somewhere* :-) but the question is whether it is useful enough to be part of the language.

I was wrong to earlier dismiss the OP's use case for dict addition by suggesting dict(a, **b). Such a thing only works if all the keys of b are valid identifiers. But just because my shoot-from-the-hip response missed the target, that doesn't mean we should conclude that dict addition solves an important problem or that + is the correct way to spell it.

I'm still dubious that it's needed, but if it were, this is what I would prefer to see:

* should be a Mapping method, not a top-level function;
* should accept anything the dict constructor accepts, mappings or lists of (key,value) pairs as well as **kwargs;
* my preferred name for this is now "merged" rather than "updated";
* it should return a new mapping, not modify in-place;
* when called from a class, it should behave like a class method: MyMapping.merged(a, b, c) should return an instance of MyMapping;
* but when called from an instance, it should behave like an instance method, with self included in the chain of mappings to merge: a.merged(b, c) rather than a.merged(a, b, c).

I have a descriptor type which implements the behaviour from the last two bullet points, so from a technical standpoint it's not hard to implement this. But I can imagine a lot of push-back from the more conservative developers about adding a *fourth* method type (even if it is private) to the Python builtins, so it would take a really compelling use-case to justify adding a new method type and a new dict method.
(Personally, I think this hybrid class/instance method type is far more useful than staticmethod, since I've actually used it in production code, but staticmethod isn't going away.) -- Steven
On Monday, July 28, 2014 8:34 PM, Steven D'Aprano <steve@pearwood.info> wrote: [snip]
How is this different from a plain-old (builtin or normal) method?
<function union> This is the way methods have always worked (although the details of how they worked under the covers changed in 3.0, and before that when descriptors and new-style classes were added).
On Mon, Jul 28, 2014 at 11:15:44PM -0700, Andrew Barnert wrote:
I see I failed to explain clearly, sorry about that.

With class methods, the method always receives the class as the first argument. Regardless of whether you write dict.fromkeys or {1:'a'}.fromkeys, the first argument is the class, dict.

With instance methods, the method receives the instance. If you call it from a class, the method is "unbound" and you are responsible for providing the "self" argument.

To me, this hypothetical merged() method sometimes feels like an alternative constructor, like fromkeys, and therefore best written as a class method, but sometimes like a regular method. Since it feels like a hybrid to me, I think a hybrid descriptor approach is best, but as I already said I can completely understand if conservative developers reject this idea.

In the hybrid form I'm referring to, the first argument provided is the class when called from the class, and the instance when called from an instance. Imagine it written in pure Python like this:

    class dict:
        @hybridmethod
        def merged(this, *args, **kwargs):
            if isinstance(this, type):
                # Called from the class
                new = this()
            else:
                # Called from an instance.
                new = this.copy()
            for arg in args:
                new.update(arg)
            new.update(kwargs)
            return new

If merged is a class method, we can avoid having to worry about the case where your "a" mapping happens to be a list of (key,item) pairs:

    a.merged(b, c, d)        # Fails if a = [(key, item), ...]
    dict.merged(a, b, c, d)  # Always succeeds.

It also allows us to easily specify a different mapping type for the result:

    MyMapping.merged(a, b, c, d)

although some would argue this is just as clear:

    MyMapping().merged(a, b, c, d)

albeit perhaps not quite as efficient if MyMapping is expensive to instantiate. (You create an empty instance, only to throw it away again.)

On the other hand, there are use-cases where merged() best communicates the intent if it is a regular instance method. Consider:

    settings = application_defaults.merged(
        global_settings, user_settings, commandline_settings)

seems more clear to me than:

    settings = dict.merged(
        application_defaults, global_settings,
        user_settings, commandline_settings)

especially in the case that application_defaults is a dict literal.

tl;dr It's not often that I can't decide whether a method ought to be a class method or an instance method, the decision is usually easy, but this is one of those times.

-- Steven
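The hybridmethod descriptor used above is not shown in the message and is not part of the standard library. One possible way to write it, purely as an assumption about what such a descriptor might look like, is sketched below. The class MergingDict is likewise a made-up stand-in for dict, and the sketch uses type(this)(this) instead of this.copy() because dict.copy() on a subclass returns a plain dict.

```python
import types


class hybridmethod:
    """Bind the function to the class or to the instance, depending on
    which one the attribute was looked up on (hypothetical helper)."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        # instance is None when accessed as SomeClass.method
        target = owner if instance is None else instance
        return types.MethodType(self.func, target)


class MergingDict(dict):
    @hybridmethod
    def merged(this, *args, **kwargs):
        if isinstance(this, type):
            new = this()            # called from the class: start empty
        else:
            new = type(this)(this)  # called from an instance: shallow copy
        for arg in args:
            new.update(arg)
        new.update(kwargs)
        return new


defaults = MergingDict(colour='blue', size=1)
from_instance = defaults.merged({'size': 2}, verbose=True)
from_class = MergingDict.merged([('colour', 'red')], {'size': 3})
```

Called on the instance, the instance's own items take part in the merge; called on the class, only the arguments do, and the first argument may be a list of (key, item) pairs.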
On 29.07.2014 15:35, Steven D'Aprano wrote:
[snip]
I really like the semantics of that. This allows for concise, and in my opinion, clearly readable code. Although I think maybe one should have two separate methods: the class method being called ``merged`` and the instance method called ``merged_with``. I find result = somedict.merged(b, c) somewhat less clear than result = somedict.merged_with(b, c) regards, jwi
On 07/28/2014 10:34 PM, Steven D'Aprano wrote:
What else could it do besides return a new copy of dict1 updated with dict2 contents? It's an unordered container, so it wouldn't append, and the duplicate keys would be resolved based on the order of evaluation. I don't see any problem with that. I also don't know of any other obvious way to combine two dictionaries. The argument against it, may simply be that it's a feature by design, to have dictionaries unique enough so that code which handles them is clearly specific to them. I'm not sure how strong that logic is though.
Well, here is how they currently compare.
They do have quite a lot in common already. The usefulness of different types having the same methods is that external code can be less specific to the objects they handle. Of course, if those like methods act too differently they can be surprising as well. That may be the case if '+' and '+=' are used to update dictionaries, but then again, maybe not. (?)
That's where examples will have an advantage over an initial personal opinion. Not that initial opinions aren't useful at first to express support or non-support. I could have just used +1. ;-) Cheers, Ron
On Tue, Jul 29, 2014 at 06:12:16PM -0500, Ron Adam wrote on the similarity of lists and dicts: [...]
Now strip out the methods which are common to pretty much all objects, in other words just look at the ones which are common to mapping and sequence APIs but not to objects in general:

    {'copy', '__ge__', '__delitem__', '__getitem__', 'pop', '__gt__',
     'clear', '__len__', '__le__', '__contains__', '__lt__',
     '__setitem__', '__iter__'}

And now look a little more closely:

- although dicts and lists both support order comparisons like > and <, you cannot compare a dict to a list in Python 3;
- although dicts and lists both support a pop method, their signatures are different; x.pop() will fail if x is a dict, and x.pop(k, d) will fail if x is a list;
- although both support membership testing "a in x", what is being tested is rather different; if x is a dict, then a must be a key, but the analog of keys for lists is the index, not the value.

So the similarities between list and dict are:

* both have a length
* both are iterable
* both support subscripting operations x[i]
  * although dicts don't support slicing x[i:j:k]
* both support a copy() method
* both support a clear() method

That's not a really big set of operations in common, and they're rather general. The real test is, under what practical circumstances would you expect to freely substitute a list for a dict or vice versa, and what could you do with that object when you received it? For me, the only answer that comes readily to mind is that the dict constructor accepts either another dict or a list of (key,item) pairs. [...]
I don't think that it is reasonable to treat dicts and lists as having a lot in common. They have a little in common, by virtue of both being containers, but then a string bag and a 40ft steel shipping container are both containers too, so that doesn't imply much similarity :-) It seems to me that outside of utterly generic operations like iteration, conversion to string and so on, lists do not quack like dicts, and dicts do not swim like lists, in any significant sense. -- Steven
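The comparison above can be reproduced roughly as follows. This is a sketch only: the exact set of shared names depends on the Python version, so no particular output is claimed, and the comparison methods drop out in Python 3 because object itself defines them.

```python
# Names defined by both dict and list but not by object in general.
shared = (set(dir(dict)) & set(dir(list))) - set(dir(object))
print(sorted(shared))

# Same method name, different signatures:
try:
    {'k': 1}.pop()            # dict.pop requires a key
    dict_pop_without_key = True
except TypeError:
    dict_pop_without_key = False

try:
    ['x'].pop('k', None)      # list.pop takes at most an index
    list_pop_with_default = True
except TypeError:
    list_pop_with_default = False

# Same operator, different meaning: keys versus values.
key_in_dict = 0 in {0: 'x'}   # True: 0 is a key
index_in_list = 0 in ['x']    # False: membership tests values, not indexes
```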
On 29 Jul 2014 04:40, "Ryan Hiebert" <ryan@ryanhiebert.com> wrote:
On Jul 28, 2014, at 11:04 AM, Steven D'Aprano <steve@pearwood.info> wrote:
Note that if update() was changed to accept multiple args, the dict() constructor could similarly be updated. Then:

    x = dict(a)
    x.update(b)
    x.update(c)
    x.update(d)

Would become:

    x = dict(a, b, c, d)

Aside from the general "What's the use case that wouldn't be better served by a larger scale refactoring?" concern, my main issue with that approach would be the asymmetry it would introduce with the set constructor (which disallows multiple arguments to avoid ambiguity in the single argument case).

But really, I'm not seeing a compelling argument for why this needs to be a builtin. If someone is merging dicts often enough to care, they can already write a function to do the dict copy-and-update as a single operation. What makes this more special than the multitude of other three line functions in the world?

Cheers, Nick.
In addition, dict(A, **B) is not something you easily stumble upon when your goal is "merge two dicts"; nor is it even clear that that's what it is when you read it for the first time. All signs of too-clever hacks in my book. On Mon, Jul 28, 2014 at 8:33 AM, <dw+python-ideas@hmmz.org> wrote:
-- --Guido van Rossum (python.org/~guido)
On 29 July 2014 02:08, Guido van Rossum <guido@python.org> wrote:
I try to convince students to learn and *use* Python. If I tell students that to merge 2 dictionaries they have to write dict(A, **B) or {**A, **B}, that seems less clear (not something you "stumble across", as Guido says) than A + B; then we still have to tell them the rules of the operation, as usual for any operation. It does not have to be "+"; it could be the "union" operator "|" that is used for sets, where s.update(t) is the same as s |= t ... and accordingly D = A | B | C. Maybe this operator is better, as this equivalence is already being used (for sets). Accordingly "union(A,B)" could do a merge operation and return the new dict(). (This then still allows people who want "+" to add the values to be made happy in the long run.) -Alexander
On Jul 28, 2014, at 13:59, Alexander Heger <python@2sn.net> wrote:
The difference is that with sets, it (at least conceptually) doesn't matter whether you keep elements from s or t when they collide, because by definition they only collide if they're equal, but with dicts, it very much matters whether you keep items from s or t when their keys collide, because the corresponding values are generally _not_ equal. So this is a false analogy; the same problem raised in the first three replies on this thread still needs to be answered: Is it obvious that the values from b should overwrite the values from a (assuming that's the rule you're suggesting, since you didn't specify; translate to the appropriate question if you want a different rule) in all real-life use cases? If not, is this so useful that the benefits in some uses outweigh the almost certain confusion in others? Without a compelling "yes" to one of those two questions, we're still at square one here; switching from + to | and making an analogy with sets doesn't help.
Wouldn't you expect a top-level union function to take any two iterables and return the union of them as a set (especially given that set.union accepts any iterable for its non-self argument)? A.union(B) seems a lot better than union(A, B). Then again, A.updated(B) or updated(A, B) might be even better, as someone suggested, because the parallel between update and updated (and between e.g. sort and sorted) is not at all problematic.
yes, one does have to deal with collisions and spell out a clear rule: same behaviour as update(). I was less uneasy about the | operator 1) it is already used the same way for collections.Counter [this is a quite strong constraint] 2) in shells it is used as "pipe" implying directionality - order matters yes, you are wondering whether the order should be this or that; you just *define* what it is, same as you do for subtraction. Another way of looking at it is to say that even in sets you take the second, but because they are identical it does not matter ;-) -Alexander
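For reference, a small sketch of what collections.Counter's operators do with colliding keys. The sample counters are made up; the behaviour shown is that | keeps the per-key maximum and + adds the counts, so | agrees with update()-style "right operand wins" only when the right-hand count happens to be the larger one.

```python
from collections import Counter

a = Counter(a=1, b=1)
b = Counter(a=2, c=2)

union = a | b   # per-key maximum of the counts
total = a + b   # per-key sum of the counts

# Plain-dict update semantics for comparison: the right operand wins.
updated = dict(a)
updated.update(b)

# A case where the two rules disagree:
high = Counter(a=5)
low = Counter(a=1)
max_wins = (high | low)['a']        # maximum keeps 5
right_wins = dict(high)
right_wins.update(low)              # update keeps the right-hand value, 1
```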
I'll regret jumping in here, but while dict(A, **B) as a way to merge two dicts A and B makes some sense, it has two drawbacks: (1) slow (creates an extra copy of B as it creates the keyword args structure for dict()) and (2) not general enough (doesn't support key types other than str). On Mon, Jul 28, 2014 at 7:59 AM, Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)
On 7/26/2014 7:34 PM, Alexander Heger wrote:
Is there a good reason for not implementing the "+" operator for dict.update()?
As you immediately noticed, this is an incoherent request as stated. A op B should be a new object.
Since "B op= A" is *defined* as resulting in B having the value of "B op A", with the operations possibly being done in-place if B is mutable, we would first have to define addition on dicts.
You have this backwards. Dict addition would have to come first, and there are multiple possible and contextually useful definitions. The idea of choosing any one of them as '+' has been rejected. As indicated, augmented dict addition would follow from the choice of dict addition. It would not necessarily be equivalent to .update. The addition needed to make this true would be asymmetric, like catenation. But unlike sequence catenation, information is erased in that items in the updated dict get subtracted. Conceptually, update is replacement rather than just addition.
My apologies if this has been posted
Multiple dict additions have been proposed and discussed here on python-ideas and probably on python-list. -- Terry Jan Reedy
Dear Terry,
I had set out wanting to have a short form for dict.update(), hence the apparently reversed order. The proposed full addition does the same after first making a shallow copy; the operator interface does define both __iadd__ and __add__.
yes. As I note, most uses of the "+" operator in Python are not symmetric (commutative).
Yes, not being able to have multiple identical keys is the nature of dictionaries. This does not mean that things should not be done in the best way they can be done. I was considering the set union operator "|" but that is also symmetric and may cause more confusion. Another consideration suggested was the element-wise addition in some form. This is the natural way of doing things for structures of fixed length like arrays, including numpy arrays. And this is being accepted. In contrast, for data structures with variable length, like lists and strings, "addition" is concatenation, and what I would see as the most natural extension for dictionaries is hence to add the keys (not the key values or values to each other), with the common behavior to overwrite existing keys. You do have the choice in which order you write the operation. It would be funny if addition of strings would add their ASCII, char, or unicode values and return the resulting string. Sorry for bringing up, again, the old discussion of how to add dictionaries as part of this. -Alexander On 27 July 2014 11:27, Terry Reedy <tjreedy@udel.edu> wrote:
On Sat, Jul 26, 2014 at 7:34 PM, Alexander Heger <python@2sn.net> wrote:
Here are two threads that had some discussion of this: https://mail.python.org/pipermail/python-ideas/2011-December/013227.html and https://mail.python.org/pipermail/python-ideas/2013-June/021140.html. Seems like a useful feature if there could be a clean way to spell it. Cheers, Nathan
On 28 July 2014 19:58, Nathan Schneider <nathan@cmu.edu> wrote:
Here are two threads that had some discussion of this: https://mail.python.org/pipermail/python-ideas/2011-December/013227.html
This doesn't seem to have a use case, other than "it would be nice".
https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.
This can be handled using ChainMap, if I understand the proposal.
Seems like a useful feature if there could be a clean way to spell it.
I've yet to see any real-world situation when I've wanted "dictionary addition" (with any of the various semantics proposed here) and I've never encountered a situation where using d1.update(d2) was sufficiently awkward that having an operator seemed reasonable. In all honesty, I'd suggest that code which looks bad enough to warrant even considering this feature is probably badly in need of refactoring, at which point the problem will likely go away. Paul
On Jul 28, 2014, at 12:21, Paul Moore <p.f.moore@gmail.com> wrote:
When the underlying dicts and desired combined dict are all going to be used immutably, ChainMap is the perfect answer. (Better than an "updated" function for performance if nothing else.) And usually, when you're looking for a non-mutating combine-dicts operation, that will be what you want. But usually isn't always. If you want a snapshot of the combination of mutable dicts, ChainMap is wrong. If you want to be able to mutate the result, ChainMap is wrong. All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.) And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.
On Mon, Jul 28, 2014 at 10:20 PM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
When the underlying dicts and desired combined dict are all going to be used immutably, ChainMap is the perfect answer. (Better than an "updated" function for performance if nothing else.) And usually, when you're looking for a non-mutating combine-dicts operation, that will be what you want.
But usually isn't always. If you want a snapshot of the combination of mutable dicts, ChainMap is wrong. If you want to be able to mutate the result, ChainMap is wrong.
In those cases, do dict(ChainMap(...)).
All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.)
And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.
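A brief sketch of the difference between a ChainMap view and the dict(ChainMap(...)) snapshot suggested above. The dict names and contents are made up for illustration.

```python
from collections import ChainMap

defaults = {'colour': 'blue', 'size': 1}
overrides = {'size': 2}

view = ChainMap(overrides, defaults)  # lookups search overrides first
snapshot = dict(view)                 # independent, mutable copy

# A later change to a source dict shows up in the view, not the snapshot.
defaults['colour'] = 'green'
snapshot['extra'] = True              # does not touch the source dicts
```

The view costs no copying but tracks its sources; the snapshot pays for one copy and is then free-standing.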
For many applications you may not care one way or the other, only for some you do, and only then you need to know the details of operation. My point is to make the dict() data structure more easy to use for most users and use cases. Especially novices. This is what adds power to the language. Not that you can do things (Turing machines can) but that you can do them easily and naturally.
On 29 Jul 2014 08:22, "Alexander Heger" <python@2sn.net> wrote:
But why is dict merging into a *new* dict something that needs to be done as a single expression? What's the problem with spelling out "to merge two dicts into a new one, first make a dict, then merge in the other one":

    x = dict(a)
    x.update(b)

That's the real competitor here, not the more cryptic "x = dict(a, **b)".

You can even use it as an example of factoring out a helper function:

    def copy_and_update(a, *args):
        x = dict(a)
        for arg in args:
            x.update(arg)
        return x

My personal experience suggests that's a rare enough use case that it's fine to leave it as a trivial helper function that people can write if they need it. The teaching example isn't compelling, since in the teaching case, spelling out the steps is going to be necessary anyway to explain what the function or method call is actually doing.

Cheers, Nick.
it is more about having easy operations for people who learn Python for the sake of using it (besides, I teach science students not computer science students). The point is that it could be done in one operation. It seems like asking people to write

    a = 2 + 3

as

    a = int(2)
    a.add(3)

Turing machine vs modern programming language. It does already work for Counters.

The discussion seems to go such that because people can't agree whether the first or second occurrence of keys takes precedence, or what operator to use (already decided by the design of Counter) it is not done at all.

To be fair, I am not a core Python programmer and am asking others to implement this - or maybe even agree it would be useful -, maybe pushing too much where just an idea should be floated.

-Alexander
On Jul 28, 2014, at 16:45, Alexander Heger <python@2sn.net> wrote:
Well, yeah, that happens a lot. A good idea that can't be turned into a concrete design that fits the language and makes everyone happy doesn't get added, unless it's so ridiculously compelling that nobody can imagine living without it. But that's not necessarily a bad thing--it's why Python is a relatively small and highly consistent language, which I think is a big part of why Python is so readable and teachable. Anyway, I think you're on to something with your idea of adding an updated or union or whatever function/method whose semantics are obvious, and then mapping the operators to that method and update. I can definitely buy that a.updated(b) or union(a, b) favors values from b for exactly the same reason a.update(b) does (although as I mentioned I have other problems with a union function). Meanwhile, if you have use cases for which ChainMap is not appropriate, you might want to write a dict subclass that you can use in your code or in teaching students or whatever, so you can amass some concrete use cases and show how much cleaner it is than the existing alternatives.
If it helps, if you can get everyone to agree on this, except that none of the core devs wants to do the work, I'll volunteer to write the C code (after I finish my io patch and my abc patch...), so you only have to add the test cases (which are easy Python code; the only hard part is deciding what to test) and the docs.
I often want to call functions with added (or removed, replaced) keywords from the call.

    args0 = dict(...)
    args1 = dict(...)

    def f(**kwargs):
        g(**(args0 | kwargs | args1))

currently I have to write

    args0 = dict(...)
    args1 = dict(...)

    def f(**kwargs):
        temp_args = dict(args0)
        temp_args.update(kwargs)
        temp_args.update(args1)
        g(**temp_args)

It would also make the proposed feature to allow multiple kw args expansions in Python 3.5 easy to write by having f(**a, **b, **c) be equivalent to f(**(a | b | c))

-Alexander
yes, this (modify) is what I do.

In any case, it would still be

    g(**collections.ChainMap(args1, kwargs, args0))

In either case a new dict is created and passed to g as kwargs. It's not pretty, but it does work. Thanks.

So the general case

    D = A | B | C

becomes

    D = dict(collections.ChainMap(C, B, A))

(someone may suggest dict could have a "chain" constructor class method D = dict.chain(C, B, A))
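The reversed argument order is easy to get wrong, so here is a small check, with made-up sample dicts, that dict(ChainMap(C, B, A)) gives the same result as copying A and then updating with B and C in turn. ChainMap returns the value from the first mapping that has the key, which is why the highest-priority dict goes first.

```python
from collections import ChainMap

A = {'k': 'from A', 'a': 1}
B = {'k': 'from B', 'b': 2}
C = {'k': 'from C', 'c': 3}

# Update chain: later dicts overwrite earlier ones.
D = dict(A)
D.update(B)
D.update(C)

# ChainMap: earlier mappings shadow later ones, so the order is reversed.
chained = dict(ChainMap(C, B, A))
```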
On 29 July 2014 00:04, Alexander Heger <python@2sn.net> wrote:
This immediately explains the key problem with this proposal. It never even *occurred* to me that anyone would expect C to take priority over A in the operator form. But the ChainMap form makes it immediately clear to me that this is the intent. An operator form will be nothing but a maintenance nightmare and a source of bugs. Thanks for making this obvious :-) -1. Paul
On 29.07.2014 08:22, Paul Moore wrote:
FWIW, one could use an operator which inherently shows a direction: << and >>, for both directions respectively. A = B >> C lets B take precedence, and A = B << C lets C take precedence. regards, jwi p.s.: I’m not entirely sure what to think about my suggestion---I’d like to hear opinions.
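A rough sketch of what this suggestion might look like as a dict subclass. The class name ShiftDict and the sample data are hypothetical, and this is only one possible reading of the proposal, not an agreed design.

```python
class ShiftDict(dict):
    """Hypothetical subclass, only to illustrate the << / >> idea."""

    def __lshift__(self, other):
        # self << other: values from `other` take precedence.
        result = type(self)(self)
        result.update(other)
        return result

    def __rshift__(self, other):
        # self >> other: values from `self` take precedence.
        result = type(self)(other)
        result.update(self)
        return result


B = ShiftDict(a=1, b=1)
C = ShiftDict(a=2, c=2)

b_wins = B >> C   # B's value for 'a' is kept
c_wins = B << C   # C's value for 'a' is kept
print(b_wins, c_wins)
```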
On 29 July 2014 12:56, Jonas Wielicki <j.wielicki@sotecware.net> wrote:
Personally, I don't like it much more than the symmetric-looking operators. I get your point, but it feels like you're just patching over a relatively small aspect of a fundamentally bad idea. But then again as I've already said, I see no need for any of this, the existing functionality seems fine to me. Paul
On Tue, Jul 29, 2014 at 7:56 AM, Jonas Wielicki <j.wielicki@sotecware.net> wrote:
If there is to be an operator devoted specifically to this, I like << and >> as unambiguous choices. Proof: https://mail.python.org/pipermail/python-ideas/2011-December/013232.html :)
I am also partial to the {**A, **B} proposal in http://legacy.python.org/dev/peps/pep-0448/. Cheers, Nathan
On 30.07.2014 00:46, Greg Ewing wrote:
As already noted elsewhere (to continue playing devil's advocate), it's not an addition or union anyway. It’s not a union because it is lossy, and since it is not commutative it’s not something I’d call addition either. One can, however, certainly see it as shifting the elements from dict A over dict B. regards, jwi
On Tue, Jul 29, 2014 at 07:22:34AM +0100, Paul Moore wrote:
Hmmm. Funny you say that, because to me that is a major disadvantage of the ChainMap form: you have to write the arguments in reverse order.

Suppose that we want to start with a, then override it with b, then override that with c. Since a is the start (the root, the base), we start with a, something like this:

    d = {}
    d.update(a)
    d.update(b)
    d.update(c)

If update was chainable as it would be in Ruby:

    d.update(a).update(b).update(c)

or even:

    d.update(a, b, c)

This nicely leads us to d = a+b+c (assuming we agree that + meaning merge is the spelling we want). The ChainMap, on the other hand, works backwards from this perspective: the last dict to be merged has to be given first:

    ChainMap(c, b, a)

-- Steven
On Tuesday, July 29, 2014 7:36 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I think that's pretty much exactly his point: To him, it's obvious that + should be in the order of ChainMap, and he can't even conceive of the possibility that you'd want it "backward". To you, it's obvious that + should be the other way around, and you find it annoying that ChainMap is "backward". Which seems to imply that any attempt at setting an order is going to not only seem backward, but possibly surprisingly so, to a subset of Python's users.

And this is the kind of thing that can lead to subtle bugs. If a and b _almost never_ have duplicate keys, but very rarely do, you won't catch the problem until you think to test for it. And if one order or the other is so obvious to you that you didn't even imagine anyone would ever implement the opposite order, you probably won't think to write the test until you have a bug in the field…
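To make the failure mode concrete, here is a small sketch with two hypothetical helper functions (neither is an existing API) implementing the two candidate orders. With disjoint keys they agree, so a test suite without colliding keys cannot tell them apart.

```python
def merge_right_wins(a, b):
    # Hypothetical helper: copy a, then let b override (like a.update(b)).
    result = dict(a)
    result.update(b)
    return result


def merge_left_wins(a, b):
    # Hypothetical helper: the opposite rule, a overrides b.
    result = dict(b)
    result.update(a)
    return result


# No shared keys: both rules give the same mapping.
no_clash_a = {'host': 'localhost'}
no_clash_b = {'port': 8080}
agree = merge_right_wins(no_clash_a, no_clash_b) == merge_left_wins(no_clash_a, no_clash_b)

# One shared key: the two rules silently disagree.
clash_a = {'host': 'localhost', 'port': 80}
clash_b = {'port': 8080}
right = merge_right_wins(clash_a, clash_b)
left = merge_left_wins(clash_a, clash_b)

print(agree, right['port'], left['port'])
```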
On 30 July 2014 05:29, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
I think this is a nice way of explaining the concern. I'll also note that, given we turned a whole pile of similarly subtle data driven bugs into structural type errors in the Python 3 transition, I'm not exactly enamoured of the idea of adding more :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 29 Jul 2014 08:16, "Alexander Heger" <python@2sn.net> wrote:
The first part of this is one of the use cases for functools.partial(), so it isn't a compelling argument for easy dict merging. The above is largely an awkward way of spelling:

    import functools
    f = functools.partial(g, **...)

The one difference is to also silently *override* some of the explicitly passed arguments, but that part's downright user hostile and shouldn't be encouraged.

Regards, Nick.
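A small sketch of the functools.partial() spelling, with a stand-in g and made-up defaults (both are assumptions for illustration). Note that keywords given at call time override the ones stored in the partial, which covers the "defaults that the caller may replace" half of the use case but not the "silently override the caller" half.

```python
import functools


def g(**kwargs):
    # Stand-in for the real function; it just returns what it received.
    return kwargs


# Hypothetical default keywords.
defaults = {'color': 'red', 'size': 1}
f = functools.partial(g, **defaults)

plain = f()              # only the stored defaults
overridden = f(size=5)   # the call-time keyword replaces the default
print(plain, overridden)
```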
yes, poor example due to brevity. ;-)

In my case f would actually do something with the values of kwargs before calling g, and args1 may not be static outside f (hence partial is not a solution for the full application):

    def f(**kwargs):
        # do something with kwargs, create dict0 and dict1 using kwargs
        temp_args = dict(dict0)
        temp_args.update(kwargs)
        temp_args.update(dict1)
        g(**temp_args)
        # more uses of dict0

which could be

    def f(**kwargs):
        # do something with kwargs, create dict0 and dict1 using kwargs
        g(**collections.ChainMap(dict1, kwargs, dict0))
        # more uses of dict0

Maybe good enough for that case; as with + or |, one still needs to know/learn the lookup order for key replacement, and it is sort of bulky.

-Alexander
https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.
I see, this is a very extended thread google did not show me when I started this one, and many good points were made there. So, my apologies I restarted this w/o reference; this discussion does seem to resurface, however. It seems it would be valuable to parallel the behaviour of operators already in place for collections. Counter: A + B adds values (calls __add__ or __iadd__ function of values, likely __iadd__ for values of A) A |= B does A.update(B) etc. -Alexander On 29 July 2014 05:21, Paul Moore <p.f.moore@gmail.com> wrote:
Alexander Heger writes:
It seems it would be valuable to parallel the behaviour of operators already in place for collections.
Mappings aren't collections. In set theory, of course, they are represented as *appropriately restricted* collections, but the meaning of "+" as applied to mappings in mathematics varies. For functions on the same domain, there's usually an element-wise meaning that's applied. For functions on different domains, I've seen it used to mean "apply the appropriate function on the disjoint union of the domains". I don't think there's an obvious winner in the competition among the various meanings.
Alexander Heger writes:
I mistyped. It should have read " ... the behaviour in place for collections.Counter"
But there *is* a *the* (ie, unique) "additive" behavior for Counter. (At least, I find it reasonable to think so.) What you're missing is that there is no such agreement on what it means to add dictionaries. True, you can "just pick one". Python doesn't much like to do that, though. The problem is that on discovering that dictionaries can be added, *everybody* is going to think that their personal application is the obvious one to implement as "+" and/or "+=". Some of them are going to be wrong and write buggy code as a consequence.
On 29 July 2014 13:13, Stephen J. Turnbull <stephen@xemacs.org> wrote:
In fact, the existence of collections.Counter.__add__ is an argument *against* introducing dict.__add__ with different semantics:

    >>> issubclass(collections.Counter, dict)
    True

So, if someone *wants* a dict with "addable" semantics, they can already use collections.Counter. While some of its methods really only work with integers, the addition part is actually usable with arbitrary addable types.

If set-like semantics were added to dict, it would conflict with the existing element-wise semantics of Counter.

Cheers, Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
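For reference, a short sketch of the existing Counter operators on made-up data. As far as I know, + adds the values per key and | keeps the per-key maximum, and both drop results that are not positive; the float example is only there to show that non-integer numeric values go through the same code path.

```python
from collections import Counter

# Illustrative data only.
A = Counter(a=1, b=1)
B = Counter(a=2, c=2)

added = A + B   # element-wise sum of the values
union = A | B   # element-wise maximum of the values

# Numeric values other than int are combined the same way.
weights = Counter(x=1.5) + Counter(x=2.5, y=0.5)

print(added, union, weights)
```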
On 7/28/2014 8:16 PM, Stephen J. Turnbull wrote:
This assumes the same range set (of addable items) also. If Python were to add d1 + d2 and d1 += d2, I think we should use this existing and most common definition and add values. The use cases are keyed collections of things that can be added, which are pretty common. Then dict addition would have the properties of the value addition.

Example: Let sales be a mapping from salesperson to total sales (since whenever). Let sales_today be a mapping from salesperson to today's sales. Then sales = sales + sales_today, or sales += sales_today. I could, of course, do this today with class Sales(dict): with __add__, __iadd__, and probably other app-specific methods.

The issue is that there are two ways to update a mapping with an update mapping: replace values and combine values. Addition combines, so to me, dict addition, if defined, should combine.
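The Sales class mentioned above could be sketched roughly as follows. The method bodies and the sample figures are assumptions; in particular, treating a missing salesperson as 0 is one choice among several.

```python
class Sales(dict):
    """Hypothetical mapping from salesperson to total sales."""

    def __iadd__(self, other):
        # Combine values instead of replacing them.
        for person, amount in other.items():
            self[person] = self.get(person, 0) + amount
        return self

    def __add__(self, other):
        # Work on a copy so neither operand is modified.
        result = Sales(self)
        result += other
        return result


# Made-up figures.
sales = Sales(alice=100, bob=50)
sales_today = Sales(bob=25, carol=10)

total = sales + sales_today   # new mapping; sales is unchanged here
sales += sales_today          # in-place version
print(total, sales)
```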
According to https://en.wikipedia.org/wiki/Disjoint_union, d_u has at least two meanings. -- Terry Jan Reedy
Terry Reedy writes:
IMHO[1] that's way too special for the generic mapping types. If one wants such operations, she should define NumericValuedMapping and StringValuedMapping etc classes for each additive set of values.
Either meaning will do here, with the distinction that the set-theoretic meaning (which I intended) applies to any two functions, while the alternate meaning imposes a restriction on the functions that can be added (and therefore is inappropriate for this discussion IMHO). Footnotes: [1] I mean the "H", I'm no authority.
On Mon, Jul 28, 2014 at 5:16 PM, Stephen J. Turnbull <stephen@xemacs.org> wrote:
The former meaning requires that the member types support addition, so it's the obvious loser -- dicts can contain any kind of value, not just addable ones. Adding a method that only works if the values satisfy certain extra optional constraints is rare in Python, and needs justification over the alternatives. The second suggestion works just fine, you just need to figure out what to do with the intersection since we won't have disjoint domains. The obvious suggestion is to pick an ordering, just like the update method does. For another angle: the algorithms course I took in university introduced dictionaries as sets where the members of the set are tagged with values. This makes set-like operators obvious in meaning, with the only question being, again, what to do with the tags during collisions. (FWIW, the meaning of + as applied to sets is generally union -- but Python's set type uses | instead, presumably for analogy with ints when they are treated as a set of small integers). That said, the only reason I can think of to support this new stuff is to stop dict(x, **y) from being such an attractive nuisance. -- Devin
On 27 July 2014 02:17, Steven D'Aprano <steve@pearwood.info> wrote:
One good reason is that people are still convinced "dict(A, **B)" makes some kind of sense. But really, we have collections.ChainMap, dict addition is confusing and there's already a PEP (python.org/dev/peps/pep-0448) that has a solution I prefer ({**A, **B}).
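For context, PEP 448 was later accepted, and the {**A, **B} spelling works in Python 3.5 and newer. A minimal sketch on made-up dicts, showing that later unpackings win and that keys do not have to be strings:

```python
# Illustrative data only; requires Python 3.5 or newer.
A = {'a': 1, 'b': 1}
B = {'a': 2, 'c': 2, 3: 'three'}

# Later unpackings override earlier ones, like a chain of update() calls.
merged = {**A, **B}
print(merged)
```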
On Mon, Jul 28, 2014 at 06:26:13AM +0100, Joshua Landau wrote:
Explain please. dict(A, **B) makes perfect sense to me, and it works perfectly too. It's a normal constructor call, using the same syntax as any other function or method call. Are you suggesting that it does not make sense? -- Steven
On Tue, Jul 29, 2014 at 12:59:51AM +1000, Steven D'Aprano wrote:
It worked in Python 2, but Python 3 added code to explicitly prevent the kwargs mechanism from being abused by passing non-string keys. Effectively, the only reason it worked was due to a Python 2.x kwargs implementation detail. It took me a while to come to terms with this one too, it was really quite a nice hack. But that's all it ever was. The domain of valid keys accepted by **kwargs should never have exceeded the range supported by the language syntax for declaring keyword arguments. David
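A quick way to see the restriction described above, assuming CPython 3 and made-up dicts: the call fails as soon as the second mapping has a key that is not a string.

```python
A = {'a': 1}
B = {2: 'two'}   # a non-string key

try:
    merged = dict(A, **B)
except TypeError as exc:
    # Python 3 rejects non-string keyword names.
    error = exc
else:
    error = None

# With string keys only, the same spelling works.
string_keys_ok = dict(A, **{'b': 2})
print(type(error).__name__, string_keys_ok)
```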
On Mon, Jul 28, 2014 at 03:33:06PM +0000, dw+python-ideas@hmmz.org wrote:
/face-palm

Ah of course! You're right, using dict(A, **B) isn't general enough.

I'm still inclined to prefer allowing update() to accept multiple arguments:

    a.update(b, c, d)

rather than

    a += b + c + d

which suggests that maybe there ought to be an updated() built-in. Let the bike-shedding begin: should such a thing be spelled?

    new_dict = a + b + c + d

Pros: + is short to type; subclasses can control the type of new_dict.
Cons: dict addition isn't obvious.

    new_dict = updated(a, b, c, d)

Pros: analogous to sort/sorted, reverse/reversed.
Cons: another built-in; isn't very general, only applies to Mappings.

    new_dict = a.updated(b, c, d)

Pros: only applies to mappings, so it should be a method; subclasses can control the type of the new dict returned.
Cons: easily confused with dict.update

-- Steven
On 07/28/2014 11:04 AM, Steven D'Aprano wrote:
To me, the constructor and update method should be as near alike as possible. So I think if it's done in the update method, it should also work in the constructor. And other type constructors, such as list, should work in similar ways as well. I'm not sure that going in this direction would be good in the long term.
I think it's more obvious. It only needs __add__ and __iadd__ methods to make it consistent with the list type. The con is that somewhere someone could be catching TypeError to differentiate dict from other types while adding. But it's just as likely they are doing so in order to add them after a TypeError occurs. I think this added consistency between lists and dicts would be useful. But putting __add__ and __iadd__ methods on dicts seems like something that was probably discussed at length before, and I wonder what reasons were given for not doing it then. Cheers, Ron
On Mon, Jul 28, 2014 at 12:17:10PM -0500, Ron Adam wrote:
On 07/28/2014 11:04 AM, Steven D'Aprano wrote:
[...]
What I meant was that it wasn't obvious what dict1 + dict2 should do, not whether or not the __add__ method exists.
I think this added consistency between lists and dicts would be useful.
Lists and dicts aren't the same kind of object. I'm not sure it is helpful to force them to be consistent. Should list grow an update() method to make it consistent with dicts? How about setdefault()?

As for being useful, useful for what? Useful how often? I'm sure that one could take any piece of code, no matter how obscure, and say it is useful *somewhere* :-) but the question is whether it is useful enough to be part of the language.

I was wrong to earlier dismiss the OP's usecase for dict addition by suggesting dict(a, **b). Such a thing only works if all the keys of b are valid identifiers. But that doesn't mean that just because my shoot-from-the-hip response missed the target that we should conclude that dict addition solves an important problem or that + is the correct way to spell it.

I'm still dubious that it's needed, but if it were, this is what I would prefer to see:

* should be a Mapping method, not a top-level function;
* should accept anything the dict constructor accepts, mappings or lists of (key,value) pairs as well as **kwargs;
* my preferred name for this is now "merged" rather than "updated";
* it should return a new mapping, not modify in-place;
* when called from a class, it should behave like a class method: MyMapping.merged(a, b, c) should return an instance of MyMapping;
* but when called from an instance, it should behave like an instance method, with self included in the chain of mappings to merge: a.merged(b, c) rather than a.merged(a, b, c).

I have a descriptor type which implements the behaviour from the last two bullet points, so from a technical standpoint it's not hard to implement this. But I can imagine a lot of push-back from the more conservative developers about adding a *fourth* method type (even if it is private) to the Python builtins, so it would take a really compelling use-case to justify adding a new method type and a new dict method.
(Personally, I think this hybrid class/instance method type is far more useful than staticmethod, since I've actually used it in production code, but staticmethod isn't going away.) -- Steven
On Monday, July 28, 2014 8:34 PM, Steven D'Aprano <steve@pearwood.info> wrote: [snip]
How is this different from a plain-old (builtin or normal) method?
<function union> This is the way methods have always worked (although the details of how they worked under the covers changed in 3.0, and before that when descriptors and new-style classes were added).
On Mon, Jul 28, 2014 at 11:15:44PM -0700, Andrew Barnert wrote:
I see I failed to explain clearly, sorry about that.

With class methods, the method always receives the class as the first argument. Regardless of whether you write dict.fromkeys or {1:'a'}.fromkeys, the first argument is the class, dict.

With instance methods, the method receives the instance. If you call it from a class, the method is "unbound" and you are responsible for providing the "self" argument.

To me, this hypothetical merged() method sometimes feels like an alternative constructor, like fromkeys, and therefore best written as a class method, but sometimes like a regular method. Since it feels like a hybrid to me, I think a hybrid descriptor approach is best, but as I already said I can completely understand if conservative developers reject this idea.

In the hybrid form I'm referring to, the first argument provided is the class when called from the class, and the instance when called from an instance. Imagine it written in pure Python like this:

    class dict:
        @hybridmethod
        def merged(this, *args, **kwargs):
            if isinstance(this, type):
                # Called from the class
                new = this()
            else:
                # Called from an instance.
                new = this.copy()
            for arg in args:
                new.update(arg)
            new.update(kwargs)
            return new

If merged is a class method, we can avoid having to worry about the case where your "a" mapping happens to be a list of (key,item) pairs:

    a.merged(b, c, d)  # Fails if a = [(key, item), ...]
    dict.merged(a, b, c, d)  # Always succeeds.

It also allows us to easily specify a different mapping type for the result:

    MyMapping.merged(a, b, c, d)

although some would argue this is just as clear:

    MyMapping().merged(a, b, c, d)

albeit perhaps not quite as efficient if MyMapping is expensive to instantiate. (You create an empty instance, only to throw it away again.)

On the other hand, there are use-cases where merged() best communicates the intent if it is a regular instance method. Consider:

    settings = application_defaults.merged(
        global_settings,
        user_settings,
        commandline_settings)

seems more clear to me than:

    settings = dict.merged(
        application_defaults,
        global_settings,
        user_settings,
        commandline_settings)

especially in the case that application_defaults is a dict literal.

tl;dr It's not often that I can't decide whether a method ought to be a class method or an instance method, the decision is usually easy, but this is one of those times.

-- Steven
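The message does not show the hybridmethod descriptor itself, so here is one possible sketch of it. This is an assumption about how such a descriptor could be written, not Steven's actual code; MergeDict and the sample data are hypothetical. It also uses type(this)(this) rather than this.copy(), since dict.copy() on a subclass instance returns a plain dict.

```python
import types


class hybridmethod:
    """Hypothetical descriptor: binds to the class or to the instance."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        # Accessed on the class: instance is None, so bind the class.
        # Accessed on an instance: bind the instance, like a normal method.
        target = owner if instance is None else instance
        return types.MethodType(self.func, target)


class MergeDict(dict):
    @hybridmethod
    def merged(this, *args, **kwargs):
        if isinstance(this, type):
            new = this()             # called from the class: start empty
        else:
            new = type(this)(this)   # called from an instance: start from a copy
        for arg in args:
            new.update(arg)
        new.update(kwargs)
        return new


defaults = MergeDict(colour='red', size=1)
from_instance = defaults.merged({'size': 2}, [('shape', 'round')])
from_class = MergeDict.merged([('a', 1)], {'a': 2}, b=3)
print(from_instance, from_class)
```

Both calls return a new MergeDict and leave defaults untouched.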
On 29.07.2014 15:35, Steven D'Aprano wrote:
[snip]
I really like the semantics of that. This allows for concise, and in my opinion, clearly readable code. Although I think maybe one should have two separate methods: the class method being called ``merged`` and the instance method called ``merged_with``. I find result = somedict.merged(b, c) somewhat less clear than result = somedict.merged_with(b, c) regards, jwi
On 07/28/2014 10:34 PM, Steven D'Aprano wrote:
What else could it do besides return a new copy of dict1 updated with dict2 contents? It's an unordered container, so it wouldn't append, and the duplicate keys would be resolved based on the order of evaluation. I don't see any problem with that. I also don't know of any other obvious way to combine two dictionaries. The argument against it, may simply be that it's a feature by design, to have dictionaries unique enough so that code which handles them is clearly specific to them. I'm not sure how strong that logic is though.
Well, here is how they currently compare.
They do have quite a lot in common already. The usefulness of different types having the same methods is that external code can be less specific to the objects they handle. Of course, if those like methods act too differently they can be surprising as well. That may be the case if '+' and '+=' are used to update dictionaries, but then again, maybe not. (?)
That's where examples will have an advantage over an initial personal opinion. Not that initial opinions aren't useful at first to express support or non-support. I could have just used +1. ;-) Cheers, Ron
On Tue, Jul 29, 2014 at 06:12:16PM -0500, Ron Adam wrote on the similarity of lists and dicts: [...]
Now strip out the methods which are common to pretty much all objects, in other words just look at the ones which are common to mapping and sequence APIs but not to objects in general:

    {'copy', '__ge__', '__delitem__', '__getitem__', 'pop', '__gt__',
     'clear', '__len__', '__le__', '__contains__', '__lt__',
     '__setitem__', '__iter__'}

And now look a little more closely:

- although dicts and lists both support order comparisons like > and <, you cannot compare a dict to a list in Python 3;
- although dicts and lists both support a pop method, their signatures are different; x.pop() will fail if x is a dict, and x.pop(k, d) will fail if x is a list;
- although both support membership testing "a in x", what is being tested is rather different; if x is a dict, then a must be a key, but the analog of keys for lists is the index, not the value.

So the similarities between list and dict are:

* both have a length
* both are iterable
* both support subscripting operations x[i]
  * although dicts don't support slicing x[i:j:k]
* both support a copy() method
* both support a clear() method

That's not a really big set of operations in common, and they're rather general. The real test is, under what practical circumstances would you expect to freely substitute a list for a dict or vice versa, and what could you do with that object when you received it? For me, the only answer that comes readily to mind is that the dict constructor accepts either another dict or a list of (key,item) pairs.

[...]
I don't think that it is reasonable to treat dicts and lists as having a lot in common. They have a little in common, by virtue of both being containers, but then a string bag and a 40ft steel shipping container are both containers too, so that doesn't imply much similarity :-) It seems to me that outside of utterly generic operations like iteration, conversion to string and so on, lists do not quack like dicts, and dicts do not swim like lists, in any significant sense. -- Steven
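The comparison above can be reproduced roughly like this. The exact contents of the common set depend on the Python version (newer versions add and remove a few names), so no expected output is shown; the helper fails_with_type_error is a hypothetical name used only for this sketch.

```python
# Names shared by list and dict but not by every object.
common = (set(dir(list)) & set(dir(dict))) - set(dir(object))
print(sorted(common))


def fails_with_type_error(func, *args):
    # Hypothetical helper: report whether the call raises TypeError.
    try:
        func(*args)
    except TypeError:
        return True
    return False


# Same method name, different signatures:
dict_pop_no_args = fails_with_type_error({'k': 1}.pop)               # dict.pop needs a key
list_pop_with_default = fails_with_type_error([1, 2].pop, 'k', None)  # list.pop takes at most an index
print(dict_pop_no_args, list_pop_with_default)
```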
On 29 Jul 2014 04:40, "Ryan Hiebert" <ryan@ryanhiebert.com> wrote:
On Jul 28, 2014, at 11:04 AM, Steven D'Aprano <steve@pearwood.info>
wrote:
Note that if update() was changed to accept multiple args, the dict() constructor could similarly be updated. Then:

    x = dict(a)
    x.update(b)
    x.update(c)
    x.update(d)

Would become:

    x = dict(a, b, c, d)

Aside from the general "What's the use case that wouldn't be better served by a larger scale refactoring?" concern, my main issue with that approach would be the asymmetry it would introduce with the set constructor (which disallows multiple arguments to avoid ambiguity in the single argument case).

But really, I'm not seeing a compelling argument for why this needs to be a builtin. If someone is merging dicts often enough to care, they can already write a function to do the dict copy-and-update as a single operation. What makes this more special than the multitude of other three line functions in the world?

Cheers, Nick.
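The kind of small helper being referred to might look like the sketch below. The name merged and the sample data are assumptions, and "later arguments win" is just the update() rule carried over.

```python
def merged(*mappings):
    # Hypothetical helper: copy-and-update as a single operation.
    result = {}
    for mapping in mappings:
        result.update(mapping)   # later arguments win on collisions
    return result


# Made-up inputs; update() also accepts lists of (key, value) pairs.
a = {'x': 1}
b = [('x', 2), ('y', 2)]
c = {'z': 3}
combined = merged(a, b, c)
print(combined)
```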
In addition, dict(A, **B) is not something you easily stumble upon when your goal is "merge two dicts"; nor is it even clear that that's what it is when you read it for the first time. All signs of too-clever hacks in my book. On Mon, Jul 28, 2014 at 8:33 AM, <dw+python-ideas@hmmz.org> wrote:
-- --Guido van Rossum (python.org/~guido)
On 29 July 2014 02:08, Guido van Rossum <guido@python.org> wrote:
I try to convince students to learn and *use* python. If I tell students that to merge 2 dictionaries they have to do dict(A, **B) or {**A, **B}, that seems less clear (not something you "stumble across" as Guido says) than A + B; then we still have to tell them the rules of the operation, as usual for any operation.

It does not have to be "+", could be the "union" operator "|" that is used for sets where

    s.update(t)

is the same as

    s |= t

... and accordingly

    D = A | B | C

Maybe this operator is better as this equivalence is already being used (for sets). Accordingly "union(A,B)" could do a merge operation and return the new dict(). (this then still allows people who want "+" to add the values be made happy in the long run)

-Alexander
On Jul 28, 2014, at 13:59, Alexander Heger <python@2sn.net> wrote:
The difference is that with sets, it (at least conceptually) doesn't matter whether you keep elements from s or t when they collide, because by definition they only collide if they're equal, but with dicts, it very much matters whether you keep items from s or t when their keys collide, because the corresponding values are generally _not_ equal. So this is a false analogy; the same problem raised in the first three replies on this thread still needs to be answered: Is it obvious that the values from b should overwrite the values from a (assuming that's the rule you're suggesting, since you didn't specify; translate to the appropriate question if you want a different rule) in all real-life use cases? If not, is this so useful that the benefits in some uses outweigh the almost certain confusion in others? Without a compelling "yes" to one of those two questions, we're still at square one here; switching from + to | and making an analogy with sets doesn't help.
Wouldn't you expect a top-level union function to take any two iterables and return the union of them as a set (especially given that set.union accepts any iterable for its non-self argument)? A.union(B) seems a lot better than union(A, B). Then again, A.updated(B) or updated(A, B) might be even better, as someone suggested, because the parallel between update and updated (and between e.g. sort and sorted) is not at all problematic.
yes, one does have to deal with collisions and spell out a clear rule: same behaviour as update(). I was less uneasy about the | operator 1) it is already used the same way for collections.Counter [this is a quite strong constraint] 2) in shells it is used as "pipe" implying directionality - order matters yes, you are wondering whether the order should be this or that; you just *define* what it is, same as you do for subtraction. Another way of looking at it is to say that even in sets you take the second, but because they are identical it does not matter ;-) -Alexander
I'll regret jumping in here, but while dict(A, **B) as a way to merge two dicts A and B makes some sense, it has two drawbacks: (1) slow (creates an extra copy of B as it creates the keyword args structure for dict()) and (2) not general enough (doesn't support key types other than str). On Mon, Jul 28, 2014 at 7:59 AM, Steven D'Aprano <steve@pearwood.info> wrote:
-- --Guido van Rossum (python.org/~guido)
On 7/26/2014 7:34 PM, Alexander Heger wrote:
Is there a good reason for not implementing the "+" operator for dict.update()?
As you immediately noticed, this is an incoherent request as stated. A op B should be a new object.
Since "B op= A" is *defined* as resulting in B having the value of "B op A", with the operations possibly being done in-place if B is mutable, we would first have to define addition on dicts.
You have this backwards. Dict addition would have to come first, and there are multiple possible and contextually useful definitions. The idea of choosing any one of them as '+' has been rejected. As indicated, augmented dict addition would follow from the choice of dict addition. It would not necessarily be equivalent to .update. The addition needed to make this true would be asymmetric, like catenation. But unlike sequence catenation, information is erased in that items in the updated dict get subtracted. Conceptually, update is replacement rather than just addition.
My apologies if this has been posted
Multiple dict additions have been proposed and discussed here on python-ideas and probably on python-list. -- Terry Jan Reedy
Dear Terry,
I had set out wanting to have a short form for dict.update(), hence the apparently reversed order. The proposed full addition does the same after first making a shallow copy; the operator interface does define both __iadd__ and __add__.
Yes. As I note, most uses of the "+" operator in Python are not symmetric (commutative).
Yes, not being able to have multiple identical keys is the nature of dictionaries. This does not mean that things should not be done in the best way they can be done. I was considering the set union operator "|" but that is also symmetric and may cause more confusion. Another consideration suggested was the element-wise addition in some form. This is the natural way of doing things for structures of fixed length like arrays, including numpy arrays. And this is being accepted. In contrast, for data structures with variable length, like lists and strings, "addition" is concatenation, and what I would see as the most natural extension for dictionaries hence is to add the keys (not the key values or values to each other), with the common behavior to overwrite existing keys. You do have the choice in which order you write the operation. It would be funny if addition of strings would add their ASCII, char, or unicode values and return the resulting string. Sorry for bringing up, again, the old discussion of how to add dictionaries as part of this. -Alexander On 27 July 2014 11:27, Terry Reedy <tjreedy@udel.edu> wrote:
On Sat, Jul 26, 2014 at 7:34 PM, Alexander Heger <python@2sn.net> wrote:
Here are two threads that had some discussion of this: https://mail.python.org/pipermail/python-ideas/2011-December/013227.html and https://mail.python.org/pipermail/python-ideas/2013-June/021140.html. Seems like a useful feature if there could be a clean way to spell it. Cheers, Nathan
On 28 July 2014 19:58, Nathan Schneider <nathan@cmu.edu> wrote:
Here are two threads that had some discussion of this: https://mail.python.org/pipermail/python-ideas/2011-December/013227.html
This doesn't seem to have a use case, other than "it would be nice".
https://mail.python.org/pipermail/python-ideas/2013-June/021140.html.
This can be handled using ChainMap, if I understand the proposal.
Seems like a useful feature if there could be a clean way to spell it.
I've yet to see any real-world situation when I've wanted "dictionary addition" (with any of the various semantics proposed here) and I've never encountered a situation where using d1.update(d2) was sufficiently awkward that having an operator seemed reasonable. In all honesty, I'd suggest that code which looks bad enough to warrant even considering this feature is probably badly in need of refactoring, at which point the problem will likely go away. Paul
On Jul 28, 2014, at 12:21, Paul Moore <p.f.moore@gmail.com> wrote:
When the underlying dicts and desired combined dict are all going to be used immutably, ChainMap is the perfect answer. (Better than an "updated" function for performance if nothing else.) And usually, when you're looking for a non-mutating combine-dicts operation, that will be what you want. But usually isn't always. If you want a snapshot of the combination of mutable dicts, ChainMap is wrong. If you want to be able to mutate the result, ChainMap is wrong. All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.) And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.
On Mon, Jul 28, 2014 at 10:20 PM, Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> wrote:
When the underlying dicts and desired combined dict are all going to be used immutably, ChainMap is the perfect answer. (Better than an "updated" function for performance if nothing else.) And usually, when you're looking for a non-mutating combine-dicts operation, that will be what you want.
But usually isn't always. If you want a snapshot of the combination of mutable dicts, ChainMap is wrong. If you want to be able to mutate the result, ChainMap is wrong.
In those cases, do dict(ChainMap(...)).
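A short sketch of the difference between the live ChainMap view and the dict(ChainMap(...)) snapshot; the dictionaries and keys are invented for illustration.

```python
from collections import ChainMap

defaults = {"colour": "red", "size": 1}
overrides = {"size": 2}

view = ChainMap(overrides, defaults)   # live view; first mapping wins
snapshot = dict(view)                  # independent copy of the combination

# Mutating a source shows through the view but not the snapshot.
defaults["colour"] = "blue"
view_colour = view["colour"]           # "blue"
snapshot_colour = snapshot["colour"]   # "red"

# Writes to the view land in the first mapping; writes to the
# snapshot touch neither source dict.
view["extra"] = True
snapshot["other"] = True
```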
All that being said, I'm not sure these use cases are sufficiently common to warrant adding an operator--especially since there are other just-as-(un)common use cases it wouldn't solve. (For example, what I often want is a mutable "overlay" ChainMap, which doesn't need to copy the entire potentially-gigantic source dicts. I wouldn't expect an operator for that, even though I need it far more often than I need a mutable snapshot copy.)
And of course, as you say, real-life use cases would be a lot more compelling than theoretical/abstract ones.
For many applications you may not care one way or the other, only for some you do, and only then you need to know the details of operation. My point is to make the dict() data structure easier to use for most users and use cases. Especially novices. This is what adds power to the language. Not that you can do things (Turing machines can) but that you can do them easily and naturally.
On 29 Jul 2014 08:22, "Alexander Heger" <python@2sn.net> wrote:
But why is dict merging into a *new* dict something that needs to be done as a single expression? What's the problem with spelling out "to merge two dicts into a new one, first make a dict, then merge in the other one":

    x = dict(a)
    x.update(b)

That's the real competitor here, not the more cryptic "x = dict(a, **b)". You can even use it as an example of factoring out a helper function:

    def copy_and_update(a, *args):
        x = dict(a)
        for arg in args:
            x.update(arg)
        return x

My personal experience suggests that's a rare enough use case that it's fine to leave it as a trivial helper function that people can write if they need it. The teaching example isn't compelling, since in the teaching case, spelling out the steps is going to be necessary anyway to explain what the function or method call is actually doing. Cheers, Nick.
It is more about having easy operations for people who learn Python for the sake of using it (besides, I teach science students not computer science students). The point is that it could be done in one operation. It seems like asking people to write a = 2 + 3 as

    a = int(2)
    a.add(3)

Turing machine vs modern programming language. It does already work for Counters. The discussion seems to go such that because people can't agree whether the first or second occurrence of keys takes precedence, or what operator to use (already decided by the design of Counter) it is not done at all. To be fair, I am not a core Python programmer and am asking others to implement this - or maybe even agree it would be useful -, maybe pushing too much where just an idea should be floated. -Alexander
On Jul 28, 2014, at 16:45, Alexander Heger <python@2sn.net> wrote:
Well, yeah, that happens a lot. A good idea that can't be turned into a concrete design that fits the language and makes everyone happy doesn't get added, unless it's so ridiculously compelling that nobody can imagine living without it. But that's not necessarily a bad thing--it's why Python is a relatively small and highly consistent language, which I think is a big part of why Python is so readable and teachable. Anyway, I think you're on to something with your idea of adding an updated or union or whatever function/method whose semantics are obvious, and then mapping the operators to that method and update. I can definitely buy that a.updated(b) or union(a, b) favors values from b for exactly the same reason a.update(b) does (although as I mentioned I have other problems with a union function). Meanwhile, if you have use cases for which ChainMap is not appropriate, you might want to write a dict subclass that you can use in your code or in teaching students or whatever, so you can amass some concrete use cases and show how much cleaner it is than the existing alternatives.
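One possible shape for such a dict subclass, as a rough sketch only. The class name UpdatableDict is hypothetical, and the method name updated is taken from the discussion rather than from any existing API.

```python
class UpdatableDict(dict):
    """Hypothetical dict subclass with a non-mutating counterpart to update()."""

    def updated(self, *others):
        # Copy first, then apply each mapping in order, so later
        # arguments win exactly as they do with update().
        result = type(self)(self)
        for other in others:
            result.update(other)
        return result


a = UpdatableDict(a=1, b=1)
c = a.updated({"a": 2, "c": 2})   # a itself is left unchanged
```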
If it helps, if you can get everyone to agree on this, except that none of the core devs wants to do the work, I'll volunteer to write the C code (after I finish my io patch and my abc patch...), so you only have to add the test cases (which are easy Python code; the only hard part is deciding what to test) and the docs.
I often want to call functions with added (or removed, replaced) keywords from the call:

    args0 = dict(...)
    args1 = dict(...)

    def f(**kwargs):
        g(**(args0 | kwargs | args1))

Currently I have to write:

    args0 = dict(...)
    args1 = dict(...)

    def f(**kwargs):
        temp_args = dict(args0)
        temp_args.update(kwargs)
        temp_args.update(args1)
        g(**temp_args)

It would also make the proposed feature to allow multiple kw args expansions in Python 3.5 easy to write by having f(**a, **b, **c) be equivalent to f(**(a | b | c)) -Alexander
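A runnable sketch of this use case with a small helper in place of the proposed operator. The helper name merged, the sample defaults, and the stand-in g are all assumptions made up for the example.

```python
def merged(*dicts):
    """Hypothetical helper: left-to-right merge, later dicts win."""
    result = {}
    for d in dicts:
        result.update(d)
    return result


args0 = {"verbose": False, "retries": 1}   # defaults the caller may override
args1 = {"retries": 3}                     # settings that always win


def g(**kwargs):
    # Stand-in for the real target function; just echoes what it received.
    return kwargs


def f(**kwargs):
    return g(**merged(args0, kwargs, args1))


result = f(verbose=True, retries=9, name="x")
```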
Yes, this (modify) is what I do. In any case, it would still be g(**collections.ChainMap(args1, kwargs, args0)). In either case a new dict is created and passed to g as kwargs. It's not pretty, but it does work. Thanks. So the general case D = A | B | C becomes D = dict(collections.ChainMap(C, B, A)) (someone may suggest dict could have a "chain" constructor class method D = dict.chain(C, B, A))
On 29 July 2014 00:04, Alexander Heger <python@2sn.net> wrote:
This immediately explains the key problem with this proposal. It never even *occurred* to me that anyone would expect C to take priority over A in the operator form. But the ChainMap form makes it immediately clear to me that this is the intent. An operator form will be nothing but a maintenance nightmare and a source of bugs. Thanks for making this obvious :-) -1. Paul
On 29.07.2014 08:22, Paul Moore wrote:
FWIW, one could use an operator which inherently shows a direction: << and >>, for both directions respectively. A = B >> C lets B take precedence, and A = B << C lets C take precedence. regards, jwi p.s.: I’m not entirely sure what to think about my suggestion---I’d like to hear opinions.
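To make the suggestion concrete, here is a rough sketch of how the two operators could behave on a dict subclass. The class name ShiftDict is hypothetical, and this is only one reading of the proposal.

```python
class ShiftDict(dict):
    """Hypothetical dict subclass where the arrow points at the losing side."""

    def __rshift__(self, other):
        # self >> other: self is pushed over other, so self's values win.
        result = type(self)(other)
        result.update(self)
        return result

    def __lshift__(self, other):
        # self << other: other is pushed over self, so other's values win.
        result = type(self)(self)
        result.update(other)
        return result


B = ShiftDict(a=1, b=1)
C = {"a": 2, "c": 2}

b_wins = B >> C
c_wins = B << C
```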
On 29 July 2014 12:56, Jonas Wielicki <j.wielicki@sotecware.net> wrote:
Personally, I don't like it much more than the symmetric-looking operators. I get your point, but it feels like you're just patching over a relatively small aspect of a fundamentally bad idea. But then again as I've already said, I see no need for any of this, the existing functionality seems fine to me. Paul
On Tue, Jul 29, 2014 at 7:56 AM, Jonas Wielicki <j.wielicki@sotecware.net> wrote:
If there is to be an operator devoted specifically to this, I like << and >> as unambiguous choices. Proof: https://mail.python.org/pipermail/python-ideas/2011-December/013232.html :)
I am also partial to the {**A, **B} proposal in http://legacy.python.org/dev/peps/pep-0448/. Cheers, Nathan
On 30.07.2014 00:46, Greg Ewing wrote:
As already noted elsewhere (to continue playing devil’s advocate), it’s not an addition or union anyway. It’s not a union because it is lossy and not commutative, and it’s not something I’d call addition either. One can, however, certainly see it as shifting the elements from dict A over dict B. regards, jwi
On Tue, Jul 29, 2014 at 07:22:34AM +0100, Paul Moore wrote:
Hmmm. Funny you say that, because to me that is a major disadvantage of the ChainMap form: you have to write the arguments in reverse order. Suppose that we want to start with a, then override it with b, then override that with c. Since a is the start (the root, the base), we start with a, something like this:

    d = {}
    d.update(a)
    d.update(b)
    d.update(c)

If update was chainable as it would be in Ruby:

    d.update(a).update(b).update(c)

or even:

    d.update(a, b, c)

This nicely leads us to d = a+b+c (assuming we agree that + meaning merge is the spelling we want). The ChainMap, on the other hand, works backwards from this perspective: the last dict to be merged has to be given first:

    ChainMap(c, b, a)

-- Steven
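The ordering point can be checked directly. In this sketch (the sample dicts are invented), updating in the order a, b, c gives the same mapping as a ChainMap built in the reverse order c, b, a.

```python
from collections import ChainMap

a = {"x": 1, "y": 1}
b = {"y": 2, "z": 2}
c = {"z": 3}

# Forward order: start from a, then let b and c override it in turn.
d = {}
for source in (a, b, c):
    d.update(source)

# ChainMap searches its first argument first, so the highest-priority
# dict has to be listed first: the reverse of the update order.
chained = dict(ChainMap(c, b, a))
```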
participants (18)
- Alexander Heger
- Andrew Barnert
- Antoine Pitrou
- Devin Jeanpierre
- dw+python-ideas@hmmz.org
- Greg Ewing
- Guido van Rossum
- Jonas Wielicki
- Joshua Landau
- Nathan Schneider
- Nick Coghlan
- Paul Moore
- Petr Viktorin
- Ron Adam
- Ryan Hiebert
- Stephen J. Turnbull
- Steven D'Aprano
- Terry Reedy