[Python-ideas] Joining dicts again

Sun Feb 23 14:31:27 CET 2014

@Steven D'Aprano

> So we have *at least* four different ways to merge dictionaries a and b:
> 
> # 1: a wins
> c = b.copy()
> c.update(a)
> 
> 
> # 2: b wins
> c = a.copy()
> c.update(b)
> 
> 
> # 3: choose a winner according to the `or` operator
> c = a.copy()
> for key, value in b.items():
>     if key in c:
>         c[key] = c[key] or value
>     else:
>         c[key] = value
> 
> 
> # 4: keep both, in a list of 1 or 2 items
> c = {key:[value] for key, value in a.items()}
> for key, value in b.items():
>     if key in c and value != c[key][0]:
>         c[key].append(value)
>     else:
>         c[key] = [value]
> 
> 
> The first three are special cases of a more general case, where 
> you have a "decision function" that takes two values (one from dict a 
> and the other from dict b) and decides which one to keep. Case 1 ("a 
> always wins") would use `lambda x,y: x`, case 2 ("b wins") would use 
> `lambda x,y: y` and case 3 would use operator.or_.
> 
> The question is, why should any one of these be picked out as so 
> obviously more useful than the others as to deserve being a dict method 
> or operator support?
> 
> Steven
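
For reference, the "decision function" generalization quoted above can be sketched as one small helper. The name `merge_with` and its signature are made up for illustration and are not an existing dict API. One caveat: `operator.or_` is the bitwise `|` operator, so this sketch assumes case 3 is meant as a lambda built on the boolean `or`.

```python
def merge_with(decide, a, b):
    """Merge dicts a and b; decide(x, y) resolves keys present in both.

    Hypothetical helper for illustration, not an existing dict API.
    """
    c = a.copy()
    for key, value in b.items():
        # x comes from a, y comes from b, as in the quoted description.
        c[key] = decide(c[key], value) if key in c else value
    return c


a = {'x': 1, 'y': None}
b = {'y': 2, 'z': 3}

a_wins = merge_with(lambda x, y: x, a, b)        # case 1: a wins
b_wins = merge_with(lambda x, y: y, a, b)        # case 2: b wins
or_wins = merge_with(lambda x, y: x or y, a, b)  # case 3: `or` decides
```

Even in this form every call is a single expression, which is the property I am after.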

None of the solutions you provided is a one-liner. Each requires at least two lines of code and is an imperative-style code block rather than a simple expression.

I would really like to have a simple dict joining _expression_ that can be inserted anywhere I need it.

fun(dict_arg=(dict_a | dict_b))
fun(**(dict_a | dict_b))

I personally have no problem with any of your code samples being promoted to an operator.

My proposal is philosophically the same as dropping the "print" _statement_ and replacing it with the "print" _function_. It simply merges more nicely with the rest of Python code.

Really, I would even be happy if we had at least a dict _method_ that returns the updated dict:

{'a':1}.update({'b':2}) # proposed: would return {'a':1, 'b':2}
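
The built-in `dict.update` mutates its receiver and returns None, so the behaviour wished for here has to be emulated for now. A minimal sketch of such a helper follows. The names `updated` and `fun` are hypothetical stand-ins, and the last dict wins on conflicts, as with `update`.

```python
def updated(base, *others):
    """Return a new dict: base updated with each of others in turn.

    Hypothetical helper; the input dicts are left unmodified.
    """
    result = dict(base)
    for other in others:
        result.update(other)
    return result


def fun(**kwargs):
    # Stand-in for the function of interest; it returns what it received.
    return kwargs


dict_a = {'a': 1}
dict_b = {'b': 2}
result = fun(**updated(dict_a, dict_b))
```

This is usable as an expression, but it is still a free function that everyone has to define or import, which is why I would prefer a method or an operator.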

@Mathias Panzenböck

> I never had a case where this kind of conflict resolution made sense. Can
> you show us an example?

Consider this example once again:

fun(**(dict_a | dict_b))

Function default parameters are often null-like values. I don't say that they always are, but it is common in programming practice. Such null values usually evaluate to False in Python when converted to a boolean.

Now consider the following:

Let's have a dict with some function's default parameters:
{'a':None, 'b':[]}

Now let's have two separate code blocks that fill those arguments with some computed values.

def funA(p):
    p['a'] = 1
    return p

def funB(p):
    p['b'] = [1, 2, 3]
    return p

Again, this pattern is not uncommon in programming practice. We often have code blocks that try to fill in as many parameters as possible, but not all of them.

Finally, we want to merge the dicts returned by these functions and provide them to the function of our interest.

Basically, we want something like this:

dict_default = {'a':None, 'b':[]}
dict_a = funA(dict_default.copy())
dict_b = funB(dict_default.copy())
dict_param = merge_using_or_resolution(dict_a, dict_b)
fun(**dict_param)

Quite a lot of code, as you see. It also involves a lot of copying. The function 'merge_using_or_resolution' also needs at least one dict copy.
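
To make the example concrete, here is one way `merge_using_or_resolution` could look. This is only a sketch, under the assumption that "or resolution" means applying Python's `or` to the two values of a shared key.

```python
def merge_using_or_resolution(a, b):
    """Merge a and b; for shared keys keep `a[key] or b[key]`.

    Sketch only: assumes "or resolution" means Python's boolean `or`.
    """
    c = a.copy()  # the one dict copy this approach needs
    for key, value in b.items():
        c[key] = (c[key] or value) if key in c else value
    return c


def funA(p):
    p['a'] = 1
    return p


def funB(p):
    p['b'] = [1, 2, 3]
    return p


dict_default = {'a': None, 'b': []}
dict_a = funA(dict_default.copy())  # {'a': 1, 'b': []}
dict_b = funB(dict_default.copy())  # {'a': None, 'b': [1, 2, 3]}
dict_param = merge_using_or_resolution(dict_a, dict_b)
```

Each falsy default is replaced by the computed value from whichever dict supplies one, so `dict_param` ends up with both computed values.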

If we had a dict joining operator, we could simply write:

dict_default = {'a':None, 'b':[]}
fun(**(funA(dict_default) | funB(dict_default)))

No copies, no temporary variables, the functions funA and funB would also become one-liners, no namespace and memory pollution. Also, this code could be better optimized, unlike the previous example.
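
The proposed operator can be approximated with a dict subclass to show how the call site and the one-liner versions of funA and funB might read. The class name `OrDict` and the stand-in `fun` are hypothetical, and the `or` resolution on shared keys is the assumption carried over from the example above. Unlike a built-in operator, this sketch still builds new dicts internally.

```python
class OrDict(dict):
    """dict subclass whose `|` merges with `or` resolution (illustration only)."""

    def __or__(self, other):
        result = OrDict(self)
        for key, value in other.items():
            result[key] = (result[key] or value) if key in result else value
        return result


def funA(p):
    return p | {'a': 1}


def funB(p):
    return p | {'b': [1, 2, 3]}


def fun(a, b):
    # Stand-in for the function of interest; it returns its arguments.
    return a, b


dict_default = OrDict({'a': None, 'b': []})
result = fun(**(funA(dict_default) | funB(dict_default)))
```

Here funA and funB no longer mutate their argument, so the same `dict_default` can be passed to both without an explicit copy.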

As you see, there are many benefits of having a dict-joining operator.

I want to stress this once again: my examples are quite abstract, but the code patterns are actually quite common.

Thanks
haael