[Python-ideas] dictionary constructor should not allow duplicate keys

Tue May 3 20:23:00 EDT 2016

On Tue, May 03, 2016 at 02:27:16PM -0700, Ethan Furman wrote:
> On 05/03/2016 01:43 PM, Michael Selik wrote:
> >On Tue, May 3, 2016 at 4:00 PM Ethan Furman wrote:
> 
> >>Which seems irrelevant to your argument: a duplicate key is a duplicate
> >>key whether it's 123 or 'xyz'.
> >
> >If an expression includes an impure function, then the duplication of
> >assignment to that key may have a desirable side-effect.
> 
> I'm willing to say that should be done with an existing dict, not in a 
> literal.

But are you willing to say that the compiler should enforce that 
stylistic judgement?

> >How would you handle an expression that evaluates differently for each
> >call? For example:
> >
> >     {random(): 0, random(): 1}
> 
> Easy:  Don't Do That.  ;)

I see your wink, so I presume you're not actually suggesting that the 
compiler (1) prohibit all function calls in dict displays, or (2) 
hard-code the function name "random" in a blacklist.

So what are you actually suggesting? Michael is raising a good point 
here. If you don't like random as an example, how about:

d = {spam(a): 'a', spam(b): 'BB', spam(c): 'Ccc'}

I'm intentionally not giving you the values of a, b or c, or telling you 
what spam() returns. Now you have the same information available to 
you as the compiler has at compile time. What do you intend to do?
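
To make that concrete, here is a rough sketch. The two lambdas standing 
in for spam() and the values 1, 2, 3 are made up for illustration; 
nothing in this thread says what spam() really does. The first part 
shows that the parser sees nothing but three call expressions as keys; 
the second shows that the same display may or may not contain a 
duplicate, depending on values that exist only at runtime.

```python
import ast

# What the compiler has to work with: three opaque call expressions.
source = "d = {spam(a): 'a', spam(b): 'BB', spam(c): 'Ccc'}"
dict_node = ast.parse(source).body[0].value
key_kinds = [type(key).__name__ for key in dict_node.keys]

def build(spam, a, b, c):
    # The same dict display as above, with spam, a, b, c supplied at runtime.
    return {spam(a): 'a', spam(b): 'BB', spam(c): 'Ccc'}

# Hypothetical spam() that returns its argument: all keys are distinct.
distinct = build(lambda x: x, 1, 2, 3)

# Hypothetical spam() that returns x % 2: spam(1) and spam(3) collide.
collide = build(lambda x: x % 2, 1, 2, 3)

print(key_kinds)
print(distinct)
print(collide)
```

Any rule about duplicate keys has to say what happens in the second 
case, which cannot be told apart from the first without running spam().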

It's one thing to say "duplicate keys should be prohibited", and another 
to come up with a detailed explanation of what precisely should happen.

> >Let me flip the original request: I'd like to hear stronger arguments for
> > change, please. I'd be particularly interested in hearing how often 
> > Pylint has caught this mistake.
> 
> Well, when this happened to me I spent a loooonnnnnngggg time figuring 
> out what the problem is.

Okay, but how often does this happen? Daily? Weekly? Once per career?

What's a loooonnnnnngggg time? Twenty minutes? Twenty man-weeks?

> One just doesn't expect duplicate keys to not raise:
> 
> --> dict(one=1, one='uno')
>   File "<stdin>", line 1
> SyntaxError: keyword argument repeated

Huh, I had completely forgotten that duplicate keyword arguments raise 
a syntax error. I thought you got a runtime TypeError. 

There's a difference though. Keyword argument *names* must be 
identifiers, not expressions, and duplicates can be recognised by the 
compiler at compile-time:

py> def spam(**kw): pass
...
py> spam(eggs=print("side-effect"))
side-effect
py> spam(eggs=print("side-effect"), eggs=print('another side-effect'))
  File "<stdin>", line 1
SyntaxError: keyword argument repeated
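
One way to check that this really is a compile-time error is to hand 
the source to compile() without ever executing it. This is only a 
sketch: the "<example>" filename is arbitrary, and the exact wording of 
the message may differ between Python versions.

```python
source = "spam(eggs=print('side-effect'), eggs=print('another side-effect'))"

message = None
try:
    # Only compiles the expression; spam() and print() are never called.
    compile(source, "<example>", "eval")
except SyntaxError as err:
    message = err.msg

print(message)
```

No side-effect is printed, because the call is rejected before any of 
it runs.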

Duplicate keys in a dict display are not detected at compile time, and 
in general can't be:

py> d = {'eggs': print("side-effect"), 
...      'eggs': print('another side-effect')}
side-effect
another side-effect
py> d
{'eggs': None}
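
The same behaviour can be seen without the interactive prompt. In the 
sketch below, record() is a hypothetical helper that logs each call so 
the evaluation order is visible. The display compiles without complaint, 
both value expressions run, and the last value wins.

```python
calls = []

def record(label):
    # Hypothetical helper: remember that we were called, then return the label.
    calls.append(label)
    return label

source = "{'eggs': record('first'), 'eggs': record('second')}"

# Compiling succeeds even though the two keys are identical constants.
code = compile(source, "<example>", "eval")
d = eval(code, {"record": record})

print(calls)
print(d)
```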

If you think of the dict constructor as something like:

for key, item in initial_values:
    self[key] = item

then the current behaviour is perfectly valid, and the advice is, if 
you don't want duplicate keys, don't use them in the first place.

If you think of it as:

for key, item in initial_values:
    if key in self:
        raise TypeError('duplicate key')
    else:
        self[key] = item

then you have to deal with the fact that you might only notice a 
duplicate after mutating your dict, which may include side-effects.
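
Here is a minimal sketch of that second model. StrictDict, noisy() and 
pairs() are hypothetical names invented for this example, not an 
existing or proposed API. The point is that every key expression up to 
and including the duplicate has already been evaluated, and the dict 
already partly filled, by the time the error is raised.

```python
class StrictDict(dict):
    """Hypothetical dict that refuses duplicate keys during construction."""

    def __init__(self, initial_values):
        super().__init__()
        for key, item in initial_values:
            if key in self:
                raise TypeError('duplicate key')
            self[key] = item

log = []

def noisy(key):
    # Stand-in for a key expression with a side-effect.
    log.append(key)
    return key

def pairs():
    # Each key expression is evaluated lazily, in order.
    yield noisy('eggs'), 1
    yield noisy('spam'), 2
    yield noisy('eggs'), 3

try:
    StrictDict(pairs())
except TypeError:
    raised = True
else:
    raised = False

print(raised)
print(log)
```

All three side-effects have happened before the duplicate is reported, 
so raising an error does not undo them.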

-- 
Steve