[Python-ideas] dictionary constructor should not allow duplicate keys

Luigi Semenzato luigi at semenzato.com
Tue May 3 20:46:33 EDT 2016


On Tue, May 3, 2016 at 4:51 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> Hi Luigi,
> On Mon, May 02, 2016 at 02:36:35PM -0700, Luigi Semenzato wrote:
> [...]
>> lives_in = { 'lion': ['Africa', 'America'],
>>              'parrot': ['Europe'],
>>              #... 100+ more rows here
>>              'lion': ['Europe'],
>>              #... 100+ more rows here
>>            }
>> The above constructor overwrites the first 'lion' entry silently,
>> often causing unexpected behavior.
> [...]
>> For context, someone ran into this problem in my team at Google (we
>> fixed it using pylint).  I haven't seen any valid reason (in the bug
>> or elsewhere) in favor of these constructor semantics.  From the
>> discussions I have seen, it seems just an oversight in the
>> implementation/specification of dictionary literals.  I'd be happy to
>> hear stronger reasoning in favor of the status quo.
> As far as I can see, you haven't specified in *detail* what change you
> wish to propose. It is difficult for me to judge your proposal when I
> don't know precisely what you are proposing.

Yes, and I apologize, because on further reflection I don't even know
what is possible within the model of the Python interpreter.

> Should duplicate keys be a SyntaxError at compile time, or a TypeError
> at runtime? Or something else?

Is there such a thing as a SyntaxWarning?  From my perspective it
would be fine to make it a SyntaxError, but I am not sure it would be
overall a good choice for legacy code (i.e. as an error it might break
old code, and I don't know how many other things a new language
specification is going to break).

It could also be a run-time error, but it might be nicer to detect it
earlier.  Maybe both.
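
For reference, SyntaxWarning is a built-in warning category, so a warning-first approach is at least expressible. The sketch below only illustrates how a warning leaves legacy code running while letting strict users promote it to an error. `report_duplicate_key` is a hypothetical stand-in for whatever the compiler would emit, not an existing API.

```python
import warnings

# SyntaxWarning is a built-in category and a subclass of Warning.
assert issubclass(SyntaxWarning, Warning)


def report_duplicate_key(key):
    # Hypothetical helper: stands in for a diagnostic the compiler might emit.
    warnings.warn(
        f"duplicate key {key!r} in dict display", SyntaxWarning, stacklevel=2
    )


# Default policy: only a warning, so legacy code keeps running.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    report_duplicate_key("lion")

# Opt-in strictness: promote the same warning to an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error", SyntaxWarning)
    try:
        report_duplicate_key("lion")
    except SyntaxWarning as exc:
        promoted = str(exc)
```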

> What counts as "duplicate keys"?  I presume that you mean that two keys
> count as duplicate if they hash to the same value, and are equal. But
> you keep mentioning "literals" -- does this mean you care more about
> whether they look the same rather than are the same?

Correct.  The errors that I am guessing matter the most are those for
which folks copy-paste a key-value pair, where the key is a literal
string, intending to change the key, and then forget to change it.

> # duplicate literals, forbidden
> d = {100: 1, 100: 2}

Correct.  Possibly caught during parsing.

> # duplicate non-literals, allowed
> d = {100: 1, len("ab")*50: 2}

For sure allowed during parsing, possibly caught at run-time.
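
To make the status quo concrete, here is what both of the displays quoted above currently evaluate to. Nothing here is proposed behavior; it only shows that the later value silently replaces the earlier one in each case.

```python
# Duplicate literal keys: the second value replaces the first.
first = {100: 1, 100: 2}

# Duplicate key produced by an expression: same outcome at run time.
second = {100: 1, len("ab") * 50: 2}

print(first)   # {100: 2}
print(second)  # {100: 2}
```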

> You keep mentioning "dictionary literal", but there actually is no
> such thing in Python. I think you mean a dict display.

Yes, sorry, bad use of the term.  I meant the curly-brace
constructor---and only that.

> (Don't worry, I
> make the same mistake.) But the point is, a "dict literal" (display) can
> contain keys which are not themselves literals, as above. Those keys can
> have arbitrarily complex semantics, including side-effects. What do you
> expect to happen?

I'd mainly like to catch the "obvious" duplicates, just like the GNU C
compiler catches the "obvious" divisions by zero.  And by that I mean
occurrences of duplicate string literals as keys.  For instance:

At parse time:

{"foo": 333, "foo": 444}  # forbidden---must catch
{"foo": 333, "f" + "oo": 444}  # OK to miss but also OK to catch
{"foo": 333, function_returning_foo(): 444}  # not caught (i.e. no miracles expected)
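
The three parse-time cases above can be approximated today with the standard `ast` module, roughly as a linter would do it. This is a minimal sketch, not a proposed implementation: `find_duplicate_literal_keys` is a hypothetical name, and comparing keys by type and value is an assumption meant to match "looks the same" rather than "hashes the same".

```python
import ast


def find_duplicate_literal_keys(source):
    """Return (lineno, key) for each repeated constant key in a dict display."""
    duplicates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        seen = set()
        for key in node.keys:
            # key is None for ``**mapping`` unpacking; anything that is not a
            # plain constant (calls, "f" + "oo", names) is deliberately skipped.
            if not isinstance(key, ast.Constant):
                continue
            # Assumption: compare by (type, value), so 1 and 1.0 are not
            # reported even though they would collide at run time.
            marker = (type(key.value), key.value)
            if marker in seen:
                duplicates.append((key.lineno, key.value))
            seen.add(marker)
    return duplicates


caught = find_duplicate_literal_keys('{"foo": 333, "foo": 444}')
missed_concat = find_duplicate_literal_keys('{"foo": 333, "f" + "oo": 444}')
missed_call = find_duplicate_literal_keys(
    '{"foo": 333, function_returning_foo(): 444}'
)
```

The source is only parsed, never executed, so `function_returning_foo` does not need to exist. This sketch takes the "OK to miss" option for the concatenated key.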

At run time, I still slightly suspect that it may be more useful to
prohibit duplicate keys in the same constructor than it is to allow
them, but I don't have a clear opinion, because of both legacy code
and other possible semantic issues.
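
A run-time check would see every key, including computed ones. As a rough illustration of those semantics, the sketch below builds a dict from key-value pairs and refuses duplicates. `strict_dict` is a hypothetical helper, and raising ValueError is an assumption; the thread leaves the exception type open.

```python
def strict_dict(pairs):
    """Build a dict from (key, value) pairs, rejecting repeated keys."""
    result = {}
    for key, value in pairs:
        # Uses normal dict semantics: equal hash and equal value means duplicate.
        if key in result:
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


ok = strict_dict([("foo", 333), ("bar", 444)])

try:
    # The computed key "f" + "oo" is caught here, unlike at parse time.
    strict_dict([("foo", 333), ("f" + "oo", 444)])
except ValueError as exc:
    error_message = str(exc)
```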


> --
> Steve