<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    On 04/05/2016 00:51, Steven D'Aprano wrote:<br>

    <blockquote cite="mid:20160503235120.GA12028@ando.pearwood.info"

      type="cite">

      <pre wrap="">Hi Luigi,

On Mon, May 02, 2016 at 02:36:35PM -0700, Luigi Semenzato wrote:

[...]

</pre>

      <blockquote type="cite">

        <pre wrap="">lives_in = { 'lion': ['Africa', 'America'],

             'parrot': ['Europe'],

             #... 100+ more rows here

             'lion': ['Europe'],

             #... 100+ more rows here

           }

The above constructor overwrites the first 'lion' entry silently,

often causing unexpected behavior.

</pre>

      </blockquote>

      <pre wrap="">[...]

</pre>

      <blockquote type="cite">

        <pre wrap="">For context, someone ran into this problem in my team at Google (we

fixed it using pylint).  I haven't seen any valid reason (in the bug

or elsewhere) in favor of these constructor semantics.  From the

discussions I have seen, it seems just an oversight in the

implementation/specification of dictionary literals.  I'd be happy to

hear stronger reasoning in favor of the status quo.

</pre>

      </blockquote>

      <pre wrap="">

As far as I can see, you haven't specified in *detail* what change you 

wish to propose. It is difficult for me to judge your proposal when I 

don't know precisely what you are proposing.

Should duplicate keys be a SyntaxError at compile time, or a TypeError 

at runtime? Or something else?

What counts as "duplicate keys"?  I presume that you mean that two keys 

count as duplicate if they hash to the same value, and are equal. But 

you keep mentioning "literals" -- does this mean you care more about 

whether they look the same rather than are the same?

# duplicate literals, forbidden

d = {100: 1, 100: 2}

# duplicate non-literals, allowed

d = {100: 1, len("ab")*50: 2}

You keep mentioning "dictionary literal", but there actually is no 

such thing in Python. I think you mean a dict display. (Don't worry, I 

make the same mistake.) But the point is, a "dict literal" (display) can 

contain keys which are not themselves literals, as above. Those keys can 

have arbitrarily complex semantics, including side-effects. What do you 

expect to happen?

</pre>

    </blockquote>

    My initial reaction to the OP was negative, as in most contexts

    where keys are added to dictionaries, repeated keys silently

    overwrite, and consistency is a virtue.<br>

    However, the original use case (a long dict literal - (sorry,

    transcribe that into 'dict display' if you wish; to me 'dict

    literal' is an evocative and comprehensible term and I will continue

    to use it, not meaning to offend)) is a completely plausible one,

    and it now seems to me that detecting duplicates is a case where

    'practicality beats purity'.  I agree with everything in Luigi's

    post of 30-May 2016 22:29 (my local time, sorry).  I would just add

    that as someone who has used a linter (pylint) occasionally - I am

    trying to discipline myself to use it regularly - even if you are

    aware of it, there is still a barrier to its use: linters tend to

    produce a huge amount of mostly useless guff which you have to

    search to find the few nuggets that you really need.<br>

    <br>

    Steven, I am sure that this is not your intention, but it feels as

    if your requests for clarification are nitpicking and attempts to

    throw dust in our eyes, and avoiding a serious attempt to address

    the OP.<br>

    But let me (presuming,if he will forgive me, to speak for the OP)

    attempt to answer by making a concrete proposal, as a starting point

    for discussion:<br>

    <br>

    <b>In a dictionary literal, it should be a syntax error to have two

      keys which are literal int or literal basestring values and which

      are equal.<br>

      <br>

    </b>(I say basestring, meaning that    { 'lion' : 'x', u'lion' : 'y'

    }    would be an error.  So of course would    { 18L : 'x', 0x12 :

    'y' }    etc.)<br>

    <br>

    From this starting point, bikeshedding can begin:<br>

        (1) Is there any other kind of literal (i.e. other than int or

    string) which should be included?<br>

              (Float and complex?  Decimal??  What have I missed?)<br>

        (2) Should constant int/string expressions which are folded to

    constants at compile time also be included<br>

                    i.e. would    { 2 : 'x', 1+1 : 'y }    also be an

    error?<br>

               (Suggestion: This is an implementation detail,

    undefined.  It's irrelevant to the practical case.)<br>

        (3) Should a runtime error be raised instead of a SyntaxError ?<br>

             (I can't imagine why, but if it turns out to be easier to

    implement, fine.)<br>

        (4) Should a warning be raised instead of an error?<br>

        (5) Should the action be different if not just the keys, but the

    values also are constant and equal (i.e. a no-effect repeat),<br>

              e.g. a warning instead of an error?<br>

             ( I suggest no.  It's still likely to be unintentional,

    i.e. a mistake that the programmer would want to know about.)<br>

        (6) Are there any other changes to the proposal which would

    simplify implementation, while still addressing the original use

    case?<br>

        (7) If the proposal is accepted, should it appear complete in

    the next Python release, or should there be a more gradual adoption

    process?<br>

        (8) Can/should  anything be done to mitigate breaking automatic

    code generators which do generate duplicate literal keys?<br>

    <br>

    Later:<br>

    While I was composing the above, 1 more post arrived from Luigi and

    from Steven.<br>

    I must reply to two points from Steven:<br>

    <br>

    (1) "When you have significant amount of data in a dict (or any

    other data structure, such as a list, tree, whatever), the

    programmer has to take responsibility for the data validation. Not

    the compiler. Out of all the possible errors, why is "duplicate key"

    so special? Your colleague could have caused unexpected behaviour in

    many ways"<br>

    <br>

    No, no, no.  Computers detect our errors and we would be helpless if

    they didn't.  If this argument were taken to its logical conclusion,

    you would say that it is the programmer's responsibility to ensure

    that an entire (10000-line missile guidance?) program is

    syntactically correct, and all the compiler need do is either say

    "OK" or "Syntax Error" (or worse, try to run a syntactically

    incorrect program).  Yes, Luigi's colleague could have made many

    mistakes, but surely it is a good thing for as many of them as is

    reasonably possible to be detected by the computer?  Then we just

    have to decide what "reasonably possible" is.<br>

    <br>

    (2) "So unless you have a good answer to the question "Why are

    duplicate keys so special that the dict constructor has to guard

    against them, when it doesn't guard against all the other failures

    we have to check for?", I think the status quo should stand.

    There is one obvious answer:

    Duplicate keys are special because, unlike the other errors, the

    dict constructor CAN guard against them.

    That would be a reasonable answer. But I'm not sure if it is special

    <b class="moz-txt-star"><span class="moz-txt-tag">*</span>enough<span

        class="moz-txt-tag">*</span></b> to justify violating the Zen of

    Python. That may be a matter of opinion and taste."<br>

    <pre wrap="">Sure, it is a matter of opinion.  I ask: In the OP, what is the probability that (1) the programmer deliberately included a duplicate literal key (2) the programmer made a mistake?

I concede that the answer may be different for automatic code generators (although if I wrote one that generated duplicate literal keys, I wouldn't be proud of it).

</pre>

    Best wishes<br>

    Rob Cliffe<br>

  </body>

</html>