[Python-ideas] dictionary constructor should not allow duplicate keys
Terry Reedy
tjreedy at udel.edu
Wed May 4 21:36:05 EDT 2016
On 5/4/2016 12:03 AM, Nick Coghlan wrote:
> On 4 May 2016 at 11:40, Ethan Furman <ethan at stoneleaf.us> wrote:
>> On 05/03/2016 05:23 PM, Steven D'Aprano wrote:
>>> I'm intentionally not giving you the values of a, b or c, or telling
>>> you what spam() returns. Now you have the same information available
>>> to you as the compiler has at compile time. What do you intend to do?
>>
>> Since the dict created by that dict display happens at run time, I am
>> suggesting that during the process of creating that dict that any keys,
>> however generated or retrieved, that are duplicates of keys already in the
>> dict, cause an appropriate error (to be determined).
>
> I was curious as to whether or not it was technically feasible to
> implement this "duplicate keys" check solely for dict displays in
> CPython without impacting other dict use cases, and it turns out it
> should be.
>
> The key point is that BUILD_MAP already has its own PyDict_SetItem()
> loop in ceval.c (iterating over stack entries), and hence doesn't rely
> on the "dict_common_update" code shared by dict.update and the dict
> constructor(s).
Changing only BUILD_MAP would invalidate current code equivalences and
currently sensible optimizations. A toy example:
>>> def f(): return 'a'  # assumed for this example: f() computes a key equal to 'a'
...
>>> d1 = {'a': 'a1'}
>>> d2 = {f(): 'a2'}
>>> d1.update(d2)
>>> d1
{'a': 'a2'}
Sensible and comprehensible code transformation rules are important.
Currently, the following rather trivial optimization of the above works.
>>> d1 = {'a': 'a1', f(): 'a2'}
>>> d1
{'a': 'a2'}
The special rule for dict displays would invalidate this.
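To make the broken equivalence concrete, here is a minimal sketch. It assumes f() returns 'a'. The strict_display function is a hypothetical stand-in for a duplicate-rejecting dict display, not anything in CPython, and ValueError is only a placeholder, since the proposal left the error type to be determined.

```python
def f():
    # Assumption for illustration: f() computes a key equal to 'a'.
    return 'a'

# Two-step form: build one dict, then update it from another.
d1 = {'a': 'a1'}
d1.update({f(): 'a2'})

# Merged form: a single dict display; today the later value wins.
merged = {'a': 'a1', f(): 'a2'}
assert d1 == merged == {'a': 'a2'}


def strict_display(*pairs):
    """Hypothetical model of a dict display that rejects duplicate keys."""
    result = {}
    for key, value in pairs:
        if key in result:
            # Placeholder error type; the proposal did not settle on one.
            raise ValueError(f"duplicate key: {key!r}")
        result[key] = value
    return result


# Under the strict rule the merged form fails where the two-step form works.
try:
    strict_display(('a', 'a1'), (f(), 'a2'))
except ValueError as exc:
    outcome = str(exc)
else:
    outcome = 'no error'
```

With the strict rule, merging the two statements into one display turns working code into an error, so the transformation would no longer be safe to apply mechanically.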
In my post yesterday in response to Luigi (where I began 'The
thread...'), I gave 4 other equivalent ways to initialize a dict using a
Python loop (including a dict comprehension). Using a dict display
amounts to un-rolling any of the loops and replacing the Python loop
with the C loop buried in ceval. Changing the operation of just that
loop would break the current equivalence.
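As a rough illustration of that equivalence (these are not necessarily the same four forms given in the earlier post, and the data is made up), each of the following builds the same dict from the same key/value pairs, with a later duplicate key silently overwriting the earlier one:

```python
# Illustrative data with a repeated key 'a'.
pairs = [('a', 'a1'), ('b', 'b1'), ('a', 'a2')]

# 1. Explicit loop with item assignment.
d_loop = {}
for k, v in pairs:
    d_loop[k] = v

# 2. Explicit loop with update().
d_update = {}
for k, v in pairs:
    d_update.update({k: v})

# 3. Dict comprehension.
d_comp = {k: v for k, v in pairs}

# 4. dict() constructor on an iterable of pairs.
d_ctor = dict(pairs)

# Unrolled: the same pairs written out by hand as a dict display.
d_display = {'a': 'a1', 'b': 'b1', 'a': 'a2'}

assert d_loop == d_update == d_comp == d_ctor == d_display
```

If only the display raised on the repeated key, the last form would stop being interchangeable with the others.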
I suspect that the proposed change would introduce more bugs than it
exposes.
--
Terry Jan Reedy