[Python-ideas] An identity dict

Raymond Hettinger raymond.hettinger at gmail.com
Thu Jun 3 21:43:10 CEST 2010


>> P.S.  ISTM that including subtly different variations of a data type
>> does more harm than good.   Understanding how to use an
>> identity dictionary correctly requires understanding the nuances
>> of object identity, how to keep the object alive outside the dictionary
>> (even if the dictionary keeps it alive, a user still needs an external reference
>> to be able to do a lookup), and knowing that the version proposed for
>> CPython has dramatically worse speed/space performance than
>> a regular dictionary.  The very existence of an identity dictionary in
>> collections is likely to distract a user away from a better solution using:
>> d[id(obj)]=value.
> 
>>> 
>>> Essentially these are places where defined equality should not matter. 
>>> 
>> Essentially, these are cases where an identity dictionary isn't 
>> necessary and would in fact be worse performance-wise 
>> in every implementation except for PyPy which can compile 
>> the pure python code for indentity_dict.py. 
> 
> 
> Using id() is a workaround but again, a potentially expensive one for platforms with moving GCs. Every object calling for an id() forces additional bookkeeping on their ends. This is only a better solution for CPython.

To be clear, most of the examples given so far work with regular dictionaries even without using id().  The exception was something system-specific, such as the pickling mechanism.
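
As a minimal sketch of why a regular dictionary is usually enough: a class that defines neither __eq__ nor __hash__ already compares and hashes by identity, so a plain dict keeps two look-alike instances apart. The Node class and its label attribute are hypothetical names made up for this illustration.

```python
class Node:
    """Hypothetical class that does not override __eq__ or __hash__."""

    def __init__(self, label):
        self.label = label


a = Node("x")
b = Node("x")  # same contents as a, but a separate object

# With the default object.__eq__ / object.__hash__, a regular dict
# keys these two instances by identity, so neither overwrites the other.
seen = {}
seen[a] = "first"
seen[b] = "second"

print(len(seen))         # two separate entries
print(seen[a], seen[b])  # each instance finds its own value
```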

So what we're talking about is the comparatively rare case when an object has an __eq__ method and you want that method to be ignored.  For example, you have two tuples (3,5) and (3,5) which are equal but happen to be distinct in memory and your needs are:
* to treat the two equal objects as being distinct for some purpose
* to run faster than id() runs on non-CPython implementations
* to not care if the code is dog slow on CPython (i.e. slower than if you had used id())
* to not care that the two tuples being distinct in memory is not a guaranteed behavior across implementations (i.e. any implementation is free to make all equal tuples share the same id via interning)
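
The tuple case above can be sketched roughly as follows. This is an illustration under assumptions, not a recommendation: the tuples are built with tuple([...]) at run time so that CPython does not fold two identical literals into one constant, and even then their distinctness is not guaranteed on other implementations. Storing the object next to its value is one way to keep it alive so its id() cannot be reused.

```python
# Build the tuples at run time; two identical literals in the same
# code object may be merged into a single constant by the compiler.
a = tuple([3, 5])
b = tuple([3, 5])

# A regular dict uses __eq__/__hash__, so equal tuples share one slot.
regular = {}
regular[a] = "first"
regular[b] = "second"  # replaces the entry written through a

# Keying on id() ignores __eq__.  The object is stored alongside the
# value so it stays alive and its id() cannot be recycled.
by_id = {}
by_id[id(a)] = (a, "first")
by_id[id(b)] = (b, "second")

print(a == b, a is b)
print(len(regular), len(by_id))
```

Note that a lookup in by_id still needs a reference to the original object, since an equal but separately constructed tuple has a different id().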

FWIW, I spoke with Jim Baker about this yesterday and he believes that Jython has no need for an identity dict.


Raymond


P.S. If Antoine's keyfuncdict proposal gains traction, it would be possible for other implementations to create a fast special case for key=id. 
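
For readers who have not followed that thread, here is a rough pure-Python sketch of what a key-function dictionary might look like. The class name KeyFuncDict, its constructor signature, and its methods are assumptions made up for this illustration; they are not taken from Antoine's actual proposal.

```python
class KeyFuncDict:
    """Hypothetical mapping that stores entries under key(k) instead of k."""

    def __init__(self, key):
        self._key = key
        self._data = {}  # maps key(k) -> (original key, value)

    def __setitem__(self, k, value):
        # Keeping the original key in the entry also keeps it alive,
        # which matters when key=id.
        self._data[self._key(k)] = (k, value)

    def __getitem__(self, k):
        return self._data[self._key(k)][1]

    def __contains__(self, k):
        return self._key(k) in self._data

    def __len__(self):
        return len(self._data)


# key=id gives identity-dict behaviour: equal lists stay separate.
x = [1, 2]
y = [1, 2]
d = KeyFuncDict(key=id)
d[x] = "x-value"
d[y] = "y-value"
print(len(d), d[x], d[y])

# Another key function, e.g. case-insensitive string keys.
ci = KeyFuncDict(key=str.lower)
ci["Spam"] = 1
print("SPAM" in ci)
```

An implementation could then recognise key=id and substitute whatever identity-based lookup is cheapest for it.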

