keying by identity in dict and set
Peter Otten
__peter__ at web.de
Sun Oct 20 11:14:02 EDT 2019
Steve White wrote:
> Hi Chris,
>
> Yes, I am aware of the hash of small integers. But I am not keying
> with small integers here: I am keying with id() values of class
> instances.
The id() values /are/ smallish integers though, at least as far as hash() is concerned.
(I would guess that this is baked into the CPython source, but did not
actually check.)
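A quick way to check this without reading the C source is the sketch below. It relies on the documented rule that a non-negative int hashes to its value reduced modulo sys.hash_info.modulus (2**61 - 1 on typical 64-bit CPython builds). The class name Thing and the variables are made up for illustration, and other implementations may use different id() values.

```python
import sys


class Thing:
    """Hypothetical placeholder class; any plain instance will do."""


obj = Thing()
modulus = sys.hash_info.modulus  # prime used to reduce numeric hashes

# A non-negative int hashes to its own value modulo sys.hash_info.modulus,
# so hash(id(obj)) == id(obj) holds whenever id(obj) is below that modulus.
id_hash_matches_rule = hash(id(obj)) == id(obj) % modulus

small = 12345
big = modulus + 5  # large enough that the reduction becomes visible

small_hash = hash(small)  # equals small
big_hash = hash(big)      # wraps around to 5
```

So "smallish" here means "below sys.hash_info.modulus"; an id() above that bound would no longer hash to itself.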
> Precisely what my example shows is that the dict/set algorithms in
> fact *never* call __eq__, when the id() of a class instance is
> returned by __hash__ (in the implementations of Python I have tested).
> Please try the code yourself. Tell me what I am missing.
I'd state that a bit differently:
(0) Key objects in dicts/sets are only compared if they have the same hash
value.
When the hashes match, the comparison works in two steps:
(1) Are both objects the same (a is b --> True)? Then assume they are equal
and skip __eq__ entirely.
(2) Are they different objects (a is b --> False)? Take the slow path and
actually invoke __eq__.
With a sufficiently weird equality:
>>> class A:
...     def __eq__(self, other): return False
...
>>> a = A()
>>> items = [a]
>>> a in items
True
>>> [v for v in items if v == a]
[]
As you can see, the hash is not involved here (it is not even defined): the
list's "in" test applies the same identity-before-equality shortcut.
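The same rules can be watched at work in a dict. The following is only a sketch, checked against CPython's behaviour; the class Key, its eq_calls counter and the fixed hash values are invented for the demonstration.

```python
class Key:
    """Hypothetical key class that counts every __eq__ call."""

    eq_calls = 0

    def __init__(self, hash_value):
        self.hash_value = hash_value

    def __hash__(self):
        return self.hash_value

    def __eq__(self, other):
        Key.eq_calls += 1
        return False  # deliberately weird: never equal, not even to itself


a = Key(1)
b = Key(1)  # same hash as a, but a different object
c = Key(2)  # different hash

d = {a: "value"}

# Same object: the identity check succeeds, so __eq__ is never called.
found_a = a in d
calls_after_identity = Key.eq_calls

# Same hash, different object: identity fails, so __eq__ is invoked.
found_b = b in d
calls_after_collision = Key.eq_calls

# Different hash: the keys are not compared at all.
found_c = c in d
calls_after_miss = Key.eq_calls
```

With __hash__ returning id(self), two distinct live objects never share a hash, so only the identity step ever fires. That is why __eq__ appears never to be called in Steve's example.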
> What "other problems"? Please provide an example!
>
> Thanks!
>
> On Sat, Oct 19, 2019 at 9:02 PM Chris Angelico <rosuav at gmail.com> wrote:
>>
>> On Sun, Oct 20, 2019 at 3:08 AM Steve White <stevan.white at gmail.com>
>> wrote:
>> > It would appear that if __hash__ returns the id, then that id is used
>> > internally as the key, and since the id is by definition unique, no
>> > key collision ever occurs -- at least in every Python implementation
>> > I've tried. It also seems that, for a class instance obj,
>> > hash( hash( obj ) ) == hash( obj )
>> > hash( id( obj ) ) == id( obj )
>> > These are very strong and useful properties. Where are they
>> > documented?
>>
>> There are two rules that come into play here. One is that smallish
>> integers use their own value as their hash (so hash(1)==1 etc); the
>> other is that dictionaries actually look for something that matches on
>> identity OR equality. That's why using identity instead of equality
>> will appear to work, even though it can cause other problems when you
>> mismatch them.
>>
>> ChrisA