[Python-Dev] Dataclasses and correct hashability

Eric V. Smith eric at trueblade.com
Thu Feb 1 20:21:03 EST 2018


On 2/1/2018 8:17 PM, Eric V. Smith wrote:
> On 2/1/2018 7:34 PM, Elvis Pranskevichus wrote:
>> There appears to be a critical omission from the current dataclass
>> implementation: it does not make hash=True fields immutable.
>>
>> Per Python spec:
>>
>> "the implementation of hashable collections requires that a key’s hash
>> value is immutable (if the object’s hash value changes, it will be in
>> the wrong hash bucket)"
>>
>> Yet:
>>
>>      import dataclasses
>>
>>      @dataclasses.dataclass(hash=True)
>>      class A:
>>          foo: int = dataclasses.field(hash=True, compare=True)
>>
>>      a = A(foo=1)
>>
>>      s = set()
>>      s.add(a)   # s == {a}
>>      a.foo = 2
>>
>>      print(a in s)
>>      print({a} == s)
>>      print(s == s)
>>
>>      # prints False False True
>>
>>
>> This looks to me like a clearly wrong behavior.
>>
>>
>>                                      Elvis
> 
> Data classes do not protect you from doing the wrong thing. This is the 
> same as writing:
> 
> class A:
>      def __init__(self, foo):
>          self.foo = foo
>      def __hash__(self):
>          return hash((self.foo,))
> 
> You're allowed to override the parameters to dataclasses.dataclass for 
> cases where you know what you're doing. Consenting adults, and all.

I should add: This is why you shouldn't override the default (hash=None) 
unless you know what the consequences are. Can I ask why you want to 
specify hash=True?
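
To make the default concrete, here is a minimal sketch of what you get when you leave the hash parameter alone. It assumes the dataclasses module as released in Python 3.7, where the parameter discussed in this thread is spelled unsafe_hash= rather than hash=. The class names Mutable and Frozen are made up for illustration.

```python
import dataclasses

# Defaults (eq=True, frozen=False, hash parameter untouched):
# __hash__ is set to None, so instances cannot be put in a set or dict key.
@dataclasses.dataclass
class Mutable:
    foo: int

# eq=True plus frozen=True: a field-based __hash__ is generated,
# and assigning to a field raises FrozenInstanceError.
@dataclasses.dataclass(frozen=True)
class Frozen:
    foo: int

try:
    hash(Mutable(foo=1))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False

f = Frozen(foo=1)
s = {f}
try:
    f.foo = 2  # the mutation from the example above is rejected here
    frozen_mutated = True
except dataclasses.FrozenInstanceError:
    frozen_mutated = False

print(mutable_hashable, frozen_mutated, f in s)
# prints False False True
```

In other words, with the defaults you either get an unhashable mutable class or a hashable frozen one. The set corruption shown above only becomes possible once hashing is forced on for a mutable class.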

Eric


