[Python-Dev] Dataclasses and correct hashability
Eric V. Smith
eric at trueblade.com
Thu Feb 1 20:21:03 EST 2018
On 2/1/2018 8:17 PM, Eric V. Smith wrote:
> On 2/1/2018 7:34 PM, Elvis Pranskevichus wrote:
>> There appears to be a critical omission from the current dataclass
>> implementation: it does not make hash=True fields immutable.
>>
>> Per Python spec:
>>
>> "the implementation of hashable collections requires that a key’s hash
>> value is immutable (if the object’s hash value changes, it will be in
>> the wrong hash bucket)"
>>
>> Yet:
>>
>> import dataclasses
>>
>> @dataclasses.dataclass(hash=True)
>> class A:
>>     foo: int = dataclasses.field(hash=True, compare=True)
>>
>> a = A(foo=1)
>>
>> s = set()
>> s.add(a) # s == {a}
>> a.foo = 2
>>
>> print(a in s)
>> print({a} == s)
>> print(s == s)
>>
>> # prints False False True
>>
>>
>> This looks to me like a clearly wrong behavior.
>>
>>
>> Elvis
>
> Data classes do not protect you from doing the wrong thing. This is the
> same as writing:
>
> class A:
>     def __init__(self, foo):
>         self.foo = foo
>     def __hash__(self):
>         return hash((self.foo,))
>
> You're allowed to override the parameters to dataclasses.dataclass for
> cases where you know what you're doing. Consenting adults, and all.
I should add: This is why you shouldn't override the default (hash=None)
unless you know what the consequences are. Can I ask why you want to
specify hash=True?
Eric
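
The behavior discussed in this thread can be sketched as a small runnable script. This is only an illustration under stated assumptions: it targets Python 3.7 or later, where the dataclass() argument written as hash=True in this thread is spelled unsafe_hash=True, and the class names Mutable, Frozen and Default are hypothetical. It contrasts a mutable class with a forced hash, a frozen class, and a class that keeps all the defaults.

```python
import dataclasses

# Assumption: Python 3.7 or later, where the dataclass() argument written as
# hash=True in this thread is spelled unsafe_hash=True.
@dataclasses.dataclass(unsafe_hash=True)
class Mutable:
    foo: int

# frozen=True (with the default eq=True) generates __hash__ and blocks assignment.
@dataclasses.dataclass(frozen=True)
class Frozen:
    foo: int

# All defaults: eq=True without frozen=True sets __hash__ to None.
@dataclasses.dataclass
class Default:
    foo: int

a = Mutable(foo=1)
s = {a}
a.foo = 2                   # the hash of a changes while it sits in s
mutated_found = a in s      # lookup uses the new hash and misses

f = Frozen(foo=1)
t = {f}
try:
    f.foo = 2               # rejected, so the stored hash stays valid
    frozen_error = None
except dataclasses.FrozenInstanceError as exc:
    frozen_error = exc
frozen_found = f in t

default_unhashable = Default.__hash__ is None

print(mutated_found, frozen_found, default_unhashable)
# prints False True True
```

With the defaults, an instance cannot be added to a set at all, which is the protection lost when the hash is forced on a mutable class. frozen=True is one way to get hashing without that hazard.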