[Python-Dev] Dataclasses and correct hashability
Elvis Pranskevichus
elprans at gmail.com
Fri Feb 2 10:38:26 EST 2018
On Friday, February 2, 2018 10:08:43 AM EST Eric V. Smith wrote:
> However, I don't feel very strongly about this. As I've said, I expect
> the use cases for hash=True to be very, very rare.
Why do you think that the requirement to make a dataclass hashable is a
"very, very rare" requirement? The moment you want to use a dataclass as
a dict key, or put it in a set, you need it to be hashable.
Just put yourself in the shoes of an average Python developer. You try
to put a dataclass in a set, you get a TypeError. Your immediate
reaction is to add "hash=True". Things appear to work. Then you, or
someone else, decide to mutate the dataclass object, and then you are
looking at a very frustrating debugging session.
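To make the failure mode concrete, here is a minimal sketch. It assumes
a released Python (3.7+), where the flag discussed in this thread as
"hash=True" is spelled "unsafe_hash=True"; the Point class and its
fields are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


# Assumption: in released Python the "hash=True" flag from this thread
# is spelled unsafe_hash=True. Point is a made-up example class.
@dataclass(unsafe_hash=True)
class Point:
    x: int
    y: int


p = Point(1, 2)
seen = {p}

found_before = p in seen      # lookup uses hash((1, 2))
hash_before = hash(p)

p.x = 99                      # mutate a field that feeds __hash__
hash_after = hash(p)

found_after = p in seen       # lookup now uses hash((99, 2))

# The object is still stored in the set; it just cannot be found by hash.
still_inside = any(q is p for q in seen)

print(found_before, found_after, still_inside)
```

The set still holds the object, but membership tests miss it because
the stored hash no longer matches the one computed from the mutated
fields.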
> In all, I think we're better off documenting best practices and making
> them the default, like attrs does, and leave it to the programmer to
> follow them. I realize we're handing out footguns
I don't think attrs doing the same thing is a valid justification. This
is a major footgun that is very easy to trigger, and there's really no
precedent in standard data types.
> the alternatives seem even more complex and are limiting.
The alternative is simple and follows the design of other standard
containers: immutable containers are hashable, mutable containers are
not. @dataclass(frozen=False) gives you a SimpleNamespace-like class and
@dataclass(frozen=True) gives you a namedtuple-like one. If you _really_
know what you are doing, then you can always declare an explicit
__hash__.
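A short sketch of that rule, assuming the defaults of the dataclasses
module as released in Python 3.7+ (eq=True, no explicit hash flag).
MutablePoint and FrozenPoint are hypothetical names.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass  # frozen=False by default: mutable, so no generated __hash__
class MutablePoint:
    x: int
    y: int


@dataclass(frozen=True)  # immutable, so a field-based __hash__ is generated
class FrozenPoint:
    x: int
    y: int


try:
    hash(MutablePoint(1, 2))
    mutable_hashable = True
except TypeError:
    mutable_hashable = False

frozen = FrozenPoint(1, 2)
frozen_hashable = frozen in {FrozenPoint(1, 2)}

try:
    frozen.x = 99  # assignment to a frozen instance is rejected
    frozen_rejects_mutation = False
except FrozenInstanceError:
    frozen_rejects_mutation = True

print(mutable_hashable, frozen_hashable, frozen_rejects_mutation)
```

Because the frozen instance cannot be reassigned through normal
attribute access, its hash stays consistent with its position in a set
or dict.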
> The problem with dropping hash=True is: how would you write __hash__
> yourself?
Is "def __hash__(self): return hash((self.field1, self.field2))" that
hard? It is explicit, and you made a conscious choice, i.e. you
understand how __hash__ works. IMO, the danger of
"@dataclass(hash=True)" far outweighs whatever convenience it might
provide.
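Spelled out in full, the explicit version could look like the sketch
below. Record is a hypothetical class; field1 and field2 are the
placeholder names from the one-liner above. It assumes the released
dataclasses behaviour, where an explicitly defined __hash__ in the
class body is left untouched by the decorator.

```python
from dataclasses import dataclass


@dataclass
class Record:
    field1: int
    field2: str

    # Explicit, hand-written hash: the author takes responsibility for
    # not mutating field1/field2 while the object is a dict key or in
    # a set.
    def __hash__(self):
        return hash((self.field1, self.field2))


r = Record(1, "a")
lookup = {r: "value"}

# An equal instance hashes the same way and finds the same entry.
result = lookup[Record(1, "a")]
print(result)
```

The class is still mutable, so the same hazard exists, but the hash is
visible in the class body rather than hidden behind a decorator flag.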
Elvis