[Python-Dev] Dataclasses and correct hashability

Eric V. Smith eric at trueblade.com
Tue Feb 6 20:38:45 EST 2018


Sorry for the late reply. Still recovering from a computer failure. 

My only concern with this approach is: what if you don’t want any __hash__ added? Say you want to use your base class’s hashing. I guess you could always “del cls.__hash__” after the class is created, but it’s not elegant. 
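
For illustration, a minimal sketch of that “del cls.__hash__” workaround. It assumes the behavior of the dataclasses module as it was later released in Python 3.7, where a default (eq=True, non-frozen) dataclass has its __hash__ set to None. Base and Child are hypothetical names.

```python
from dataclasses import dataclass


class Base:
    # Hypothetical base class with its own hashing.
    def __hash__(self):
        return 42


@dataclass
class Child(Base):
    x: int


# With eq=True and frozen=False, the decorator marks the class unhashable.
was_unhashable = Child.__hash__ is None

# Removing the attribute from the subclass lets lookup fall back to Base.
del Child.__hash__
inherited_hash = hash(Child(1))
```

The deletion has to happen after the decorator has run, which is why it reads as a workaround rather than an option.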

That’s what we got from the tri-state option: never add (False), always add (True), or add if it’s safe (None).

--
Eric

> On Feb 5, 2018, at 12:49 AM, Guido van Rossum <guido at python.org> wrote:
> 
> Looks like this is turning into a major flamewar regardless of what I say. :-(
> 
> I really don't want to lose the ability to add a hash function to a mutable dataclass by flipping a flag in the decorator. I'll explain below. But I am fine if this flag has a name that clearly signals it's an unsafe thing to do.
> 
> I propose to replace the existing (as of 3.7.0b1) hash= keyword for the @dataclass decorator with a simpler flag named unsafe_hash=. This would be a simple bool (not a tri-state flag like the current hash=None|False|True). The default would be False, and the behavior then would be to add a hash function automatically only if it's safe (using the same rules as for hash=None currently). With unsafe_hash=True, a hash function would always be generated that takes all fields into account except those declared using field(hash=False). If there's already a `def __hash__` in the class, I don't care what it does; maybe it should raise rather than quietly doing nothing or quietly overwriting it.
> 
> Here's my use case.
> 
> A frozen class requires a lot of discipline, since you have to compute the values of all fields before calling the constructor. A mutable class allows other initialization patterns, e.g. manually setting some fields after the instance has been constructed, or having a separate non-dunder init() method. There may be good reasons for using these patterns, e.g. the object may be part of a cycle (e.g. parent/child links in a tree). Or you may just use one of these patterns because you're a pretty casual coder. Or you're modeling something external.
> 
> My point is that once you have one of those patterns in place, changing your code to avoid them may be difficult. And yet your code may treat the objects as essentially immutable after the initialization phase (e.g. a parse tree). So if you create a dataclass and start coding like that for a while, and much later you need to put one of these into a set or use it as a dict key, switching to frozen=True may not be a quick option. And writing a __hash__ method by hand may feel like a lot of busywork. So this is where [unsafe_]hash=True would come in handy.
> 
> I think naming the flag unsafe_hash should take away most objections, since it will be clear that this is not a safe thing to do. People who don't understand the danger are likely to copy a worse solution from StackOverflow anyway. The docs can point to frozen=True and explain the danger.
> 
> -- 
> --Guido van Rossum (python.org/~guido)
> 
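
A small sketch of the use case described in the quoted message, assuming the unsafe_hash= flag behaves as proposed there (and as it appears in the released Python 3.7 dataclasses module). Node and its fields are hypothetical names for a tree whose parent link is set after construction.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(unsafe_hash=True)
class Node:
    name: str
    # Excluded from __hash__ and __eq__ so it can be filled in later.
    parent: Optional["Node"] = field(
        default=None, hash=False, compare=False, repr=False
    )


root = Node("root")
child = Node("child")
child.parent = root  # set after the instance has been constructed

seen = {root, child}
found_before = child in seen

# The danger the flag name warns about: mutating a hashed field
# changes the hash, so the set lookup no longer finds the object.
child.name = "renamed"
found_after = child in seen
```

As long as the hashed fields are left alone after the initialization phase, the objects behave as expected in sets and as dict keys.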