[Python-Dev] Dataclasses and correct hashability

Glenn Linderman v+python at g.nevcal.com
Mon Feb 5 02:28:03 EST 2018


On 2/4/2018 9:49 PM, Guido van Rossum wrote:
> A frozen class requires a lot of discipline, since you have to compute 
> the values of all fields before calling the constructor. A mutable 
> class allows other initialization patterns, e.g. manually setting some 
> fields after the instance has been constructed, or having a separate 
> non-dunder init() method. There may be good reasons for using these 
> patterns, e.g. the object may be part of a cycle (e.g. parent/child 
> links in a tree). Or you may just use one of these patterns because 
> you're a pretty casual coder. Or you're modeling something external.
>
> My point is that once you have one of those patterns in place, 
> changing your code to avoid them may be difficult. And yet your code 
> may treat the objects as essentially immutable after the 
> initialization phase (e.g. a parse tree). So if you create a dataclass 
> and start coding like that for a while, and much later you need to put 
> one of these into a set or use it as a dict key, switching to 
> frozen=True may not be a quick option. And writing a __hash__ method 
> by hand may feel like a lot of busywork. So this is where 
> [unsafe_]hash=True would come in handy.
>
> I think naming the flag unsafe_hash should take away most objections, 
> since it will be clear that this is not a safe thing to do. People who 
> don't understand the danger are likely to copy a worse solution from 
> StackOverflow anyway. The docs can point to frozen=True and explain 
> the danger.

This is an interesting use case. I don't have the internals knowledge 
to know just how different mutable and immutable classes and objects 
are under the hood. But this use case makes me wonder whether, even at 
the cost of some of the performance that "normal" immutable classes and 
objects might obtain, it would be possible to use the various 
undisciplined initialization patterns as desired, followed by a 
declaration "This OBJECT is now immutable" which would calculate its 
HASH value and prevent future mutations of the object.
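
A rough pure-Python sketch of what I mean, with no claim about how (or 
whether) this could be done inside the interpreter or in dataclasses 
itself. The names Point, freeze() and _hash are invented for the 
example, and the freeze is shallow: it blocks attribute assignment but 
not deletion, and does nothing about mutable values stored in fields.

```python
from dataclasses import dataclass, field, fields


@dataclass
class Point:
    x: int = 0
    y: int = 0
    # Hypothetical bookkeeping slot; None means "still mutable".
    _hash: object = field(default=None, init=False, repr=False, compare=False)

    def freeze(self):
        """Hypothetical: compute the hash once, then refuse assignment."""
        values = tuple(getattr(self, f.name) for f in fields(self) if f.compare)
        # Bypass our own __setattr__ so the marker can always be stored.
        object.__setattr__(self, "_hash", hash(values))

    def __setattr__(self, name, value):
        if getattr(self, "_hash", None) is not None:
            raise AttributeError(f"cannot assign to {name!r}: object is frozen")
        object.__setattr__(self, name, value)

    def __hash__(self):
        # An explicit __hash__ is left alone by @dataclass when eq=True.
        if self._hash is None:
            raise TypeError("unhashable until freeze() has been called")
        return self._hash


# Undisciplined initialization first, then the declaration.
p = Point()
p.x = 1
p.y = 2
p.freeze()
lookup = {p: "usable as a dict key now"}
```

The point of the sketch is only that the mutable/immutable decision is 
made per object, at a moment the programmer chooses, rather than per 
class.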

Yes, I'm aware that the decision for immutability has historically been 
made at the class level, not the object level, but in my ignorance of 
the internals, I wonder if that is necessary, for performance or, more 
importantly, for other reasons.

And perhaps the implementation is internally almost like two classes, 
one mutable, and the other immutable, and the declaration would convert 
the object from one to the other.  But if I say more, I'd just be babbling.