
On Fri, Dec 18, 2015 at 10:22 PM, Random832 random832@fastmail.com wrote:
Andrew Barnert writes:
If you're thinking we could define what multisets should do, despite not having a standard multiset type or an ABC for them, and apply that to values views, the next question is how to do that in better than quadratic time for non-hashable values.
Why aren't all values hashable?
When I think about it, it's a bit strange that "hashable" is so wrapped up in immutability for no other reason than the idea that it's not safe to use a mutable object as a dictionary key if it is changed while it is in the dictionary. Java doesn't have this restriction.
And while the reasoning is certainly defensible in isolation, it sort of goes against the "consenting adults" principle that is used to justify all the _other_ dangerous/questionable things that Python doesn't bother putting technical obstacles in front of.
Why not demand that all objects (except NaN?) can be hashed, and that their hash shall match the equality relationship they define, and that all objects can safely be used as set members and dictionary keys so long as they are not in fact changed while in such a position? Certainly we can't put technical restrictions in the way of defining a __hash__ method that raises an exception, but we can say that if they try to use such an object in a dictionary and compare their values views they're on their own.
For a somewhat more conservative path forward, define a new method __vhash__ which will always return a value (by default based on the object identity), and __hash__ shall return either the same number as __vhash__, or raise an exception if the object is not guaranteed as "immutable" for the purpose of equality comparison.
The link between hashing and immutability is because objects whose hash would change are common, e.g. lists, and using them as dict keys would be very hard to debug for users most likely to make this mistake. The issue is that the dict implementation makes it impossible to find back keys whose hash has changed, other than by linear search, which is unacceptable -- but that's exactly what users will try to debug such issues, i.e., print the dict and notice that the missing key is indeed present.
The consenting adults rule typically applies to things that are well hidden or marked (e.g. using __dunder__ names). There are plenty of things that Python could allow but doesn't, not because they are hard to implement or would violate an invariant of the interpreter, but because they could trip over naive users.
Note that you are turning things upside down: the question "why aren't all things hashable" came about because Andrew was considering making a hash table of the values of a dict. But the real question here isn't "why aren't all things hashable" but "why can't you put mutable values into a set". The answer to the latter is because when we put a value into a container, and later the value changes, we can't tell the container, so if the container has any sort of lookup scheme other than linear search, it would lose track of the value.
Hashing comes into play because all of Python's common data structures use hashing to optimize lookup -- but if we used a different data structure, e.g. something based on sorting the keys, we'd still have the mutability problem. And we'd have worse problems, because values would have to be sortable, which is a stricter condition than being immutable.
In any case, you can't solve this problem by making all values hashable.