On Fri, Dec 18, 2015 at 10:22 PM, Random832 <random832@fastmail.com> wrote:
Andrew Barnert  writes:
> If you're thinking we could define what multisets should do, despite
> not having a standard multiset type or an ABC for them, and apply that
> to values views, the next question is how to do that in better than
> quadratic time for non-hashable values.

Why aren't all values hashable?

When I think about it, it's a bit strange that "hashable" is so
wrapped up in immutability for no other reason than the idea
that it's not safe to use a mutable object as a dictionary key
if it is changed while it is in the dictionary.  Java doesn't
have this restriction.

And while the reasoning is certainly defensible in isolation, it
sort of goes against the "consenting adults" principle that is
used to justify all the _other_ dangerous/questionable things
that Python doesn't bother putting technical obstacles in front
of.

Why not demand that all objects (except NaN?) can be hashed, and
that their hash shall match the equality relationship they
define, and that all objects can safely be used as set members
and dictionary keys so long as they are not in fact changed
while in such a position? Certainly we can't put technical
restrictions in the way of defining a __hash__ method that
raises an exception, but we can say that if they try to use such
an object in a dictionary and compare their values views they're
on their own.

For a somewhat more conservative path forward, define a new
method __vhash__ which will always return a value (by default
based on the object identity), and __hash__ shall return either
the same number as __vhash__, or raise an exception if the
object is not guaranteed as "immutable" for the purpose of
equality comparison.

The link between hashing and immutability is because objects whose hash would change are common, e.g. lists, and using them as dict keys would be very hard to debug for users most likely to make this mistake. The issue is that the dict implementation makes it impossible to find back keys whose hash has changed, other than by linear search, which is unacceptable -- but that's exactly what users will try to debug such issues, i.e., print the dict and notice that the missing key is indeed present.

The consenting adults rule typically applies to things that are well hidden or marked (e.g. using __dunder__ names). There are plenty of things that Python could allow but doesn't, not because they are hard to implement or would violate an invariant of the interpreter, but because they could trip over naive users.

Note that you are turning things upside down: the question "why aren't all things hashable" came about because Andrew was considering making a hash table of the values of a dict. But the real question here isn't "why aren't all things hashable" but "why can't you put mutable values into a set". The answer to the latter is because when we put a value into a container, and later the value changes, we can't tell the container, so if the container has any sort of lookup scheme other than linear search, it would lose track of the value.

Hashing comes into play because all of Python's common data structures use hashing to optimize lookup -- but if we used a different data structure, e.g. something based on sorting the keys, we'd still have the mutability problem. And we'd have worse problems, because values would have to be sortable, which is a stricter condition than being immutable.

In any case, you can't solve this problem by making all values hashable.

--
--Guido van Rossum (python.org/~guido)