[Python-ideas] Fwd: Why do equality tests between OrderedDict keys/values views behave not as expected?
Guido van Rossum
guido at python.org
Sat Dec 19 03:09:06 EST 2015
On Fri, Dec 18, 2015 at 10:22 PM, Random832 <random832 at fastmail.com> wrote:
> Andrew Barnert writes:
> > If you're thinking we could define what multisets should do, despite
> > not having a standard multiset type or an ABC for them, and apply that
> > to values views, the next question is how to do that in better than
> > quadratic time for non-hashable values.
>
> Why aren't all values hashable?
>
> When I think about it, it's a bit strange that "hashable" is so
> wrapped up in immutability for no other reason than the idea
> that it's not safe to use a mutable object as a dictionary key
> if it is changed while it is in the dictionary. Java doesn't
> have this restriction.
>
> And while the reasoning is certainly defensible in isolation, it
> sort of goes against the "consenting adults" principle that is
> used to justify all the _other_ dangerous/questionable things
> that Python doesn't bother putting technical obstacles in front
> of.
>
> Why not demand that all objects (except NaN?) can be hashed, and
> that their hash shall match the equality relationship they
> define, and that all objects can safely be used as set members
> and dictionary keys so long as they are not in fact changed
> while in such a position? Certainly we can't put technical
> restrictions in the way of defining a __hash__ method that
> raises an exception, but we can say that if they try to use such
> an object in a dictionary and compare their values views they're
> on their own.
>
> For a somewhat more conservative path forward, define a new
> method __vhash__ which will always return a value (by default
> based on the object identity), and __hash__ shall return either
> the same number as __vhash__, or raise an exception if the
> object is not guaranteed as "immutable" for the purpose of
> equality comparison.
>
The link between hashing and immutability is because objects whose hash
would change are common, e.g. lists, and using them as dict keys would be
very hard to debug for users most likely to make this mistake. The issue is
that the dict implementation makes it impossible to find back keys whose
hash has changed, other than by linear search, which is unacceptable -- but
that's exactly what users will try to debug such issues, i.e., print the
dict and notice that the missing key is indeed present.
The consenting adults rule typically applies to things that are well hidden
or marked (e.g. using __dunder__ names). There are plenty of things that
Python could allow but doesn't, not because they are hard to implement or
would violate an invariant of the interpreter, but because they could trip
over naive users.
Note that you are turning things upside down: the question "why aren't all
things hashable" came about because Andrew was considering making a hash
table of the values of a dict. But the real question here isn't "why aren't
all things hashable" but "why can't you put mutable values into a set". The
answer to the latter is because when we put a value into a container, and
later the value changes, we can't tell the container, so if the container
has any sort of lookup scheme other than linear search, it would lose track
of the value.
Hashing comes into play because all of Python's common data structures use
hashing to optimize lookup -- but if we used a different data structure,
e.g. something based on sorting the keys, we'd still have the mutability
problem. And we'd have worse problems, because values would have to be
sortable, which is a stricter condition than being immutable.
In any case, you can't solve this problem by making all values hashable.
--
--Guido van Rossum (python.org/~guido)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-ideas/attachments/20151219/ff94ec1d/attachment-0001.html>
More information about the Python-ideas
mailing list