[Python-Dev] For Python 3k, drop default/implicit hash, and comparison

Samuele Pedroni pedronis at strakt.com
Sun Nov 27 01:13:28 CET 2005

Noam Raphael wrote:
> Three weeks ago, I read this and thought, "well, you have two options
> for a default comparison, one based on identity and one on value, both
> are useful sometimes and Guido prefers identity, and it's OK." But
> today I understood that I still think otherwise.

well, this still belongs to comp.lang.python.

> In two sentences: sometimes you wish to compare objects according to
> "identity", and sometimes you wish to compare objects according to
> "values". Identity-based comparison is done by the "is" operator;
> Value-based comparison should be done by the == operator.
> Let's take the car example, and expand it a bit. Let's say wheels have
> attributes - say, diameter and manufacturer. Let's say those can't
> change (which is reasonable), to make wheels hashable. There are two
> ways to compare wheels: by value and by identity. Two wheels may have
> the same value, that is, they have the same diameter and were created
> by the same manufacturer. Two wheels may have the same identity, that
> is, they are actually the same wheel.
> We may want to compare wheels based on value, for example to make sure
> that all the car's wheels fit together nicely: assert car.wheel1 ==
> car.wheel2 == car.wheel3 == car.wheel4. We may want to compare wheels
> based on identity, for example to make sure that we actually bought
> four wheels in order to assemble the car: assert car.wheel1 is not
> car.wheel2 and car.wheel3 is not car.wheel1 and car.wheel3 is not
> car.wheel2...
> We may want to associate values with wheels based on their values. For
> example, it's reasonable to suppose that the price of every wheel of
> the same model is the same. In that case, we'll write: price[wheel] =
> 25. We may want to associate values with wheels based on their
> identities. For example, we may want to note that a specific wheel is
> broken. For this, I'll first define a general class (I defined it
> before in one of the discussions, that's because I believe it's
> useful):
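> class Ref(object):
>     def __init__(self, obj):
>         self._obj = obj
>     def __call__(self):
>         # Return the wrapped object.
>         return self._obj
>     def __eq__(self, other):
>         # Two Refs are equal only if they wrap the very same object.
>         return isinstance(other, Ref) and self._obj is other._obj
>     def __hash__(self):
>         # Hash on identity, so the wrapped object need not be hashable.
>         return id(self._obj) ^ 0xBEEF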
> Now again, how will we say that a specific wheel is broken? Like this:
> broken[Ref(wheel)] = True
> Note that the Ref class also allows us to group wheels of the same
> kind in a set, regardless of their __hash__ method.
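
A minimal sketch of the wheel example, using both kinds of dictionary key side by side. The Wheel class, its attribute values, and the "Acme" manufacturer are hypothetical and only serve the illustration. Ref follows the class quoted here, with the isinstance check written against Ref and an __ne__ added for Python 2.

```python
class Ref(object):
    """Wrapper whose equality and hash follow the identity of the wrapped object."""

    def __init__(self, obj):
        self._obj = obj

    def __call__(self):
        return self._obj

    def __eq__(self, other):
        return isinstance(other, Ref) and self._obj is other._obj

    def __ne__(self, other):
        # Python 2 does not derive != from ==, so spell it out.
        return not self == other

    def __hash__(self):
        return id(self._obj) ^ 0xBEEF


class Wheel(object):
    """Hypothetical wheel compared by value; attributes are treated as immutable."""

    def __init__(self, diameter, manufacturer):
        self.diameter = diameter
        self.manufacturer = manufacturer

    def _key(self):
        return (self.diameter, self.manufacturer)

    def __eq__(self, other):
        return isinstance(other, Wheel) and self._key() == other._key()

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        return hash(self._key())


w1 = Wheel(16, "Acme")
w2 = Wheel(16, "Acme")   # same value as w1, but a different object

price = {w1: 25}             # keyed by value
broken = {Ref(w1): True}     # keyed by identity

same_price = price[w2]           # found, because w2 == w1
w1_broken = Ref(w1) in broken    # the very wheel that was marked
w2_broken = Ref(w2) in broken    # an equal wheel, but not the marked one
distinct_wheels = len(set([Ref(w1), Ref(w2), Ref(w1)]))
```

Note that a Ref keeps its wrapped object alive, so the id() it hashes on stays valid for as long as the Ref is stored in the dict or set.
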
> I think that most objects, especially most user-defined objects, have
> a *value*. I don't have an exact definition, but a hint is that two
> objects that were created in the same way have the same value.
> Sometimes we wish to compare objects based on their identity - in
> those cases we use the "is" operator. Sometimes we wish to compare
> objects based on their value - and that's what the == operator is for.
> Sometimes we wish to use the value of objects as a dictionary key or
> as a set member, and that's easy. Sometimes we wish to use the
> identity of objects as a dictionary key or as a set member - and I
> claim that we should do that by using the Ref class, whose *value* is
> the object's *identity*, or by using a dict/set subclass, and not by
> misusing the __hash__ and __eq__ methods.
> I think that whenever value-based comparison is meaningful, the __eq__
> and __hash__ should be value-based. Treating objects by identity
> should be done explicitly, by the one who uses the objects, by using
> the "is" operator or the Ref class. It should not be the job of the
> object to decide which method (value or identity) is more useful - it
> should allow the user to use both methods, by defining __eq__ and
> __hash__ based on value.
> Please give me examples which prove me wrong. I currently think that
> the only objects for which value-based comparison is not meaningful
> are objects which represent entities which are "outside" of the
> process, or in other words, entities which are not "computational".
> This includes files, sockets, possibly user-interface objects,
> loggers, etc. I think that objects that represent purely "data", have
> a "value" that they can be compared according to. Even wheels that
> don't have any attributes are simply equal to other wheels, and not
> equal to other objects. Since user-defined classes can interact with
> the "environment" only through other objects or functions, it is
> reasonable to suggest that they should get a value-based equality
> operator. Many times the value is defined by the __dict__ and
> __slots__ members, so it seems to me a reasonable default.
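
One way such a default could look, as a rough sketch only: the ValueObject base class below is a hypothetical name, not anything in Python, and it assumes that all attribute values are hashable and are not changed while the object is used as a key. It also ignores name mangling of double-underscore slot names.

```python
class ValueObject(object):
    """Hypothetical base class: equality and hash derived from instance attributes."""

    __slots__ = ()   # keep the base slot-free so subclasses may choose either style

    def _value(self):
        # Collect attributes from the instance __dict__ (if there is one) ...
        items = dict(getattr(self, "__dict__", {}))
        # ... and from every __slots__ declaration along the MRO.
        for klass in type(self).__mro__:
            slots = klass.__dict__.get("__slots__", ())
            if isinstance(slots, str):
                slots = (slots,)
            for name in slots:
                if name in ("__dict__", "__weakref__"):
                    continue
                if hasattr(self, name):
                    items[name] = getattr(self, name)
        # Attribute names are unique, so sorting never compares the values.
        return (type(self), tuple(sorted(items.items())))

    def __eq__(self, other):
        if not isinstance(other, ValueObject):
            return NotImplemented
        return self._value() == other._value()

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        return hash(self._value())


class Wheel(ValueObject):
    # No __slots__ here, so instances store attributes in __dict__.
    def __init__(self, diameter, manufacturer):
        self.diameter = diameter
        self.manufacturer = manufacturer


class SlottedWheel(ValueObject):
    __slots__ = ("diameter",)

    def __init__(self, diameter):
        self.diameter = diameter


a = Wheel(16, "Acme")
b = Wheel(16, "Acme")
c = Wheel(17, "Acme")
s1 = SlottedWheel(16)
s2 = SlottedWheel(16)
```

Here a and b compare equal and hash alike, c differs from both, and objects of different classes never compare equal because the type is part of the value.
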
> I would greatly appreciate replies that find a tiny bit of reason in
> what I said (even if they don't agree), rather than dismissing it all
> as a complete load of rubbish.

not if you think python-dev is a forum for such discussions
on OO thinking vs other paradigms.

> Thanks,
> Noam
> _______________________________________________
> Python-Dev mailing list
> Python-Dev at python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: http://mail.python.org/mailman/options/python-dev/pedronis%40strakt.com
