[Python-Dev] For Python 3k, drop default/implicit hash, and comparison

"Martin v. Löwis" martin at v.loewis.de
Sun Nov 27 00:48:50 CET 2005


Noam Raphael wrote:
 > I would greatly appreciate repliers that find a tiny bit of reason in
 > what I said (even if they don't agree), and not deny it all as a
 > complete load of rubbish.

I don't understand what your message is. With this posting, did you
suggest that somebody does something specific? If so, who is that one,
and what should he do?

Anyway, a lot of your posting is what I thought was common knowledge;
and with some of it, I disagree.

> In two sentences: sometimes you wish to compare objects according to
> "identity", and sometimes you wish to compare objects according to
> "values". Identity-based comparison is done by the "is" operator;
> Value-based comparison should be done by the == operator.

Certainly.

> We may want to compare wheels based on value, for example to make sure
> that all the car's wheels fit together nicely: assert car.wheel1 ==
> car.wheel2 == car.wheel3 == car.wheel4.

I would never write it that way. This would suggest that the wheels
have to be "the same". However, this is certainly not true for wheels:
they have to have to be of the same make. Now, you write that wheels
only carry manufacturer and diameter. However, I would expect that
wheels grow additional attributes over time, like whether they are
left or right, and what their wear level is. So to write your property,
I would write

car.wheel1.manufacturer_and_make() ==
car.wheel2.manufacturer_and_make() ==
car.wheel3.manufacturer_and_make() ==
car.wheel4.manufacturer_and_make()

> We may want to associate values with wheels based on their values. For
> example, it's reasonable to suppose that the price of every wheel of
> the same model is the same. In that case, we'll write: price[wheel] =
> 25. 

Again, I would not write it this way. I would find

wheel.price()

most natural. If I have the notion of a price list, then I would
try to understand what the price list is keyed-by, e.g. model number:

price[wheel.model] = 25

> Now again, how will we say that a specific wheel is broken? Like this:
> 
> broken[Ref(wheel)] = True

If I want things to be keyed by identity, I would write

broken = IdentityDictionary()
...
broken[wheel] = True

although I would prefer to write

wheel.broken = True

> I think that most objects, especially most user-defined objects, have
> a *value*. I don't have an exact definition, but a hint is that two
> objects that were created in the same way have the same value.

Here I disagree. Consider the wheel example. I would expect that
a wheel has a "wear level" or some such, and that this changes over
time, and that it belongs to the "value" of the wheel ("value"
being synonym to "state"). As this changes over time, it is certainly
not that the object is created with that value.

Think of lists: what is their value? Are they created with it?

> Sometimes we wish to use the
> identity of objects as a dictionary key or as a set member - and I
> claim that we should do that by using the Ref class, whose *value* is
> the object's *identity*, or by using a dict/set subclass, and not by
> misusing the __hash__ and __eq__ methods.

I think we should a specific type of dictionary then.

> I think that whenever value-based comparison is meaningful, the __eq__
> and __hash__ should be value-based. Treating objects by identity
> should be done explicitly, by the one who uses the objects, by using
> the "is" operator or the Ref class. It should not be the job of the
> object to decide which method (value or identity) is more useful - it
> should allow the user to use both methods, by defining __eq__ and
> __hash__ based on value.

If objects are compared for value equality, the object should decide
which part of its state goes into that comparison. It may be that
two objects compare equal even though their state is memberwise
different:

Rational(1,2) == Rational(5,10)

> Please give me examples which prove me wrong. I currently think that
> the only objects for whom value-based comparison is not meaningful,
> are objects which represent entities which are "outside" of the
> process, or in other words, entities which are not "computational".

You mean, things of the real world, right? Like people, bank accounts,
and wheels.

Regards,
Martin


More information about the Python-Dev mailing list