[Python-Dev] For Python 3k, drop default/implicit hash, and comparison

Sun Nov 27 20:04:25 CET 2005

On 11/27/05, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Noam Raphael wrote:
>  > I would greatly appreciate repliers that find a tiny bit of reason in
>  > what I said (even if they don't agree), and not deny it all as a
>  > complete load of rubbish.
>
> I don't understand what your message is. With this posting, did you
> suggest that somebody does something specific? If so, who is that one,
> and what should he do?

Perhaps I felt a bit attacked. It was probably my fault, and anyway, a
general message like this is not the proper way - I'm sorry.

>
> Anyway, a lot of your posting is what I thought was common knowledge;
> and with some of it, I disagree.

This is fine, of course.
> > We may want to compare wheels based on value, for example to make sure
> > that all the car's wheels fit together nicely: assert car.wheel1 ==
> > car.wheel2 == car.wheel3 == car.wheel4.
>
> I would never write it that way. This would suggest that the wheels
> have to be "the same". However, this is certainly not true for wheels:
> they have to have to be of the same make. Now, you write that wheels
> only carry manufacturer and diameter. However, I would expect that
> wheels grow additional attributes over time, like whether they are
> left or right, and what their wear level is. So to write your property,
> I would write
>
> car.wheel1.manufacturer_and_make() ==
> car.wheel2.manufacturer_and_make() ==
> car.wheel3.manufacturer_and_make() ==
> car.wheel4.manufacturer_and_make()
>
You may be right in the case of wheels. From time to time, in the real
(programming) world, I encounter objects that I wish to compare by
value - this is certainly the case for built-in objects, but is
sometimes the case for more complex objects.

> > We may want to associate values with wheels based on their values. For
> > example, it's reasonable to suppose that the price of every wheel of
> > the same model is the same. In that case, we'll write: price[wheel] =
> > 25.
>
> Again, I would not write it this way. I would find
>
> wheel.price()

Many times the objects are not yours to add attributes, or may have
__slots__ defined. The truth is that I prefer not to add attributes to
external objects even when it's possible.
>
> most natural. If I have the notion of a price list, then I would
> try to understand what the price list is keyed-by, e.g. model number:
>
> price[wheel.model] = 25
>
Sometimes there's no "key" - it's just the state of the object (what
if wheels don't have a model number?)

> > Now again, how will we say that a specific wheel is broken? Like this:
> >
> > broken[Ref(wheel)] = True
>
> If I want things to be keyed by identity, I would write
>
> broken = IdentityDictionary()
> ...
> broken[wheel] = True
>
> although I would prefer to write
>
> wheel.broken = True
>
I personally prefer the first method, but the second one is ok too.

> > I think that most objects, especially most user-defined objects, have
> > a *value*. I don't have an exact definition, but a hint is that two
> > objects that were created in the same way have the same value.
>
> Here I disagree. Consider the wheel example. I would expect that
> a wheel has a "wear level" or some such, and that this changes over
> time, and that it belongs to the "value" of the wheel ("value"
> being synonym to "state"). As this changes over time, it is certainly
> not that the object is created with that value.
>
> Think of lists: what is their value? Are they created with it?
>
My tounge failed me. I meant: created in the same way = have gone
through the same series of actions. That is:
a = []; a.append(5); a.extend([2,1]); a.pop()
b = []; b.append(5); b.entend([2,1]); b.pop()
a == b

> > Sometimes we wish to use the
> > identity of objects as a dictionary key or as a set member - and I
> > claim that we should do that by using the Ref class, whose *value* is
> > the object's *identity*, or by using a dict/set subclass, and not by
> > misusing the __hash__ and __eq__ methods.
>
> I think we should a specific type of dictionary then.
That's OK too. My point was that the one who uses the objects should
explicitly specify whether he means value-based of identity-based
lookup. This means that if an object has a "value", it should not make
__eq__ and __hash__ be identity-based just to make identity-based
lookup easier and implicit.
>
> > I think that whenever value-based comparison is meaningful, the __eq__
> > and __hash__ should be value-based. Treating objects by identity
> > should be done explicitly, by the one who uses the objects, by using
> > the "is" operator or the Ref class. It should not be the job of the
> > object to decide which method (value or identity) is more useful - it
> > should allow the user to use both methods, by defining __eq__ and
> > __hash__ based on value.
>
> If objects are compared for value equality, the object should decide
> which part of its state goes into that comparison. It may be that
> two objects compare equal even though their state is memberwise
> different:
>
> Rational(1,2) == Rational(5,10)
>
I completely agree. Indeed, the "value of an object" is in many times
not "the value of all its attributes".

> > Please give me examples which prove me wrong. I currently think that
> > the only objects for whom value-based comparison is not meaningful,
> > are objects which represent entities which are "outside" of the
> > process, or in other words, entities which are not "computational".
>
> You mean, things of the real world, right? Like people, bank accounts,
> and wheels.

No, I meant real programming examples. My theory is that most
user-defined classes have a "value", and those that don't are related
to I/O, in some sort of a broad definition of the term. I may be
wrong, so I ask for counter-examples.

Thanks for your reply,
Noam