[Python-Dev] RE: cmp(x,x)

Mon May 24 11:32:22 EDT 2004

  [...discussion of Armin's idea of moving the short-circuiting
      of equality tests comparing an object to itself from
      richcompare (where it affects all objects) to individual
      object implementations (like tuple)...]

Raymond writes:
> Actually, it's a ton of indirection (many steps) not just a single C
> lookup.  Equality testing is one of the most common and most expensive
> ops in Python.

Okay, I may be mistaken about the performance impact. Can anyone do
actual measurements? Given that equality testing is already quite slow,
I'd be surprised if there were a noticeable performance difference --
but I'm often surprised by performance behavior... show me.
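
For anyone who wants to try, a rough timing harness could look like the
sketch below. It assumes a current Python 3 interpreter and the standard
timeit module; the Plain class and the iteration count N are placeholders
of mine. It only times an ordinary user-defined __eq__ from the Python
level, so it says nothing definitive about the C-level indirection
Raymond describes.

```python
import timeit


class Plain:
    """Hypothetical class with an ordinary user-defined __eq__."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        if not isinstance(other, Plain):
            return NotImplemented
        return self.value == other.value


x = Plain(3)
y = Plain(3)
N = 100_000  # arbitrary iteration count, chosen only to keep the run short

# Time comparing an object to itself versus to an equal but distinct object.
same_object = timeit.timeit("x == x", globals={"x": x}, number=N)
distinct_objects = timeit.timeit("x == y", globals={"x": x, "y": y}, number=N)

print("x == x: %.4fs   x == y: %.4fs" % (same_object, distinct_objects))
```

The absolute numbers will vary by machine and build; only the relative
difference between the two cases is of interest.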

> My main concern is that Armin is about to do backflips (inserting new
> code into the details of every object's comparison code) just to
> facilitate the otherwise unholy goal of having a non-reflexive
> equality comparison.

I think that the REAL goal is consistency in when user-defined
equality comparison overrides (__eq__, __ne__, __cmp__) are invoked
and when they're not. I find this program surprising:

    Python 2.3.4c1 (#52, May 12 2004, 19:37:24) [MSC v.1200 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> class NoisyCompare(int):
    ...     def __cmp__(self, other):
    ...         print 'comparing'
    ...         return int.__cmp__(self, other)
    ...
    >>> n = NoisyCompare(3)
    >>> n == n # CASE 1
    comparing
    True
    >>> n < n # CASE 2
    comparing
    False
    >>> cmp(n,n) # CASE 3
    0
    >>> [n] == [n] # CASE 4
    comparing
    True

(Actually, I find it surprising for two reasons: because case 3
does NOT print 'comparing', and because case 4 DOES, even though
Armin said it wouldn't.)

Allowing NaNs to have weird behavior is a side effect, probably
less important than allowing USERS to define their own classes that
misbehave.
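
As a concrete illustration of the NaN side effect, here is a small
sketch. It assumes a current Python 3 interpreter rather than the 2.3.4
build shown above, and the variable names are mine. The bare comparison
follows IEEE 754, while the list comparison and the membership test
treat an object as equal to itself without asking it.

```python
nan = float("nan")

# The float's own comparison runs: NaN is not equal to itself.
bare_equal = (nan == nan)

# Both lists hold the very same object, so the list comparison
# short-circuits on identity and reports equality.
in_list_equal = ([nan] == [nan])

# Membership testing uses the same identity-then-equality check.
is_member = (nan in [nan])

print(bare_equal, in_list_equal, is_member)
```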

> My main concern is that Armin is about to do backflips (inserting new
> code into the details of every object's comparison code) just to
> facilitate the otherwise unholy goal of having a non-reflexive
> equality comparison.

Now hang on... why is this in EVERY object's code? I would think that
list, tuple, and dict need it because they often contain recursive
structures. But for most types, why do it at all?

> IMO, it is not unreasonable to insist that equality be reflexive.  That
> is somewhat basic to the whole notion of equality and certainly reflects
> the assumptions made throughout the code base.

No, I don't think it's unreasonable either. But is it necessary? We
should try to have a single, simple rule to explain when user-defined
operator-overrides of comparison functions are invoked. If the rule
is that they are ALWAYS invoked, and that the short-circuiting of
identity comparisons (which is certainly necessary for things like
tuple and list) is simply part of the implementation of that type, then
the rule is VERY simple. If the rule is that the user-defined function
is always invoked on differing objects and MAY be invoked if an
object is compared to itself, then that's a lot less clean. Anything
else I can think of is probably too complex to consider.
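
To make the first rule concrete, here is a minimal sketch, assuming
Python 3 syntax. NoisyEq and sequences_equal are hypothetical names of
mine, not anything in the interpreter: the bare == always reaches the
user-defined __eq__, and the identity short-circuit lives only inside
the container-style comparison.

```python
class NoisyEq:
    """Hypothetical class that counts how often its __eq__ is invoked."""

    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __eq__(self, other):
        self.calls += 1
        if not isinstance(other, NoisyEq):
            return NotImplemented
        return self.value == other.value


def sequences_equal(a, b):
    """Compare two sequences element by element.

    The identity check is part of this container-level helper,
    not of the general equality machinery.
    """
    if len(a) != len(b):
        return False
    return all(x is y or x == y for x, y in zip(a, b))


n = NoisyEq(3)
m = NoisyEq(3)

direct = (n == n)                            # user override is invoked
calls_after_direct = n.calls

same_items = sequences_equal([n], [n])       # identity shortcut, no override
calls_after_same = n.calls

distinct_items = sequences_equal([n], [m])   # differing objects, override runs
calls_after_distinct = n.calls
```

Under this sketch the user only has to remember one thing: == on the
object itself always calls the override, and a container may skip it
for elements that are the same object.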

By the way, Armin introduced this by saying:
> A minor semantic change that creeped in some time ago was an implicit
> assumption that any object x should "reasonably" be expected to compare
> equal to itself.

How long ago was that?

-- Michael Chermside