[Python-Dev] On __cmp__ specifications and implementation (to mimic it on Jython)

Leo Soto M. leo.soto at gmail.com
Tue Aug 5 22:43:25 CEST 2008


Hi Python developers!

First, a quick introduction: My name is Leonardo Soto, I'm a GSoC
2008 student working under the PSF and the Jython project, and I have
been a Jython committer for about a month.

As part of my GSoC work, I've improved Jython's __cmp__ a bit. That
led me to read the CPython sources (v2.5.2) for the implementation of
cmp(a, b).

Aside from what I've already fixed on Jython, I've found four
differences between Jython and CPython cmp(a, b):

(1). CPython's cmp() uses a.__cmp__(b) before rich-cmp methods if the
types of the two arguments are the same.

(2). When using rich-cmp methods, CPython first checks whether the
second type is a subtype of the first and, if so, starts with the
second argument's method instead of the first's. [i.e., the process
starts with b.__gt__(a) instead of a.__lt__(b) if type(b) != type(a)
and issubclass(type(b), type(a))]

(3). Similar to the above, if the second argument to cmp(a, b) is an
old-style instance and the first isn't, b.__cmp__(a) is used instead
of a.__cmp__(b).

(4). CPython tries coercion as part of the three-way comparison, but
it doesn't do it exactly as advertised at
<http://docs.python.org/ref/coercion-rules.html>: it does coercion
*after* trying a.__cmp__(b), while the docs say *before*. And it only
uses the coerced values if both values have the same tp_compare
function. I fail to see why, and this is not mentioned in the docs
either.

[The examples that show these behaviors are at the end of this mail.]

Now, my questions:

- Are (1), (2) and (3) intentional features?  Frankly, Jython being
different on (1) and (3) isn't a big deal, but I'd like to match
CPython as much as possible (and reasonable). Point (2) is more
interesting, because it seems to follow what
<http://docs.python.org/ref/coercion-rules.html> says for binary ops:

"Exception to the previous item: if the left operand is an instance of
a built-in type or a new-style class, and the right operand is an
instance of a proper subclass of that type or class and overrides the
base's __rop__() method, the right operand's __rop__() method is tried
before the left operand's __op__() method."

But I haven't found any documentation saying that it applies to
rich-cmp methods too. Interestingly, it *doesn't* apply to __cmp__ on
CPython 2.5.2.

BTW, PyPy also fails to follow CPython on points (2) and (3), so it is
not just me failing to find the appropriate docs ;-)

- What's the idea behind the current implementation of __cmp__ and
coercion, as described in (4)? I have to implement this feature on
Jython, but checking that two Java instances have the same __cmp__
method is not as easy as in C. And more importantly: it could be
wrong, depending on what the idea behind the current CPython
implementation is.

Finally, here are the examples of the behaviors outlined at the start
of this mail.

(1):

>>> class A(object):
...   def __eq__(self, other): return True
...   def __cmp__(self, other): return 1
...
>>> cmp(A(), A())
1
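
The precedence described in point (1) can be modelled roughly as
follows. This is only a sketch, not CPython's code: the helper names
cmp_sketch and rich_to_three_way are made up for illustration, and the
fallback order (==, then <, then >) is an assumption.

```python
def rich_to_three_way(a, b):
    # Fallback: derive a three-way result from ==, < and > (assumed order).
    if a == b:
        return 0
    if a < b:
        return -1
    if a > b:
        return 1
    return None  # no ordering could be established


def cmp_sketch(a, b):
    # Hypothetical model, not CPython's code: when both arguments have
    # exactly the same type and that type defines __cmp__, use it first.
    if type(a) is type(b):
        three_way = getattr(type(a), "__cmp__", None)
        if three_way is not None:
            return three_way(a, b)
    return rich_to_three_way(a, b)


class A(object):
    def __eq__(self, other): return True
    def __cmp__(self, other): return 1


result_cmp_first = cmp_sketch(A(), A())         # __cmp__ wins: 1
result_rich_only = rich_to_three_way(A(), A())  # __eq__ wins: 0
```

With __cmp__ consulted first the result is 1, matching the session
above; a rich-comparison-first order would have returned 0 because
__eq__ answers True.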

(2):

>>> class A(object):
...  def __lt__(self, other): return True
...  def __gt__(self, other): return False
...  def __eq__(self, other): return False
...
>>> class B(A): pass
...
>>> cmp(A(), B())
1
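
The same effect can be observed with the plain comparison operators,
without going through cmp(). The snippet below is a small sketch that
reuses the classes from the session above; the variable names are only
illustrative.

```python
class A(object):
    def __lt__(self, other): return True
    def __gt__(self, other): return False
    def __eq__(self, other): return False


class B(A):
    pass


a, b = A(), B()

# Same type on both sides: the left operand's __lt__ is used.
same_type_lt = A() < A()

# type(b) is a proper subclass of type(a), so the reflected method of
# the right operand is tried first: b.__gt__(a) for "<", b.__lt__(a)
# for ">".
mixed_lt = a < b
mixed_gt = a > b
mixed_eq = a == b
```

Here same_type_lt is True, but mixed_lt is False and mixed_gt is True.
Assuming cmp() derives its result by trying ==, then <, then >, that
would explain the 1 printed above.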

(3):

>>> class A(object):
...    def __cmp__(self, other): return 1
...
>>> class B:
...    def __cmp__(self, other): return 1
...
>>> cmp(A(), B())
-1

(4):

>>> class A(object):
...     def __coerce__(self, other): return 0, 0
...     def __cmp__(self, other): return -1
...
>>> cmp(A(), A())
-1

Regards,
-- 
Leo Soto M.
http://blog.leosoto.com

