More damage to intuition (was RE: [Python-Dev] Comparison of recursive objects)

Guido van Rossum guido@digicool.com
Sun, 21 Jan 2001 11:51:45 -0500


[Tim, complaining that numerical types are no longer lumped together
in default comparisons:]
> I've often used list.sort() on a heterogeneous list simply to bring the
> elements of the same type next to each other.  But as "try number 5" shows,
> I can no longer rely on even getting all the lists together.  Indeed,
> heterogenous list.sort() has become a very bad (biased and slow)
> implementation of random.shuffle() <wink>.
> 
> Under 2.0, the program never prints "oops", because the only violations of
> transitivity in 2.0's ordering of builtin types were bugs in the
> implementation (none of which show up in this simple test case); 2.0's
> .sort() *always* produces
> 
> [0L, 0L, 0L, 0L, 0L, 1, 1, 1, 1, 1, [1], [1], [1], [1], [1]]
> 
> The base trick in 2.0 was sound:  when falling back to the "compare by name
> of the type" last resort, treat all numeric types as if they had the same
> name.
> 
> While Python can't enforce that any user-defined __cmp__ is consistent, I
> think it should continue to set a good example in the way it implements its
> own comparisons.

I think I can put this behavior back.  (I believe that before I
reorganized the comparison code, it seemed really tricky to do this,
but after refactoring the code, it's quite easy to do.)

My only concern is that under the old schele, two different numeric
extension types that somehow can't be compared will end up being
*equal*.  To fix this, I propose that if the names compare equal, as a
last resort we compare the type pointers -- this should be consistent
too.

Here's a patch that stops your test program from reporting failures:

*** object.c	2001/01/21 16:25:18	2.112
--- object.c	2001/01/21 16:50:16
***************
*** 522,527 ****
--- 522,528 ----
  default_3way_compare(PyObject *v, PyObject *w)
  {
  	int c;
+ 	char *vname, *wname;
  
  	if (v->ob_type == w->ob_type) {
  		/* When comparing these pointers, they must be cast to
***************
*** 550,557 ****
  	}
  
  	/* different type: compare type names */
! 	c = strcmp(v->ob_type->tp_name, w->ob_type->tp_name);
! 	return (c < 0) ? -1 : (c > 0) ? 1 : 0;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)
--- 551,571 ----
  	}
  
  	/* different type: compare type names */
! 	if (v->ob_type->tp_as_number)
! 		vname = "";
! 	else
! 		vname = v->ob_type->tp_name;
! 	if (w->ob_type->tp_as_number)
! 		wname = "";
! 	else
! 		wname = w->ob_type->tp_name;
! 	c = strcmp(vname, wname);
! 	if (c < 0)
! 		return -1;
! 	if (c > 0)
! 		return 1;
! 	/* Same type name, or (more likely) incomparable numeric types */
! 	return (v->ob_type < w->ob_type) ? -1 : 1;
  }
  
  #define CHECK_TYPES(o) PyType_HasFeature((o)->ob_type, Py_TPFLAGS_CHECKTYPES)

Let me know if you agree with this.

--Guido van Rossum (home page: http://www.python.org/~guido/)