[Python-Dev] Comparison speed

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Thu, 17 May 2001 08:12:18 +0200


> OK, from what I understand, that makes no sense.  Does it to you?

After reviewing everything again, I think I now do: In the richcomp
case, I have

			res = (*f1)(v, w, op);
			if (res != Py_NotImplemented)
				return res;

f1 is string_richcompare, so I get 2 function calls inside do_richcmp:
one to string_richcompare, the other one to string_compare, as my
optimizations are not triggered in your example.

If I set tp_richcompare of strings to 0, I get past this code, and do

		c = (*f)(v, w);
		if (PyErr_Occurred())
			return NULL;
		return convert_3way_to_object(op, c);

Here, I get 3 function calls: f is string_compare, then
PyErr_Occurred, finally convert_3way_to_object, which converts
{-1,0,1} x Op -> {Py_True, Py_False}.
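
To make that mapping concrete, here is a minimal standalone sketch. It
is not the CPython source: the op_t enum and the name
three_way_to_bool are made up for illustration, and a plain int (1 or
0) stands in for Py_True / Py_False.

```c
#include <assert.h>

/* Hypothetical stand-ins for Py_LT, Py_LE, Py_EQ, Py_NE, Py_GT, Py_GE. */
typedef enum { OP_LT, OP_LE, OP_EQ, OP_NE, OP_GT, OP_GE } op_t;

/* Map a 3-way result c (negative, zero, positive) and a comparison
   operator to a truth value: 1 for "true", 0 for "false". */
static int
three_way_to_bool(op_t op, int c)
{
	switch (op) {
	case OP_LT: return c <  0;
	case OP_LE: return c <= 0;
	case OP_EQ: return c == 0;
	case OP_NE: return c != 0;
	case OP_GT: return c >  0;
	case OP_GE: return c >= 0;
	}
	/* Not reached for valid operators. */
	assert(!"unknown comparison op");
	return 0;
}
```

The function is small enough that inlining it at the call site removes
one call per comparison, which is the effect measured below.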

Indeed, when I inline convert_3way_to_object, I get the same speed in
both cases (with the remaining differences attributed to measurement
noise and gcc allocating registers differently in the two functions).

I'd still be in favour of giving strings a richcompare, since it
allows us to optimize what I think is the single most frequent case:
Py_EQ on strings. With a control flow like

		if (a->ob_size != b->ob_size)
			goto False;

		if (a->ob_size == 0)
			goto True;

		if (a->ob_sval[0] != b->ob_sval[0])
			goto False;

		if (memcmp(a->ob_sval, b->ob_sval, a->ob_size))
			goto False;
		else
			goto True;

we can reduce the number of function calls.
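
As a rough, self-contained sketch of that control flow (not a patch
against stringobject.c): the str_obj struct and the str_eq name are
hypothetical, modelling only the ob_size and ob_sval fields used
above, and the result is a plain int instead of Py_True / Py_False.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object. */
typedef struct {
	size_t ob_size;       /* length in bytes */
	const char *ob_sval;  /* character data */
} str_obj;

/* Equality only: returns 1 if a and b hold the same bytes, else 0. */
static int
str_eq(const str_obj *a, const str_obj *b)
{
	assert(a != NULL && b != NULL);

	/* Different lengths can never be equal. */
	if (a->ob_size != b->ob_size)
		return 0;

	/* Two empty strings are equal; nothing to compare. */
	if (a->ob_size == 0)
		return 1;

	/* Cheap first-byte check before paying for the memcmp call. */
	if (a->ob_sval[0] != b->ob_sval[0])
		return 0;

	return memcmp(a->ob_sval, b->ob_sval, a->ob_size) == 0;
}
```

In this sketch, three of the four exits return without calling any
other function; only strings of equal, non-zero length that share a
first byte reach memcmp.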

Regards,
Martin