[Python-Dev] Comparison speed

Tim Peters tim.one@home.com
Sun, 20 May 2001 17:37:09 -0400


[Tim, muses about a Py_CMP value for rich comparisons, and talks
 mostly about dict comps]

> ...
> I'm not sure I see the saving.  There's no real saving in time,
> because you still have to make separate calls for EQ and CMP, right?

Right so far as it goes.  A "fast path" (which currently doesn't exist but is
clearly worth adding, based on both my and Martin's timings) for doing *all*
kinds of same-type comparisons would only have to look for a richcompare
slot, though, not one kind of slot in some cases and another in others.
Uniformity is contagious <wink>.

> There might be a saving in code, but you could solve that internally
> in dictobject.c by restructuring the code somewhat so that
> dict_compare shared more with dict_richcompare, right?

Right, there would be no reduction in total code, and the dict routines
already share as much as possible.  In effect, the body of dict_compare would
replace the last

		res = Py_NotImplemented;

line in the (currently tiny) dict_richcompare guarded by the appropriate
tests.

> It's mostly an API streamlining.

Bingo, and the possibility of retiring the tp_compare slot in P3K.

> The other difference between tp_compare and tp_richcompare is that
> the latter returns an object which makes testing for errors unambiguous.

Also cool.

> But (for several releases) we would still have to support tp_compare
> for b/w compatibility with old 3r party extensions.

Sure, although the way the CVS branch code is going it could be that 2.2 is
the long-awaited utterly incompatible P3K anyway <wink>.

>> The list and tuple richcmps are also doing almost all the work needed
>> to compute a 3-way cmp outcome.

> Ditto.

Oh no!  Those aren't like dict compares.  A rich compare for sequence types
(whether strings or lists) *has* to contain almost all the code necessary to
implement cmp(), because just resolving Py_EQ in all cases has to find "the
first" element (if any) that differs.  Once that's known, you're at most one
measly element compare away from producing the right cmp() outcome.  This
isn't true of dict compares:  the algorithm for resolving dict Py_EQ/Py_NE
when the dict sizes are the same doesn't do anything to help resolve general
cmp().  Yes, a tp_compare slot could be re-added to lists and tuples, and
implemented via refactoring their current tp_richcompare code into a common
internal routine, but then we've just added another layer of function calls
for all cases.  I've timed C function calls, and it turns out they aren't
actually free <wink>.