[Python-Dev] Comparison speed
Tim Peters
tim.one@home.com
Sun, 20 May 2001 05:31:25 -0400
[M0artin v. Loewis]
> ...
> If I set tp_richcompare of strings to 0, I get past this code, and do
>
> c = (*f)(v, w);
> if (PyErr_Occurred())
Note that the usual way to write this is
if (c < 0 && PyErr_Occurred())
More work for my artificial "ab" < "cd" case but a net win in real life (when
c >= 0, it's an internal error if PyErr_Occurred() were to return true; alas,
when c < 0 there's no way in the cmp protocol to use c's value alone to
distinguish between "less than" and "error").
> return NULL;
> return convert_3way_to_object(op, c);
>
> Here, I get 3 function calls: f is string_compare, then
> PyErr_Occurred, finally convert_3way_to_object, which converts
> {-1,0,1} x Op -> {Py_True, Py_False}.
Unfortunately, it's 4 calls, as PyErr_Occurred() isn't a leaf.
> Indeed, when I inline convert_3way_to_object, I get the same speed in
> both cases (with the remaining differences attributed to measurement
> and gcc doing register usage differently in both functions).
OK, understood, and thanks for following up!
> I'd still be in favour of giving strings a richcompare, since it
> allows to optimize what I think is the single most frequent case:
> Py_EQ on strings.
In the absence of significant sorting, I agreed Py_EQ is most frequent.
> With a control flow like
>
> if (a->ob_size != b->ob_size)
> goto False;
>
> if (a->ob_size == 0)
> goto True;
>
> if (a->ob_sval[0] != b->ob_sval[0])
> goto False;
>
> if(memcmp(a->ob_sval, b->ob_sval, a->ob_size))
> goto False;
> else
> goto True;
>
> we can reduce the number of function calls
Suggest collapsing the third into the first:
if (a->ob_size != b->ob_size
|| a->ob_sval[0] != b->ob_sval[0])
goto False;
There's no danger of over-indexing when ob_size==0, because it doesn't
include the trailing null byte Python always sticks at the end of string
objects; and the first-byte check is much more likely to pay off than the
zero-length check (comparison to a null string? gotta be rare as clear
conclusions <wink>), and better to test for the more common case first.