[Python-Dev] Comparison speed
Martin v. Loewis
martin@loewis.home.cs.tu-berlin.de
Thu, 17 May 2001 08:41:15 +0200
> 1. String objects are also equal despite being different objects,
> if their ob_sinterned pointers are equal and non-NULL. So if
> you're looking for every trick in & out of the book, that's
> another one.
That does not help. In the entire test suite, there are 0 instances
where strings are compared which are not identical, but have equal
ob_sinterned pointers.
> > So 5% of the calls are with identical strings, for which I can
> > immediately decide the outcome.
>
> But also at the cost of doing a fruitless compare and branch in 95%
> of calls.
Whether there's a fruitless branch depends on your compiler. With gcc
3, you can write
if (__builtin_expect(a == b, 0)) {
and then the body of the if block will be moved out of the way of
linear control flow.
> Any idea where those 800,000 virgin calls to oldcomp are coming
> from? That's a lot.
As far as I could trace it, most of them come from lookdict_string (at
various locations inside this function).
> > #comps: 2949421
> > #memcmps: 917776
> >
> > So still, ca. 30% can be decided by first byte.
>
> Sorry, I couldn't follow this part, except noting that 917776 is about 30% of
> 2949421, in which case I would have expected you to say that 70% can be
> decided by first byte.
Oops, you are right.
> It's clearer that this is going to hurt sorting (& bisect etc), by
> adding yet another layer of function call to get Py_LT resolved (as
> for dict compares too, the string richcmp can't do anything to speed
> up Py_LT that string oldcmp can't do just as efficiently -- indeed,
> that's the great advantage oldcmp's "compare first character" test
> had: that *can* decide Py_LT in one byte much of the time (but
> length comparison cannot)).
So to support sorting better, I should special-case Py_LT in
string_richcompare also, to avoid the function call ?-)
> Note too earlier mail about how adding a richcmp slot to strings will
> suddenly slow cmp(string1, string2) (which is the usual way to program a
> search tree, because cmp() *used* to call a string comparison routine only
> once; but after adding a richcmp slot, each cmp(string1, string2) will call
> the richcmp slot from 1 thru 3 times (data-dependent)).
Yes, that is a serious problem. Fortunately, very few calls in my
programs go to string_compare through cmp() now. But then, your
programs are different, of course...
Regards,
Martin