[Python-Dev] Comparison speed

Tim Peters tim.one@home.com
Sun, 20 May 2001 05:54:08 -0400


[Tim]
>> 1. String objects are also equal despite being different objects,
>>    if their ob_sinterned pointers are equal and non-NULL.  So if
>>    you're looking for every trick in & out of the book, that's
>>    another one.

[Martin v. Loewis]
> That does not help. In the entire test suite, there are 0 instances
> where strings are compared which are not identical, but have equal
> ob_sinterned pointers.

Good to know.  Had you tried this a few weeks ago, there would have been
thousands (it so happened that one-character strings weren't being interned
*effectively*, and there were lots of 1-character cases then where #1
applied; that's been fixed; good to know more aren't popping up).

> ...
> Whether there's a fruitless branch depends on your compiler.

A branch instruction is a branch instruction; I didn't distinguish between
taken and non-taken branches, as there's no uniformity in codegen across
platforms.

> With gcc 3, you can write
>
> 	if (__builtin_expect(a == b, 0)) {
>
> and then the body of the if block will be moved out of the way of
> linear control flow.

I don't think we'll be littering Python with compiler-specific hacks.  It's
good to get the less common case out-of-line, but it's not a pure win:  while
it reduces the penalty when the test doesn't pay, it also reduces the benefit
when it does pay (by the wildly architecture-dependent cost of taking a
mispredicted out-of-line branch, and the wildly compiler-dependent costs of
how seriously they take their own decisions or user hints to out-of-line a
block (e.g., the compiler may refetch everything from memory again at the
target if it thinks it's truly rare)).

>> Any idea where those 800,000 virgin calls to oldcomp are coming
>> from?  That's a lot.

> As far as I could trace it, most of them come from lookdict_string (at
> various locations inside this function).

Ah!  Of course.  string_compare is hardwired into lookdict_string.  This case
may be important enough to merit a distinct _PyString_Equal function, with
just the stuff lookdict_string needs (e.g., there's never a gain in testing
for pointer equality when called from lookdict_string because the dict code
already checked that; but there may be a gain for that test in an all-purpose
string_richcompare).

> ...
> So to support sorting better, I should special-case Py_LT in
> string_richcompare also, to avoid the function call ?-)

Of course.  string_richcompare has to do a memcmp to resolve Py_EQ and Py_NE
anway, and that's most of the work for resolving all 6 possibilities.  Get
rid of string_compare entirely!

[on cmp sloth]
> Yes, that is a serious problem. Fortunately, very few calls in my
> programs go to string_compare through cmp() now. But then, your
> programs are different, of course...

There are search-tree modules I have but didn't write that do this; I don't
care enough about them to frustrate Guido's grand vision <wink>>

It may be more important for sequences other than 8-bit strings, as each call
to a comparison function for a pair of non-string sequences is very expensive
(involving more layers of calls for each element comparison).