[Python-Dev] Comparison speed

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Tue, 15 May 2001 21:45:59 +0200


> more-measuring-less-guessing-ly y'rs  - tim

Producing numbers is easy :-) I've instrumented my version where
string implements richcmp, and special-cases everything I can think
of. Counting is done for running the test suite. With this, I get

Calls to string_richcompare:   2378660
Calls with different types:      33992 (ie. one is not a string)
Calls with identical strings:   120517
Calls where lens decide !EQ:   1775716
----------------------------
Calls richcmp -> oldcomp:       448435
Total calls to oldcomp:        1225643
Calls oldcomp -> memcmp:        860174

So 5% of the calls are with identical strings, for which I can
immediately decide the outcome. 75% can be decided in terms of the
string lengths, which leaves ca. 19% for cases where lexicographical
comparison is needed.

In those cases, the first byte decides in 30%. If I remove the test
for "len decides !EQ", I get

#riches:                       2358322
#riches_ni:                      34108
#idents_decide:                 102050
#lens_decide:                        0
--------------------------------------
rest(computed):                2222164
#comps:                        2949421
#memcmps:                       917776

So still, ca. 30% can be decided by first byte. It still appears that
the total number of calls to memcmp is higher when the length is not
taken into consideration. To verify this claim, I've counted the cases
where the length decides the outcome, but looking at the first byte
also had:

lens_decide:                    1784897
lens_decide_firstbyte_wouldhave:1671148

So in 6% of the cases, checking the length alone gives a decision
which looking at the first byte doesn't; plus it saves a function
call.

To support the thesis that Py_EQ is the common case for strings, I
counted the various operations:

pyEQ:2271593
pyLE:9234
pyGE:0
pyNE:20470
pyLT:22765
pyGT:578

Now, that might be flawed since comparing strings for equal is
extremely frequent in the testsuite. To give more credibility to the
data, I also ran setup.py with my instrumented ./python:

riches:21640
riches_ni:76
riches_ni1:0
idents:2885
idents_decide:2885
lens_decide:9472
lens_decide_firstbyte_wouldhave:6223
comps:26360
memcmps:19224
pyEQ:20093
pyLE:46
pyGE:1
pyNE:548
pyLT:876
pyGT:0                                                                          
That shows that optimizing for Py_NE is not worth it. With these data,
I'll upload a patch to SF.

Regards,
Martin