[Python-Dev] Comparison speed

Martin v. Loewis martin@loewis.home.cs.tu-berlin.de
Tue, 15 May 2001 22:59:56 +0200


> Of course.  But except for instance objects, answering "does the type
> implement tp_richcompare?" is one lousy pointer check

Almost - you also have to check the type flag.

> and the answer will usually be-- provided we don't start stuffing
> code into *every* object's tp_richcompare slot! --"no, so I can go
> to tp_compare immediately".  Coercions and richcmps are the oddball
> cases today.

I'd like to add another data point, answering the question what types
are most frequently compared. The first set of data is for running the
Python testsuite.

riches      3040952  # Calls to PyType_RichCompare
eqs         2828345  # Calls where the types are equal

String      2323122
Float        141507
Int          125187
Type          99477
Tuple         84503
Long          30325
Unicode       10782
Instance       9335
List           2997
None            383
Class           318
Complex         219
Dict             57
Array            49
WeakRef          34
Function         11
File             11
SRE_Pattern      10
CFunction         9
Lock              8
Module            1

So strings cover 82% of all the compare calls of equally-typed
objects, followed by floats with 5%. Those calls together cover 93% of
the richcompare calls.

Since this might give a blurred view of what is actually used in
applications, I ran the PyXML testsuite with that python binary
also. Leaving out types that are not used, I get

riches        88465
eqs           59279

String        48097
Int            5681
Type           3170
Tuple           760
List            492
Float           332
Instance        269
Unicode         243
None            225
SRE_Pattern       4
Long              3
Complex           3

The first observation here is that "only" 67% of the calls are with
equally-typed objects. Of those, 80% are with strings, 9% with
integers.

The last example is idle, where I just did an "import httplib", for
fun.

riches        50923
eqs           49882

String        31198
Tuple          8312
Type           7978
Int            1456
None            600
SRE_Pattern     210
List            122
Instance          4
Float             1
Instance method   1

Roughly the same picture: 97% calls with equally-typed objects, of
those 62% strings, 3% integers. Notice the 15% for tuples and types,
each.

So to speed-up the common case clearly means to speed-up string
comparisons. If I'd need to optimize anything else afterwards, I'd
look into type objects - most likely, they are compared for EQ, which
can be done nicely and directly in a tp_richcompare also.

Those two optimizations together would give a richcompare to 95% of
the objects in the IDLE case.

Regards,
Martin