[Python-Dev] RE: companies data for sorting comparisons
Tim Peters
tim.one@comcast.net
Sun, 28 Jul 2002 13:52:33 -0400
Turns out there was one comparison per merge step I wasn't extracting
maximum value from. Changing the code to suck all I can out of it doesn't
make a measurable difference on sortperf results, except for a tiny
improvement on ~sort on my box, but makes a difference on the Exchange case
of Kevin's data. Here using
values = [(x.get(fieldname), i, x) for i, x in enumerate(data)]
as the list to sort, and times are again in milliseconds:
Sorting on field 'Address' -- 6589 records
via sort: 41.24 41.39 41.41 41.42 86.71
via msort: 42.90 43.01 43.07 43.15 43.75
Sorting on field 'Company' -- 6635 records
via sort: 40.24 40.34 40.42 40.43 42.58
via msort: 30.42 30.45 30.58 30.66 30.66
Sorting on field 'Exchange' -- 6579 records
via sort: 59.64 59.70 59.71 59.72 59.81
via msort: 27.06 27.11 27.19 27.29 27.54
Sorting on field 'NumberOfEmployees' -- 6531 records
via sort: 47.61 47.65 47.73 47.75 47.76
via msort: 48.55 48.57 48.61 48.73 48.92
Sorting on field 'Phone' -- 6589 records
via sort: 48.00 48.03 48.32 48.32 48.39
via msort: 49.60 49.64 49.68 49.79 49.85
Sorting on field 'Profile' -- 6635 records
via sort: 58.63 58.70 58.80 58.85 58.92
via msort: 8.47 8.48 8.51 8.59 8.68
Sorting on field 'Symbol' -- 6635 records
via sort: 39.93 40.13 40.16 40.28 41.37
via msort: 6.20 6.23 6.23 6.43 6.98
Sorting on field 'Web' -- 6632 records
via sort: 46.75 46.77 46.86 46.87 47.05
via msort: 36.44 36.66 36.69 36.69 36.96
'Profile' is slower than the rest for samplesort because the strings it's
comparing are Yahoo URLs with a long common prefix -- the compares just take
longer in that case. I'm not sure why 'Exchange' takes so long for
samplesort (it's a case with lots of duplicate primary keys, but the
distribution is highly skewed, not uniform as in ~sort). In all cases now,
msort is a major-to-killer win, or a small (but real) loss.
I'll upload a new patch and new timsort.txt next. Then I'm taking a week
off! No, I wish it were for fun <wink/sigh>.