[issue9915] speeding up sorting with a key
report at bugs.python.org
Tue Nov 23 00:50:39 CET 2010
Daniel Stutzbach <stutzbach at google.com> added the comment:
I'm starting to get settled in here at Google and finding time to follow up on everything that got put on hold while moving.
Based on the feedback everyone gave me (thanks!), I greatly revised my script for comparing the speed of sort(). It's now compares speeds more quickly and more accurately. I discovered a few things along the way:
1) Don't try to run timings on a virtual server with crummy clock resolution.
2) Make sure that CPU frequency scaling is turned off. It increases the standard deviation of timings by an order of magnitude or two.
On Linux systems, the new version of my script will check to see if CPU frequency scaling is turned on and provide suggestions on how to turn it off. On other operating systems, you are on your own. ;)
filfre:~/py3k-patched$ ./python Tools/sort_speed/sort_speed.py -r 3 --maxn 100000 ../py3k-pristine . sort_random
Using my new-and-improved timing script and running on real hardware with CPU frequency scaling turned off, I was able to detect a slight slowdown on sorting random integers. Here's a summary for sorting random integers:
Original patch (increasing execution time is bad):
n in [700 .. 100,000]: 1% to 2% increase in execution time
n in [70 to 700]: 2% to 3% increase
n in [10 to 70]: 3% to 4% increase
n in [5 to 7]: 7% to 9% decrease
n in [0 to 3]: 18+% decrease
I went back to the code and managed to squeeze out a bit more performance. Here's a summary of the new patch:
New patch (increasing execution time is bad):
n in [30,000 .. 100,000]: < 1% increase in execution time
n in [35 .. 30,000]: 1% to 2% increase
n in [20 .. 35]: about the same
n in [10 .. 20]: 1 to 2% decrease
n in [4 .. 7]: 15+% decrease
n in [0 .. 3]: 22+% decrease
All of the above results are for sorting without a key. For sorting *with* a key, there's a big performance boost across the board (15% to 45% decrease in execution time, when the key function is simply "int").
Overall, the patch shrinks Objects/listobject.c by 10 lines.
- Dramatically cuts execution time of sorting with a key (15% to 45%)
- Significantly cuts execution time of sorting small lists without a key (15+%)
- Very slightly increases execution of sorting large lists without a key (1% to 2%; less than 1% for very large lists)
Added file: http://bugs.python.org/file19779/sort-speed-test.patch
Python tracker <report at bugs.python.org>
More information about the Python-bugs-list