Tunning binary insertion sort algorithm in Timsort.
We can optimize the TimSort algorithm by optimizing its binary insertion sort. The current version of binary insertion sort use this idea: Use binary search to find a final position in sorted list for a new element X. Then insert X to that location. I suggest another idea: Use binary search to find a final postion in sorted list for a new element X. Before insert X to that location, compare X with its next element. For the next element, we already know if it is lower or bigger than X, so we can reduce the search area to the left side or on the right side of X in the sorted list. I have applied my idea on java.util. ComparableTimSort.sort() and testing. The execute time is reduced by 2%6% with array of random integer. Here is detail about algorithm and testing: https://github.com/nhapq/Optimize_binary_insertion_sort Sincerely. phqnha
I suspect that you will find the Python community extremely conservative
about any changes to its sorting algorithm, given that it took thirteen
years and some really impressive automated verification software to find
this bug:
http://envisageproject.eu/provingandroidjavaandpythonsortingalgorithm...
On Sun, Mar 8, 2015 at 5:17 PM, nha pham
We can optimize the TimSort algorithm by optimizing its binary insertion sort.
The current version of binary insertion sort use this idea:
Use binary search to find a final position in sorted list for a new element X. Then insert X to that location.
I suggest another idea:
Use binary search to find a final postion in sorted list for a new element X. Before insert X to that location, compare X with its next element.
For the next element, we already know if it is lower or bigger than X, so we can reduce the search area to the left side or on the right side of X in the sorted list.
I have applied my idea on java.util. ComparableTimSort.sort() and testing. The execute time is reduced by 2%6% with array of random integer.
Here is detail about algorithm and testing: https://github.com/nhapq/Optimize_binary_insertion_sort
Sincerely.
phqnha
_______________________________________________ PythonDev mailing list PythonDev@python.org https://mail.python.org/mailman/listinfo/pythondev Unsubscribe: https://mail.python.org/mailman/options/pythondev/rmsr%40lab.net
On Sun, Mar 08, 2015 at 10:57:30PM 0700, Ryan SmithRoberts wrote:
I suspect that you will find the Python community extremely conservative about any changes to its sorting algorithm, given that it took thirteen years and some really impressive automated verification software to find this bug:
On the other hand, the only person who really needs to be convinced is Tim Peters. It's really not up to the Python community. The bug tracker is the right place for discussing this.  Steve
On Mon, Mar 9, 2015 at 10:53 AM, Steven D'Aprano
On Sun, Mar 08, 2015 at 10:57:30PM 0700, Ryan SmithRoberts wrote:
I suspect that you will find the Python community extremely conservative about any changes to its sorting algorithm, given that it took thirteen years and some really impressive automated verification software to find this bug:
On the other hand, the only person who really needs to be convinced is Tim Peters. It's really not up to the Python community.
Also, there's no sense discouraging people who have ideas to contribute. So what if nha pham's contribution isn't accepted? You never know when the next Tim Peters, Georg Brandl, Mark Dickinson, or Brett Cannon will turn up. (That list is practically endless. Don't feel slighted if I failed to mention you.) With seven billion people out there, a few of them are bound to be pretty smart. Skip
On 3/8/2015 8:17 PM, nha pham wrote:
We can optimize the TimSort algorithm by optimizing its binary insertion sort.
The current version of binary insertion sort use this idea:
Use binary search to find a final position in sorted list for a new element X. Then insert X to that location.
I suggest another idea:
Use binary search to find a final postion in sorted list for a new element X. Before insert X to that location, compare X with its next element.
For the next element, we already know if it is lower or bigger than X, so we can reduce the search area to the left side or on the right side of X in the sorted list.
I have applied my idea on java.util. ComparableTimSort.sort() and testing. The execute time is reduced by 2%6% with array of random integer.
Here is detail about algorithm and testing: https://github.com/nhapq/Optimize_binary_insertion_sort
Apply the idea to timsort in python, and try the different cases discussed in Tim's documentation. If it still looks good, open an issue with the patch and put tim.peters as nosy.  Terry Jan Reedy
participants (5)

nha pham

Ryan SmithRoberts

Skip Montanaro

Steven D'Aprano

Terry Reedy