[Python-Dev] Tunning binary insertion sort algorithm in Timsort.

Tue Mar 10 06:05:39 CET 2015

Thank you very much. I am very happy that I got a reply from Tim Peter.

You are correct, my mistake.

The python code should be:
    for i in range(low+1,high):              //because we already add
data[low]
        x = data[i]
        if x >= data[i-1]:

After I fix it, here is the result:

random array 10^6:
Old binsort:  1.3322
New binsort: 1.0015
ratio: 0.33

You are right, it is not ten times faster anymore. I will update other
results soon.

I do check the result of two sorting methods many times to make sure they
are the same. It is just because I do not know how to put assert into the
timeit.Timer class. I am pretty sure about this.

I will try to write the proof more clearly, sorry for inconvenience.

Thank you very much.
Nha Pham.

On Mon, Mar 9, 2015 at 9:27 PM, Tim Peters <tim.peters at gmail.com> wrote:

> [nha pham <phqnha at gmail.com>]
> > Statement_1: With an array of size N or less than N, we need at most
> log2(N)
> > comparisons to find a value
> > (or a position, incase the search miss), using the binary search
> algorithm.
> >
> > proof: This statement is trivia, and I believe, someone outthere already
> > proved it.
>
> Sorry for the quick message here.  It's just a simple point where it
> will pay not to get off on a wrong foot ;-)
>
> Correct:  for an array of size N, binary search can require as many as
> ceiling(log2(N+1)) comparisons.
>
> That's because there are N+1 possible results for an array of size N.
> For example, for an array of size 3, [A, B, C], "the answer" may be
> "before A", "between A and B", "between B and C", or "after C".  3
> elements, 3+1 = 4 possible results.  log2(3) comparisons are not
> enough to distinguish among 4 results.
>
> Make it trivial, an array of length 1.  Then 1 comparison is obviously
> necessary and sufficient in all cases.  And, indeed,
> ceiling(log2(1+1)) = 1.  log2(1) equals 0, too small.
>
> For the rest, I haven't been able to understand your description or
> your pseudo-code.  I'll try harder.  Some things clearly aren't doing
> what you _intend_ them to do.  For example, in your Python code, each
> time through the outer loop you're apparently trying to sort the next
> CHUNK elements, but you end up appending CHUNK+1 values to data2 (or
> data3).
>
> Or in this part:
>
>     for i in range(low,high):
>         x = data[i]
>         if x >= data[i-1]:
>
> the first time that loop is executed low == 0, and so i == 0 on the
> first iteration, and so the conditional is
>
>        if x >= data[0-1]
>
> That's referencing data[-1], which is the very last element in data -
> which has nothing to do with the CHUNK you're trying to sort at the
> time.
>
> So there are a number of errors here, which makes it that much harder
> to sort out (pun intended <wink>) what you're trying to do.  It would
> help you to add some asserts to verify your code is doing what you
> _hope_ it's doing.  For example, add
>
>     assert data2[low: high] == sorted(data[low: high])
>     assert len(data2) == high
>
> to the end of your `sample` loop, and similarly for data3 in your
> `new` loop.  Until those asserts all pass, you're not testing code
> that's actually sorting correctly.  Repair the errors and you almost
> certainly won't find `new` running over 10 times faster than `sample`
> anymore.  I don't know what you _will_ discover, though.  If the code
> doesn't have to sort correctly, there are much simpler ways to make it
> run _very_ much faster ;-)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20150309/23d7bb73/attachment.html>