[Python-Dev] Tunning binary insertion sort algorithm in Timsort.
nha pham
phqnha at gmail.com
Tue Mar 10 06:05:39 CET 2015
Thank you very much. I am very happy that I got a reply from Tim Peter.
You are correct, my mistake.
The python code should be:
for i in range(low+1,high): //because we already add
data[low]
x = data[i]
if x >= data[i-1]:
After I fix it, here is the result:
random array 10^6:
Old binsort: 1.3322
New binsort: 1.0015
ratio: 0.33
You are right, it is not ten times faster anymore. I will update other
results soon.
I do check the result of two sorting methods many times to make sure they
are the same. It is just because I do not know how to put assert into the
timeit.Timer class. I am pretty sure about this.
I will try to write the proof more clearly, sorry for inconvenience.
Thank you very much.
Nha Pham.
On Mon, Mar 9, 2015 at 9:27 PM, Tim Peters <tim.peters at gmail.com> wrote:
> [nha pham <phqnha at gmail.com>]
> > Statement_1: With an array of size N or less than N, we need at most
> log2(N)
> > comparisons to find a value
> > (or a position, incase the search miss), using the binary search
> algorithm.
> >
> > proof: This statement is trivia, and I believe, someone outthere already
> > proved it.
>
> Sorry for the quick message here. It's just a simple point where it
> will pay not to get off on a wrong foot ;-)
>
> Correct: for an array of size N, binary search can require as many as
> ceiling(log2(N+1)) comparisons.
>
> That's because there are N+1 possible results for an array of size N.
> For example, for an array of size 3, [A, B, C], "the answer" may be
> "before A", "between A and B", "between B and C", or "after C". 3
> elements, 3+1 = 4 possible results. log2(3) comparisons are not
> enough to distinguish among 4 results.
>
> Make it trivial, an array of length 1. Then 1 comparison is obviously
> necessary and sufficient in all cases. And, indeed,
> ceiling(log2(1+1)) = 1. log2(1) equals 0, too small.
>
> For the rest, I haven't been able to understand your description or
> your pseudo-code. I'll try harder. Some things clearly aren't doing
> what you _intend_ them to do. For example, in your Python code, each
> time through the outer loop you're apparently trying to sort the next
> CHUNK elements, but you end up appending CHUNK+1 values to data2 (or
> data3).
>
> Or in this part:
>
> for i in range(low,high):
> x = data[i]
> if x >= data[i-1]:
>
> the first time that loop is executed low == 0, and so i == 0 on the
> first iteration, and so the conditional is
>
> if x >= data[0-1]
>
> That's referencing data[-1], which is the very last element in data -
> which has nothing to do with the CHUNK you're trying to sort at the
> time.
>
> So there are a number of errors here, which makes it that much harder
> to sort out (pun intended <wink>) what you're trying to do. It would
> help you to add some asserts to verify your code is doing what you
> _hope_ it's doing. For example, add
>
> assert data2[low: high] == sorted(data[low: high])
> assert len(data2) == high
>
> to the end of your `sample` loop, and similarly for data3 in your
> `new` loop. Until those asserts all pass, you're not testing code
> that's actually sorting correctly. Repair the errors and you almost
> certainly won't find `new` running over 10 times faster than `sample`
> anymore. I don't know what you _will_ discover, though. If the code
> doesn't have to sort correctly, there are much simpler ways to make it
> run _very_ much faster ;-)
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.python.org/pipermail/python-dev/attachments/20150309/23d7bb73/attachment.html>
More information about the Python-Dev
mailing list