[pypy-dev] Sorting structs

Tuom Larsen tuom.larsen at gmail.com
Mon Oct 3 09:53:32 EDT 2016


Thanks for the reply!

I don't think a reverse counter is better because TimSort (used in
PyPy?) is able to recognise the pattern [1] and do better than other
kinds of sorts. And almost for a year [2] now, V8 implements random as
XorShift128+ which very good. I also tried different sizes and the
timings still show similar difference.

[1] http://c2.com/cgi/wiki?TimSort
[2] https://bugs.chromium.org/p/chromium/issues/detail?id=559024

On Mon, Oct 3, 2016 at 10:46 AM, Leonardo Santagada <santagada at gmail.com> wrote:
> Can you start by having a stable benchmark? Instead of calling random have a
> reverse counter to create your list in reverse order. That will at least get
> somewhat similar work on both languages (javascript random is notoriously
> bad at being random).
>
> Maybe the difference in performance is all from the difference in the
> distribution of the numbers. Also try bigger/smaller lists to see how they
> behave. (also warming up both jit engines might be necessary)
>
>
> On Mon, Sep 26, 2016 at 10:57 PM, Tuom Larsen <tuom.larsen at gmail.com> wrote:
>>
>> Dear list!
>>
>> I stumbled upon a problem and I was wondering if someone would by so
>> kind and explain to me what is going on.
>>
>> The problem is sorting 2 000 000 points by `x` coordinate:
>>
>>     from random import random
>>     from time import time
>>
>>     class point(object):
>>         def __init__(self, x, y):
>>             self.x, self.y = x, y
>>
>>     data = [point(random(), random()) for i in range(2000000)]
>>
>>     t = time()
>>     data.sort(key=lambda p:p.x)
>>     print time() - t
>>
>> on my machine in runs 8.74s under PyPy 5.4.1 on MacOS. I then try to
>> sort the points in JavaScript:
>>
>>     var data = [];
>>     for (var i=0; i<2000000; i++) {
>>         data.push({x:Math.random(), y:Math.random()});
>>     }
>>
>>     console.time('sorting');
>>     data.sort(function(a, b) { return a.x - b.x; });
>>     console.timeEnd('sorting');
>>
>> and it runs in 3.09s under V8 5.3.
>>
>> I was just wondering, why is it nearly 3x slower under PyPy than it is
>> under V8? Is there any way I could make the code run faster?
>>
>> Thank you in advance!
>> _______________________________________________
>> pypy-dev mailing list
>> pypy-dev at python.org
>> https://mail.python.org/mailman/listinfo/pypy-dev
>
>
>
>
> --
>
> Leonardo Santagada


More information about the pypy-dev mailing list