[pypy-dev] Sorting structs

Carl Friedrich Bolz cfbolz at gmx.de
Mon Oct 3 13:28:09 EDT 2016


Hi Tuom,

The problem is the key=... argument to sort, which isn't optimized
well. This kind of code is much faster:

    wrapped_data = [(p.x, p) for p in data]
    wrapped_data.sort()
    data = [it[1] for it in wrapped_data]

We should fix this problem. I've created an issue so it doesn't get
lost:

https://bitbucket.org/pypy/pypy/issues/2410/listsort-key-is-slow

Thanks for the report!

Cheers,

Carl Friedrich


On 26/09/16 22:57, Tuom Larsen wrote:
> Dear list!
> 
> I stumbled upon a problem and I was wondering if someone would by so
> kind and explain to me what is going on.
> 
> The problem is sorting 2 000 000 points by `x` coordinate:
> 
>     from random import random
>     from time import time
> 
>     class point(object):
>         def __init__(self, x, y):
>             self.x, self.y = x, y
> 
>     data = [point(random(), random()) for i in range(2000000)]
> 
>     t = time()
>     data.sort(key=lambda p:p.x)
>     print time() - t
> 
> on my machine in runs 8.74s under PyPy 5.4.1 on MacOS. I then try to
> sort the points in JavaScript:
> 
>     var data = [];
>     for (var i=0; i<2000000; i++) {
>         data.push({x:Math.random(), y:Math.random()});
>     }
> 
>     console.time('sorting');
>     data.sort(function(a, b) { return a.x - b.x; });
>     console.timeEnd('sorting');
> 
> and it runs in 3.09s under V8 5.3.
> 
> I was just wondering, why is it nearly 3x slower under PyPy than it is
> under V8? Is there any way I could make the code run faster?
> 
> Thank you in advance!
> _______________________________________________
> pypy-dev mailing list
> pypy-dev at python.org
> https://mail.python.org/mailman/listinfo/pypy-dev
> 



More information about the pypy-dev mailing list