Sorting: too different times. Why?
Diez B. Roggisch
deets at nospam.web.de
Sun Nov 22 06:06:01 EST 2009
n00m schrieb:
> Any comment:
>
> class Vector:
> def __init__(self, x, y):
> self.x = x
> self.y = y
> def __cmp__(self, v):
> if self.x < v.x and self.y > v.y:
> return -1
> return 0
>
> def v_cmp(v1, v2):
> if v1.x < v2.x and v1.y > v2.y:
> return -1
> return 0
>
> from random import randint
> from time import time
>
> a = []
> for i in range(200000):
Use xrange instead (unless you are under python3), because for loops you
don't need the full list range creates - xrange is just a generator.
> a += [Vector(randint(0, 500000), randint(0, 500000))]
Better use .append here, looks nicer and should also be a bit faster.
> b = a[:]
> c = a[:]
>
> print 'Sorting...'
>
> t = time()
> b.sort(cmp=v_cmp)
> print time() - t
>
> t = time()
> c.sort()
> print time() - t
>
> print b == c
>
>
>
>>>> ===================================== RESTART ======
>>>>
> Sorting...
> 0.906000137329
> 6.57799983025
I think the main reason is that method-dispatch is more expensive than
function-dispatch. The former must create a bound method before calling,
the latter just works out of the box.
Things get better if you do this:
t = time()
c.sort(cmp=Vector.__cmp__)
print time() - t
Although not the exact same performance - I get
Sorting...
0.677843093872
1.4283311367
True
Diez
More information about the Python-list
mailing list