Hpw make lists that are easy to sort.
Anton Vredegoor
anton.vredegoor at gmail.com
Wed Mar 28 19:12:20 CEST 2007
Python's sorting algorithm takes advantage of preexisting order in a
sequence:
#sort_test.py
import random
import time
def test():
n = 1000
k = 2**28
L = random.sample(xrange(-k,k),n)
R = random.sample(xrange(-k,k),n)
t = time.time()
LR = [(i+j) for i in L for j in R]
print time.time()-t
LR.sort()
print time.time()-t
print
t = time.time()
#L.sort()
R.sort()
presorted_LR = [(i+j) for i in L for j in R]
print time.time()-t
presorted_LR.sort()
print time.time()-t
if __name__=='__main__':
test()
On this -very slow- computer this prints:
>d:\python25\pythonw -u "sort_test.py"
1.10000014305
8.96000003815
1.10000014305
5.49000000954
>Exit code: 0
Presorting the second sequence gains us more than three seconds. I
wonder if there is a way to generate the combined items in such a way
that sorting them is even faster? Is there some other sorting algorithm
that can specifically take advantage of this way -or another way- of
generating this list?
The final sequence is len(L)*len(R) long but it is produced from only
len(L)+len(R) different items, is it possible to exploit this fact? I'd
also be interested in a more general solution that would work for
summing the items of more than two lists in this way.
A.
More information about the Python-list
mailing list