Hpw make lists that are easy to sort.

Anton Vredegoor anton.vredegoor at gmail.com
Wed Mar 28 19:12:20 CEST 2007

```Python's sorting algorithm takes advantage of preexisting order in a
sequence:

#sort_test.py
import random
import time

def test():
n = 1000
k = 2**28

L = random.sample(xrange(-k,k),n)
R = random.sample(xrange(-k,k),n)

t = time.time()
LR = [(i+j) for i in L for j in R]
print time.time()-t
LR.sort()
print time.time()-t

print

t = time.time()
#L.sort()
R.sort()
presorted_LR = [(i+j) for i in L for j in R]
print time.time()-t
presorted_LR.sort()
print time.time()-t

if __name__=='__main__':
test()

On this -very slow- computer this prints:

>d:\python25\pythonw -u "sort_test.py"
1.10000014305
8.96000003815

1.10000014305
5.49000000954
>Exit code: 0

Presorting the second sequence gains us more than three seconds. I
wonder if there is a way to generate the combined items in such a way
that sorting them is even faster? Is there some other sorting algorithm
that can specifically take advantage of this way -or another way- of
generating this list?

The final sequence is len(L)*len(R) long but it is produced from only
len(L)+len(R) different items, is it possible to exploit this fact? I'd
also be interested in a more general solution that would work for
summing the items of more than two lists in this way.

A.

```