help!: optimization of search problem (example code)

Marcus Stojek stojek at part-gmbh.de
Thu Feb 14 10:12:12 EST 2002


Hi,

after some days of playing around I am not getting
any further and have to ask for some more help.

Here is my problem. I have two large lists ("la" and "lb") of tuples.
Each tuple represents a point in space and has four entries:
x, y, z, pointID

Now I have to find out for each point of list "la" which point of
"lb" is closest. The result should be for all points of "la":

Point 123 of "lb" is the closest one to Point 5932 of "la"
where 123 and 5932 is the pointID.

So here is what I have so far.:

----------------------------------------------------------------
from random import *
from math import *
import bisect
import time


# timer for benchmarking
def startclock():
    d=time.clock()
    return d

def stopwatch(start):
    return time.clock()-start

#generating two arbtitrary lists
seed(0)
a={}
b={}

for i in range(20000):
    a[i]=(100*random(),100*random(),100*random())
    b[i]=(100*random(),100*random(),100*random())

la=[]
lb=[]

for i in range (20000):
    dummy=(a[i][0],a[i][1],a[i][2],i)
    la.append(dummy)
    dummy=(b[i][0],b[i][1],b[i][2],i)
    lb.append(dummy)

lb.sort() #source list

start=startclock()
maxdist=5.0 # I know that there are no two points with a distance > 5

for i in range(500): #should be range(len(la))
#slicing a part out of lb that is closer than maxdist for the first
#coordinate
    lp=(la[i][0]-maxdist,la[i][1],la[i][2],la[i][3])
    rp=(la[i][0]+maxdist,la[i][1],la[i][2],la[i][3])
    left=bisect.bisect_right(lb,lp)
    right=bisect.bisect_left(lb,rp)
    if right==len(lb): right-=1
    lsearch=lb[left:right] # only compare la[i] to this list

    mindist=10000
    minhit=-1 #pointID of closest point

    for j in lsearch:
# i can exclude points with coordinate 2 or 3 further away than
#maxdist
        if abs(j[1]-la[i][1])< maxdist and  
           abs(j[2]-la[i][2]) < maxdist:
#calculate distance:

d=sqrt(pow((j[0]-la[i][0]),2)+pow((j[1]-la[i][1]),2)+pow((j[2]-la[i][2]),2))
            if d < mindist:
                mindist=d
                minhit=j

    if i <= 10 :
   	print la[i],minhit,mindist

bench=stopwatch(start)
print bench, " sec"
--------------------------------------------------------------------------------------------------------

This takes 5.3 sec on my machine for 500 points. That is much too
long. Any idea how I could significanty improve this code?

Tanks,
Marcus



More information about the Python-list mailing list