Fast function application on list of 2D points?

Hello,

What is the fastest way of applying a function to a list of 2D points? More specifically, I have a list of 2D points, some of which do not meet certain criteria and must be rejected. Even more specifically, the filter only lets through points whose x coordinate satisfies one condition _and_ whose y coordinate satisfies another (maybe there is room for optimization here?).

Currently, I use

    points = numpy.apply_along_axis(filter, axis = 1, arr = points)

but this creates a bottleneck in my program (the array arr may contain 1 million points, for instance).

Is there anything that could be faster? Any suggestion would be much appreciated!

EOL

Why don't you create a mask that selects only the points in the array that satisfy the conditions on the x and y coordinates? For example, the code below applies filter only to the points whose x coordinate is greater than 0.7 and whose y coordinate is smaller than 0.3:

    mask = numpy.logical_and(points[:,0] > 0.7, points[:,1] < 0.3)
    points = numpy.apply_along_axis(filter, axis = 1, arr = points[mask,:])

best,
Paulo

On Mon, 2009-01-12 at 15:21 +0100, Eric LEBIGOT wrote:
> What is the fastest way of applying a function on a list of 2D points? [...]
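If both conditions can be written as elementwise comparisons, the mask from the reply above may already do all of the filtering, with no call to apply_along_axis at all. The following is only a sketch under that assumption: the function name filter_points and the parameters x_min and y_max are hypothetical, and the thresholds 0.7 and 0.3 are simply the ones used in the example above.

```python
import numpy as np

def filter_points(points, x_min=0.7, y_max=0.3):
    """Keep the rows of an (N, 2) array with x > x_min and y < y_max.

    Hypothetical helper: the two comparisons stand in for whatever
    conditions the real filter applies to x and y.
    """
    points = np.asarray(points)
    # Each comparison yields a boolean array of length N; "&" combines them.
    mask = (points[:, 0] > x_min) & (points[:, 1] < y_max)
    # Boolean indexing returns a new array holding only the selected rows.
    return points[mask]

# One million random points in the unit square, as in the question.
rng = np.random.default_rng(0)
points = rng.random((1_000_000, 2))
kept = filter_points(points)
print(kept.shape)
```

Whether this is fast enough depends on the actual conditions; a filter that cannot be expressed with array operations would still need a per-point call.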

Eric,

Your question caught my attention because of a recent post of mine about the same kind of problem. I was solving it without apply_along_axis (out of ignorance). However, when I tried apply_along_axis on my problem, it proved to be very slow. Try the following:

-----------
import numpy as np
import time

def filter(x):
    return x.sum()

a = np.random.random((2, 1000000))

# Apply filter to all points, version 1
t = time.perf_counter()
sums1 = np.apply_along_axis(filter, axis=0, arr=a)
print('Elapsed time', time.perf_counter() - t)

# Apply filter to all points, version 2
t = time.perf_counter()
sums2 = np.array([filter(p) for p in a.T])
print('Elapsed time', time.perf_counter() - t)

print(sums1 == sums2)
------------

On my computer, the first version takes more than 6.5 times longer than the second. However, version 2 uses a list comprehension instead of a numpy function, so I would have expected it to be slower. It looks like apply_along_axis is creating many temporary arrays.

Eric, it looks like you should try something along the lines of the second version above and see whether it is faster in your case too.

Paulo
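For the specific test function used above (a sum over each point), a third variant is possible: letting numpy reduce along the axis itself. The sketch below times all three with the standard timeit module. The names row_sum and time_variants are hypothetical, the array is smaller than in the message so the comparison runs quickly, and the timings will vary by machine and numpy version, so no particular ratio is claimed here.

```python
import timeit
import numpy as np

def row_sum(p):
    # Same per-point function as in the message above, renamed to avoid
    # shadowing the built-in filter().
    return p.sum()

def time_variants(a, repeat=3):
    """Return the best wall-clock time in seconds for each variant."""
    variants = {
        "apply_along_axis": lambda: np.apply_along_axis(row_sum, axis=0, arr=a),
        "list comprehension": lambda: np.array([row_sum(p) for p in a.T]),
        "vectorized sum": lambda: a.sum(axis=0),
    }
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in variants.items()
    }

# 20,000 points stored as columns of a (2, N) array, as in the message.
a = np.random.default_rng(0).random((2, 20_000))
timings = time_variants(a)
for name, seconds in timings.items():
    print(f"{name:20s} {seconds:.4f} s")
```

The vectorized variant only applies when the per-point function has an array-level equivalent, as a sum does; for an arbitrary Python function, the first two variants remain the options being compared.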
participants (2)
- Eric LEBIGOT
- Paulo J. S. Silva