Calculating density based on distance
Hi all,
I have the following problem:
Given a array with dimension Nx3, where N is generally greater than 1.000.000, for each item in this array I have to calculate its density, Where its density is the number of items from the same array with distance less than a given r. The items are the rows from the array.
I was not able to think a solution to this using one or two functions of Numpy. Then I wrote this code http://pastebin.com/iQV0bMNy . The problem it so slow. So I tried to implement it in Cython, here the result http://pastebin.com/zTywzjyM , but it is very slow yet.
Is there a better and faster way of doing that? Is there something in my Cython implementation I can do to perform better?
Thanks!
On Saturday, January 14, 2012, Thiago Franco de Moraes < totonixsame@gmail.com> wrote:
Hi all,
I have the following problem:
Given a array with dimension Nx3, where N is generally greater than 1.000.000, for each item in this array I have to calculate its density, Where its density is the number of items from the same array with distance less than a given r. The items are the rows from the array.
I was not able to think a solution to this using one or two functions of Numpy. Then I wrote this code http://pastebin.com/iQV0bMNy . The problem it so slow. So I tried to implement it in Cython, here the result http://pastebin.com/zTywzjyM , but it is very slow yet.
Is there a better and faster way of doing that? Is there something in my Cython implementation I can do to perform better?
Thanks!
Have you looked at scipy.spatial.KDTree? It can efficiently load up a data structure that lets you easily determine the spatial relationship between datapoints.
Ben Root
No dia Sábado, 14 de Janeiro de 2012, Benjamin Rootben.root@ou.edu escreveu:
On Saturday, January 14, 2012, Thiago Franco de Moraes <
totonixsame@gmail.com> wrote:
Hi all,
I have the following problem:
Given a array with dimension Nx3, where N is generally greater than 1.000.000, for each item in this array I have to calculate its density, Where its density is the number of items from the same array with distance less than a given r. The items are the rows from the array.
I was not able to think a solution to this using one or two functions of Numpy. Then I wrote this code http://pastebin.com/iQV0bMNy . The problem it so slow. So I tried to implement it in Cython, here the result http://pastebin.com/zTywzjyM , but it is very slow yet.
Is there a better and faster way of doing that? Is there something in my Cython implementation I can do to perform better?
Thanks!
Have you looked at scipy.spatial.KDTree? It can efficiently load up a
data structure that lets you easily determine the spatial relationship between datapoints.
Ben Root
Thanks, Ben, I'm going to do that.
Den 14.01.2012 21:52, skrev Thiago Franco de Moraes:
Is there a better and faster way of doing that? Is there something in my Cython implementation I can do to perform better?
You need to use a kdtree to make the computation run in O(n log n) time instead of O(n**2).
scipy.spatial.cKDTree is very fast.
Sturla
participants (4)

Benjamin Root

Sturla Molden

Thiago Franco de Moraes

Thiago Franco Moraes