Optimize evaluation of function on matrix
![](https://secure.gravatar.com/avatar/c1046cc960f1d4fb1747e51c953eb1db.jpg?s=120&d=mm&r=g)
Hello, I have this function: def eval_BF(self, meshA, meshB): """ Evaluates single BF or list of BFs on the meshes. """ if type(self.basisfunction) is list: A = np.empty((len(meshA), len(meshB))) for i, row in enumerate(meshA): for j, col in enumerate(meshB): A[i, j] = self.basisfunction[j](row - col) else: mgrid = np.meshgrid(meshB, meshA) A = self.basisfunction( np.abs(mgrid[0] - mgrid[1]) ) return A meshA and meshB are 1-dimensional numpy arrays. self.basisfunction is e.g. def Gaussian(radius, shape): """ Gaussian Basis Function """ return np.exp( -np.power(shape*abs(radius), 2)) or a list of partial instantations of such functions (from functools.partial). How can I optimize eval_BF? Esp. in the case of basisfunction being a list. Thanks! Florian
![](https://secure.gravatar.com/avatar/b4f6d4f8b501cb05fd054944a166a121.jpg?s=120&d=mm&r=g)
On Sat, 2017-03-25 at 18:46 +0100, Florian Lindner wrote:
Are you sure you need to optimize it? If they have a couple of hundred elements or so for each row, the math is probably the problem and most of that might be the `exp`. You can get rid of the `row` loop though in case row if an individual row is a pretty small array. To be honest, I am a bit surprised that its a problem, since "basis function" sounds a bit like you have to do this once and then use the result many times. - Sebastian
![](https://secure.gravatar.com/avatar/c1046cc960f1d4fb1747e51c953eb1db.jpg?s=120&d=mm&r=g)
Hey, I've timed the two versions, one basisfunction being a function: 1 loop, best of 3: 17.3 s per loop the other one, basisfunction being a list of functions: 1 loop, best of 3: 33.5 s per loop
It's part of a radial basis function interpolation algorithm. Yes, in practice the matrix is filled only once and reused a couple of times, but in my case, which is exploration of parameters for the algorithm, I call eval_BF many times.
You can get rid of the `row` loop though in case row if an individual row is a pretty small array.
Would you elaborate on that? Do you mean that the inner col loop produces an array which is then assigned to the row. But I think it stell need to row loop there. Best, Florian Am 25.03.2017 um 22:31 schrieb Sebastian Berg:
![](https://secure.gravatar.com/avatar/b4f6d4f8b501cb05fd054944a166a121.jpg?s=120&d=mm&r=g)
On Mon, 2017-03-27 at 13:06 +0200, Florian Lindner wrote:
Well, I like to not serve the result, but if you exchange the loops: A = np.empty((len(meshA), len(meshB))) for j, col in enumerate(meshB): for i, row in enumerate(meshA): A[i, j] = self.basisfunction[j](row - col) Then you can see that there is broadcasting magic similar (do not want to use too many brain cells now) to: A = np.empty((len(meshA), len(meshB))) for j, col in enumerate(meshB): # possibly insert np.newaxis/None or a reshape in [??] A[:, j] = self.basisfunction[j](meshA[??] - col) - Sebastian
![](https://secure.gravatar.com/avatar/5e44e61af952612c3c45385b62806c67.jpg?s=120&d=mm&r=g)
The best way to get good optimized code is to find it. Does this do what you want? https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.Rbf.h... For some advice here, first avoid loops in Python and instead stick to ufuncs and broadcasting. It looks like the matrix is symmetric with constant diagonals, so much of the work can possibly be avoided. Finally consider threading over blocks or different basis functions. To dial it up to 11 write low level c code to do exactly what you want, writing a custom ufunc or call it cython. On Mar 25, 2017 2:03 PM, "Florian Lindner" <mailinglists@xgm.de> wrote:
![](https://secure.gravatar.com/avatar/b4f6d4f8b501cb05fd054944a166a121.jpg?s=120&d=mm&r=g)
On Sat, 2017-03-25 at 18:46 +0100, Florian Lindner wrote:
Are you sure you need to optimize it? If they have a couple of hundred elements or so for each row, the math is probably the problem and most of that might be the `exp`. You can get rid of the `row` loop though in case row if an individual row is a pretty small array. To be honest, I am a bit surprised that its a problem, since "basis function" sounds a bit like you have to do this once and then use the result many times. - Sebastian
![](https://secure.gravatar.com/avatar/c1046cc960f1d4fb1747e51c953eb1db.jpg?s=120&d=mm&r=g)
Hey, I've timed the two versions, one basisfunction being a function: 1 loop, best of 3: 17.3 s per loop the other one, basisfunction being a list of functions: 1 loop, best of 3: 33.5 s per loop
It's part of a radial basis function interpolation algorithm. Yes, in practice the matrix is filled only once and reused a couple of times, but in my case, which is exploration of parameters for the algorithm, I call eval_BF many times.
You can get rid of the `row` loop though in case row if an individual row is a pretty small array.
Would you elaborate on that? Do you mean that the inner col loop produces an array which is then assigned to the row. But I think it stell need to row loop there. Best, Florian Am 25.03.2017 um 22:31 schrieb Sebastian Berg:
![](https://secure.gravatar.com/avatar/b4f6d4f8b501cb05fd054944a166a121.jpg?s=120&d=mm&r=g)
On Mon, 2017-03-27 at 13:06 +0200, Florian Lindner wrote:
Well, I like to not serve the result, but if you exchange the loops: A = np.empty((len(meshA), len(meshB))) for j, col in enumerate(meshB): for i, row in enumerate(meshA): A[i, j] = self.basisfunction[j](row - col) Then you can see that there is broadcasting magic similar (do not want to use too many brain cells now) to: A = np.empty((len(meshA), len(meshB))) for j, col in enumerate(meshB): # possibly insert np.newaxis/None or a reshape in [??] A[:, j] = self.basisfunction[j](meshA[??] - col) - Sebastian
![](https://secure.gravatar.com/avatar/5e44e61af952612c3c45385b62806c67.jpg?s=120&d=mm&r=g)
The best way to get good optimized code is to find it. Does this do what you want? https://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.Rbf.h... For some advice here, first avoid loops in Python and instead stick to ufuncs and broadcasting. It looks like the matrix is symmetric with constant diagonals, so much of the work can possibly be avoided. Finally consider threading over blocks or different basis functions. To dial it up to 11 write low level c code to do exactly what you want, writing a custom ufunc or call it cython. On Mar 25, 2017 2:03 PM, "Florian Lindner" <mailinglists@xgm.de> wrote:
participants (3)
-
Florian Lindner
-
Matthew Harrigan
-
Sebastian Berg