Hi, I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up.. thanks. example. import numpy as np def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3 data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0) for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3) ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
Hi Martin, It looks like you are trying to do a brute force minimization of your model with three parameters, is that right? For that you could look at scipy.optimize.brute, which will probably be faster. scipy.optimize also has several more sophisticated minimizers that you might consider: http://docs.scipy.org/doc/scipy/reference/optimize.html If your function works with numpy arrays, you could try generating all of your parameters at once with mgrid and calling your function on the returned arrays, p1, p2, p3 = np.mgrid[:10,:10,:10] David On 08/19/2012 04:07 AM, Martin De Kauwe wrote:
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
Hi Apologies I didn't see this response as I was viewing on google groups and on there it looks like no one has responded!! Odd. Sort of. I already have a minimisation script set up using the lmfit package. I was trying to sample the parameter space to find a decent starting point for my initial parameter guess. J. David Lee wrote:
Hi Martin,
It looks like you are trying to do a brute force minimization of your model with three parameters, is that right? For that you could look at scipy.optimize.brute, which will probably be faster. scipy.optimize also has several more sophisticated minimizers that you might consider:
http://docs.scipy.org/doc/scipy/reference/optimize.html
If your function works with numpy arrays, you could try generating all of your parameters at once with mgrid and calling your function on the returned arrays,
p1, p2, p3 = np.mgrid[:10,:10,:10]
David
On 08/19/2012 04:07 AM, Martin De Kauwe wrote:
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
-- View this message in context: http://old.nabble.com/Avoiding-inner-for-loops---tp34319763p34330080.html Sent from the Scipy-User mailing list archive at Nabble.com.
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
You may like to utilize functionality of ix_ and newaxis, like: In []: p1, p2, p3= ix_(param1, param2, param3) In []: n_= newaxis In []: ans= fake_model(data1[:, n_, n_, n_], data2[:, n_, n_, n_], p1, p2,
Hi, On Sun, Aug 19, 2012 at 12:07 PM, Martin De Kauwe <mdekauwe@gmail.com>wrote: p3) In []: ss2= ((obs[:, n_, n_, n_]- ans).sum(0)** 2).reshape(grid_size** 3) In []: allclose(ss, ss2) Out[]: True My 2 cents, -eat
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
You may like to utilize functionality of ix_ and newaxis, like: In []: p1, p2, p3= ix_(param1, param2, param3) In []: n_= newaxis In []: ans= fake_model(data1[:, n_, n_, n_], data2[:, n_, n_, n_], p1, p2,
Hi, p3) In []: ss2= ((obs[:, n_, n_, n_]- ans).sum(0)** 2).reshape(grid_size** 3) In []: allclose(ss, ss2) Out[]: True My 2 cents, -eat This solution is exactly what I was looking for thank you. I hadn't come across np.ix_ before. -- View this message in context: http://old.nabble.com/Avoiding-inner-for-loops---tp34319763p34332203.html Sent from the Scipy-User mailing list archive at Nabble.com.
Actually one more thing I can't figure out, is I want to use the smallest difference, ie. from ss2 to get the relevant paramater values. e.g. ss2= ((obs[:, n_, n_, n_]- ans).sum(0)** 2).reshape(grid_size** 3) index = np.argmin(ss2, 0) but this index will be from the full grid (shape=8000) and not give me something I can pull the values from the param1, param2, param3 arrays (shape=20)? thanks again. -- View this message in context: http://old.nabble.com/Avoiding-inner-for-loops---tp34319763p34332296.html Sent from the Scipy-User mailing list archive at Nabble.com.
Hi, On Wed, Aug 22, 2012 at 3:52 AM, mdekauwe <mdekauwe@gmail.com> wrote:
Actually one more thing I can't figure out, is I want to use the smallest difference, ie. from ss2 to get the relevant paramater values.
e.g.
ss2= ((obs[:, n_, n_, n_]- ans).sum(0)** 2).reshape(grid_size** 3) index = np.argmin(ss2, 0)
but this index will be from the full grid (shape=8000) and not give me something I can pull the values from the param1, param2, param3 arrays (shape=20)?
Perhaps In []: ss= ((obs[:, n_, n_, n_]- ans).sum(0)** 2) In []: ndx= where(ss.min()== ss) In []: ndx Out[]: (array([0]), array([0]), array([0])) In []: ss[ndx] Out[]: array([ 2500.]) Regards, -eat
thanks again. -- View this message in context: http://old.nabble.com/Avoiding-inner-for-loops---tp34319763p34332296.html Sent from the Scipy-User mailing list archive at Nabble.com.
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Perhaps simplifying, in 2D this is what I want if using loops def fake_model(data1, data2, p1, p2): """ complete nonsense """ return data1 + data2 * p1 * p2 grid_size = 5 nobs = 5 obs = np.zeros(nobs) data1 = np.arange(nobs) data2 = np.arange(nobs) a = np.arange(grid_size) b = np.arange(grid_size) c = np.arange(grid_size) ss = np.zeros(0) for p1 in a: for p2 in b: ans = fake_model(data1, data2, p1, p2) #ss = np.append(ss, np.sum(obs - ans)**2) print ans which would produce [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 2 4 6 8] . snip . [0 1 2 3 4] [ 0 5 10 15 20] [ 0 9 18 27 36] [ 0 13 26 39 52] [ 0 17 34 51 68] And so I figured something like... a = np.ones((grid_size,grid_size)) * np.arange(grid_size)[None,:] b = np.ones((grid_size,grid_size)) * np.arange(grid_size)[:,None] ans = fake_model(data1, data2, a, b) Although this doesn't seem to work, but I think this might be along the right lines? This produces [[ 0. 1. 2. 3. 4.] [ 0. 2. 6. 12. 20.] [ 0. 3. 10. 21. 36.] [ 0. 4. 14. 30. 52.] [ 0. 5. 18. 39. 68.]] On Sunday, August 19, 2012 7:07:59 PM UTC+10, Martin De Kauwe wrote:
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
Hi, On Mon, Aug 20, 2012 at 2:47 AM, Martin De Kauwe <mdekauwe@gmail.com> wrote:
Perhaps simplifying, in 2D this is what I want if using loops
def fake_model(data1, data2, p1, p2): """ complete nonsense """ return data1 + data2 * p1 * p2
grid_size = 5 nobs = 5 obs = np.zeros(nobs) data1 = np.arange(nobs) data2 = np.arange(nobs) a = np.arange(grid_size) b = np.arange(grid_size) c = np.arange(grid_size) ss = np.zeros(0) for p1 in a: for p2 in b: ans = fake_model(data1, data2, p1, p2) #ss = np.append(ss, np.sum(obs - ans)**2) print ans
Your script with a slightly modified data and parameters, will produce [10 13 16 19 22] [15 19 23 27 31] [20 25 30 35 40] snip [100 121 142 163 184] [125 151 177 203 229] [150 181 212 243 274]
which is equivalent to: In []: p1, p2= ix_(a, b) In []: n_= newaxis In []: ans= fake_model(data1[:, n_, n_], data2[:, n_, n_], p1, p2) In []: ans.reshape(-1, grid_size** 2).T Out[]: array([[ 10, 13, 16, 19, 22], [ 15, 19, 23, 27, 31], [ 20, 25, 30, 35, 40], snip [100, 121, 142, 163, 184], [125, 151, 177, 203, 229], [150, 181, 212, 243, 274]]) Regards, -eat
which would produce
[0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 2 4 6 8] . snip . [0 1 2 3 4] [ 0 5 10 15 20] [ 0 9 18 27 36] [ 0 13 26 39 52] [ 0 17 34 51 68]
And so I figured something like...
a = np.ones((grid_size,grid_size)) * np.arange(grid_size)[None,:] b = np.ones((grid_size,grid_size)) * np.arange(grid_size)[:,None] ans = fake_model(data1, data2, a, b)
Although this doesn't seem to work, but I think this might be along the right lines? This produces
[[ 0. 1. 2. 3. 4.] [ 0. 2. 6. 12. 20.] [ 0. 3. 10. 21. 36.] [ 0. 4. 14. 30. 52.] [ 0. 5. 18. 39. 68.]]
On Sunday, August 19, 2012 7:07:59 PM UTC+10, Martin De Kauwe wrote:
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
This produces the same result like using loops """ params = param1*param2[:, None] fake_model = data1+data2*params1.reshape(-1)[:, None] ans = np.reshape( fake_model, (5,5,5) ) print ans[0] array([[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]) print ans[-1] array([[ 0, 1, 2, 3, 4], [ 0, 5, 10, 15, 20], [ 0, 9, 18, 27, 36], [ 0, 13, 26, 39, 52], [ 0, 17, 34, 51, 68]]) Issa On 20 August 2012 00:47, Martin De Kauwe <mdekauwe@gmail.com> wrote:
Perhaps simplifying, in 2D this is what I want if using loops
def fake_model(data1, data2, p1, p2): """ complete nonsense """ return data1 + data2 * p1 * p2
grid_size = 5 nobs = 5 obs = np.zeros(nobs) data1 = np.arange(nobs) data2 = np.arange(nobs) a = np.arange(grid_size) b = np.arange(grid_size) c = np.arange(grid_size) ss = np.zeros(0) for p1 in a: for p2 in b: ans = fake_model(data1, data2, p1, p2) #ss = np.append(ss, np.sum(obs - ans)**2) print ans
which would produce
[0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 1 2 3 4] [0 2 4 6 8] . snip . [0 1 2 3 4] [ 0 5 10 15 20] [ 0 9 18 27 36] [ 0 13 26 39 52] [ 0 17 34 51 68]
And so I figured something like...
a = np.ones((grid_size,grid_size)) * np.arange(grid_size)[None,:] b = np.ones((grid_size,grid_size)) * np.arange(grid_size)[:,None] ans = fake_model(data1, data2, a, b)
Although this doesn't seem to work, but I think this might be along the right lines? This produces
[[ 0. 1. 2. 3. 4.] [ 0. 2. 6. 12. 20.] [ 0. 3. 10. 21. 36.] [ 0. 4. 14. 30. 52.] [ 0. 5. 18. 39. 68.]]
On Sunday, August 19, 2012 7:07:59 PM UTC+10, Martin De Kauwe wrote:
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Or perhaps an easier solution would be to build the sampling grid first and then use a single for loop to run the model? import itertools import numpy as np grid_size = 500 a = np.linspace(5.0, 350, grid_size) b = np.linspace(5.0, 550, grid_size) c = np.linspace(1E-8, 10.5, grid_size) x = [] for (i,j,k) in itertools.product(a, b, c): x.extend((i,j,k)) This would achieve what I want but is again very slow, so is there a way to jump over the need for the two inner loops? thanks
perhaps you can consider 'np.ogrid' for the product. Ps: Is this code not helpful? param1 = np.arange(5) param2 = param1 params1 = param1*param2[:, None] fake_model = data1+data2*params1.reshape(- 1)[:, None] ans = np.reshape( fake_model, (5,5,5) ) print ans[0] array([[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]) print ans[1] array([[ 0, 1, 2, 3, 4], [ 0, 2, 4, 6, 8], [ 0, 3, 6, 9, 12], [ 0, 4, 8, 12, 16], [ 0, 5, 10, 15, 20]]) print ans[-1] array([[ 0, 1, 2, 3, 4], [ 0, 5, 10, 15, 20], [ 0, 9, 18, 27, 36], [ 0, 13, 26, 39, 52], [ 0, 17, 34, 51, 68]]) On 21 August 2012 02:11, Martin De Kauwe <mdekauwe@gmail.com> wrote:
Or perhaps an easier solution would be to build the sampling grid first and then use a single for loop to run the model?
import itertools import numpy as np
grid_size = 500 a = np.linspace(5.0, 350, grid_size) b = np.linspace(5.0, 550, grid_size) c = np.linspace(1E-8, 10.5, grid_size)
x = [] for (i,j,k) in itertools.product(a, b, c): x.extend((i,j,k))
This would achieve what I want but is again very slow, so is there a way to jump over the need for the two inner loops?
thanks
_______________________________________________ SciPy-User mailing list SciPy-User@scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user
Consider using broadcasting to three 3D parameters matrices (A,B,C), then creating a 'ufunc' that takes three parameter at a time (a,b,c) for every identical position in the three arrays (A,B,C). So giving the broadcasting arrays as input to a ufunction which maps the model function on the three paramater arrays. greets, Luuk On Sunday, August 19, 2012 11:07:59 AM UTC+2, Martin De Kauwe wrote:
Hi,
I need to avoid (at least) two inner for loops in what I am trying to do otherwise my processing takes forever. What is the best way to transfer what I am doing into a more "numpy way"? Essentially I am trying to call a model again for various different parameter combinations. The example is fictional, by the grid_size would ideally grow > 500 and by doing so the processing speed becomes very slow the way I have set things up..
thanks.
example.
import numpy as np
def fake_model(data1, data2, p1, p2, p3): """ complete nonsense """ return data1 + data2 * p1 * p2 * p3
data1 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc data2 = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc obs = np.random.rand(10) # the size of this arrays varies might be 10 might be 15 etc
grid_size = 10 # Ideally this would be a large number param1 = np.linspace(5.0, 350, grid_size) param2 = np.linspace(5.0, 550, grid_size) param3 = np.linspace(1E-8, 10.5, grid_size) ss = np.zeros(0)
for p1 in param1: for p2 in param2: for p3 in param3: ans = fake_model(data1, data2, p1, p2, p3)
ss = np.append(ss, np.sum(obs - ans)**2) print np.sum(obs - ans)**2
participants (6)
-
eat -
Issa Karambal -
J. David Lee -
Luuk van der Velden -
Martin De Kauwe -
mdekauwe