Hello,

I'm taking a CFD class, and one of the codes I wrote runs very slowly. When I look at hotshot output, it says the function below is the problem. Since this is an explicit step, the for loops are only traversed once per call, so I think the cost is memory usage, but I'm not sure whether it's the local variables or the loop. I can vectorize the inner loop; would declaring the data structures in the calling routine and passing them in be a better idea than using local storage?

I'm new to Python and NumPy, and I still need to find out how to get profiling information for the lines within a function.

Thank you,
Frank

PS I tried to post this via Google Groups, but it didn't seem to go through; sorry if it ends up as multiple postings.

def findw(wnext, wprior, phiprior, uprior, vprior):
    # format here is x[i,j] where i's are rows, j's are columns; use flipud()
    # to get the printout consistent with the spatial up-down directions
    # (where() and ones_like() assume an earlier "from numpy import *")

    # assign local names that are more in line with the class notation
    w = wprior
    phi = phiprior
    u = uprior
    v = vprior

    # three of the BCs are known, so just set them
    # symmetry plane
    wnext[0, 0:gcols] = 0.0
    # upper wall
    wnext[gN, 0:gcols] = 2.0/gdy**2 * (phi[gN, 0:gcols] - phi[gN-1, 0:gcols])
    # inlet, off the walls
    wnext[1:grows-1, 0] = 0.0

    # upwind switches: Sx/Sy are 0 where the velocity is positive, 1 otherwise
    upos = where(u > 0)
    vpos = where(v > 0)
    Sx = ones_like(u)
    Sx[upos] = 0.0
    Sy = ones_like(v)
    Sy[vpos] = 0.0

    uw = u*w
    vw = v*w

    # interior nodes
    for j in range(1, gasizej-1):
        for i in range(1, gasizei-1):
            wnext[i,j] = (w[i,j]
                + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] + w[i,j+1])
                + gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + w[i+1,j])
                - (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1])
                - Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j])
                - (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j])
                - Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]))
##            print "***wnext****"
##            print "i: ", i, "j: ", j, "wnext[i,j]: ", wnext[i,j]

    # final BC at outlet, off walls
    wnext[1:grows-1, gM] = wnext[1:grows-1, gM-1]
fsenkel@verizon.net wrote:
<snip>
    # interior nodes
    for j in range(1, gasizej-1):
        for i in range(1, gasizei-1):
            wnext[i,j] = (w[i,j]
                + gnu*gdt/gdx**2 * (w[i,j-1] - 2.0*w[i,j] + w[i,j+1])
                + gnu*gdt/gdy**2 * (w[i-1,j] - 2.0*w[i,j] + w[i+1,j])
                - (1.0 - Sx[i,j]) * gdt/gdx * (uw[i,j] - uw[i,j-1])
                - Sx[i,j] * gdt/gdx * (uw[i,j+1] - uw[i,j])
                - (1.0 - Sy[i,j]) * gdt/gdy * (vw[i,j] - vw[i-1,j])
                - Sy[i,j] * gdt/gdy * (vw[i+1,j] - vw[i,j]))
I imagine that this loop is what is killing you. Remove at least the inner loop, if not both (try removing just the inner loop as well as both, since sometimes it's faster to remove only the inner one due to memory usage issues). Removing both will look something like:

    wnext[1:-1, 1:-1] = (w[1:-1, 1:-1]
        + gnu*gdt/gdx**2 * (w[1:-1, 0:-2] - 2.0*w[1:-1, 1:-1] + w[1:-1, 2:])
        + ...)  # etc, etc

When you're done with that, note also that you have the same array term present multiple times. You could save more time by collapsing those terms and using different scalar multipliers; occasionally that is numerically unwise, but I doubt it in this case. There are all sorts of other things you can do, such as using in-place operations, but try vectorizing the loop first.

-tim
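For reference, here is a sketch of the fully vectorized interior update that the "etc, etc" above points at. It is not a tested drop-in: it assumes range(1, gasizei-1) and range(1, gasizej-1) cover the entire interior of the arrays, so that [1:-1, 1:-1] is exactly the region the double loop visits. It also hoists the repeated scalar coefficients out of the expression, as suggested above:

    # a sketch, assuming [1:-1, 1:-1] is exactly the interior region
    cx = gnu*gdt/gdx**2     # diffusion coefficients, computed once
    cy = gnu*gdt/gdy**2
    ax = gdt/gdx            # advection coefficients
    ay = gdt/gdy
    c  = w[1:-1, 1:-1]      # center point of the stencil
    sx = Sx[1:-1, 1:-1]
    sy = Sy[1:-1, 1:-1]
    wnext[1:-1, 1:-1] = (c
        + cx * (w[1:-1, 0:-2] - 2.0*c + w[1:-1, 2:])            # d2w/dx2
        + cy * (w[0:-2, 1:-1] - 2.0*c + w[2:, 1:-1])            # d2w/dy2
        - (1.0 - sx) * ax * (uw[1:-1, 1:-1] - uw[1:-1, 0:-2])   # upwinded d(uw)/dx
        - sx * ax * (uw[1:-1, 2:] - uw[1:-1, 1:-1])
        - (1.0 - sy) * ay * (vw[1:-1, 1:-1] - vw[0:-2, 1:-1])   # upwinded d(vw)/dy
        - sy * ay * (vw[2:, 1:-1] - vw[1:-1, 1:-1]))

Each shifted slice (w[1:-1, 0:-2] for w[i,j-1], w[2:, 1:-1] for w[i+1,j], and so on) replaces one element-indexed term of the original loop body.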
On 12/3/06, fsenkel@verizon.net <fsenkel@verizon.net> wrote:
<snip>
Explicit indexing tends to be very slow. I note what looks to be a lot of differencing in the code, so I suspect what you have here is a PDE. Your best bet in the short term is to vectorize as many of these operations as possible, but because the expression is so complicated it is a bit of a chore to see just how. If your CFD class allows it, there are probably tools in SciPy that are adapted to this sort of problem, and in particular to CFD. Sandia also puts out PyTrilinos, http://software.sandia.gov/trilinos/packages/pytrilinos/, which provides interfaces to distributed and parallel PDE solvers. It's big-iron software for serious problems, so it might be overkill for your application.

Chuck
On 12/5/06, Charles R Harris <charlesr.harris@gmail.com> wrote:
<snip>
If it is a PDE, you might also want to look into sparse matrices. Other folks here can tell you more about that.

Chuck
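As a rough illustration of the sparse-matrix route (the names and sizes here are assumptions, not from the thread), the five-point stencil in findw() can be assembled once with scipy.sparse, after which the diffusion part of each explicit step is a single matrix-vector product:

    # sketch: build a 2-D five-point Laplacian as a sparse operator;
    # n is an assumed interior grid size, and gdx == gdy is assumed for brevity
    from scipy import sparse
    from numpy import ones
    n = 50
    e = ones(n)
    D2 = sparse.spdiags([e, -2.0*e, e], [-1, 0, 1], n, n)  # 1-D second difference
    I = sparse.identity(n)
    lap = sparse.kron(I, D2) + sparse.kron(D2, I)          # 2-D Laplacian
    # with wflat the interior of w flattened to a vector, the diffusion
    # update is just:  wflat = wflat + gnu*gdt/gdx**2 * (lap * wflat)

The payoff comes mainly for implicit schemes, where lap feeds directly into a sparse linear solve, but even here it keeps the stencil bookkeeping in one place.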
It looks like you could use weave.blitz() without much change to your code, or weave.inline() if needed. See this page: http://scipy.org/PerformancePython
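For the diffusion terms, a blitz version could look something like the examples on that page (this is a sketch in Frank's variable names showing two of the terms; the advection terms would be appended to the expression the same way):

    # weave.blitz compiles the slice expression to C++ via Blitz++, avoiding
    # the temporary arrays numpy would allocate; it picks up wnext, w, and
    # the g* scalars from the calling scope
    from scipy import weave
    expr = "wnext[1:-1,1:-1] = w[1:-1,1:-1]" \
           " + gnu*gdt/gdx**2 * (w[1:-1,0:-2] - 2.0*w[1:-1,1:-1] + w[1:-1,2:])" \
           " + gnu*gdt/gdy**2 * (w[0:-2,1:-1] - 2.0*w[1:-1,1:-1] + w[2:,1:-1])"
    weave.blitz(expr)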
On 12/3/06, fsenkel@verizon.net <fsenkel@verizon.net> wrote:

<snip>
participants (4)
- Brent Pedersen
- Charles R Harris
- fsenkel@verizon.net
- Tim Hochberg