[Tutor] Any suggestions for optimizing this code?

Mon Sep 19 06:43:47 CEST 2005

Hello again!

I've been playing with generating 1-D cellular automata, and this is the 
fastest solution I've come up with so far. 

In the makeArray() function below, I have a sliding window of three elements 
starting at row[-1,0,1] and going to row[-2,-1,0], so that it wraps around 
at either boundary. I use the results of 4*row[i-1] + row[i] + row[i+1] to 
convert the three bits to an integer, and fetch a result from rule[index]. 
The inner loop is simple, and executes a million times, so shaving any time 
off makes a big difference. 

The biggest speedup from things I tried came from making an empty matrix 
first, to put results into the next row by index, instead of creating new 
rows on the fly with append. And binding the result[row] and result[row+1] 
references before the loop. Those two things sped it up by a consistentt 
40%. Thanks to Python in a Nutshell!

Does anybody see a faster approach, or a way to optimize the inner loop on 
makeArray() further? With the current approach, the inner loop does the real 
work, as it steps over each cell one at a time.

For anybody not familiar with cellular automatas, there is a pretty java 
animation of how cells get calculated here: 
http://math.hws.edu/xJava/CA/CA.html

And this is what CAs they look like when you plot them, with 1s as black and 
0s as white, and some additional info: 
http://mathworld.wolfram.com/Rule30.html

I've just hardcoded rule 30 into this for testing. I make a blank matrix 
with makeMatrix(). Then makeArray(matrix) sets the middle element of the 
first row in the matrix to 1 to seed the CA, and loops through it two rows 
at a time, calculating the results of the first row, and putting them into 
the next row. 

Right now I get:

>>> timeArray(5)
Total time: 3.95693028

I am uneasy with the algorithm getting each element one at a time, throwing 
them away, and getting two that overlap in the next step of the window, but 
I couldn't come up with a more elegant solution. Also, it seems like a 
kludgey way to convert the 3 bits into a binary number, but it was the 
fastest way I stumbled on to.

w = 1000
h = 1000
rule = [0,1,1,1,1,0,0,0]

def makeMatrix(w,h):
result = [0]*h
for i in range(h):
result[i] = [0]*w
return result

def makeArray(matrix):
result = matrix 
result[0][w/2] = 1

for row in range(h-1):
last = result[row]
next = result[row+1]
for i in range(w-1):
next[i] = rule[4*last[i-1]+2*last[i]+last[i+1]]
next[i+1] = rule[4*last[i]+2*last[i+1]+last[0]]
return result

def timeArray(num):
import time
matrix = makeMatrix(w,h)
t1 = time.clock()
for i in range(num):
makeArray(matrix) 
t2 = time.clock()
print "Total time:", t2 - t1

timeArray(5)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/tutor/attachments/20050918/bcf358ac/attachment.htm