# [Tutor] Any suggestions for optimizing this code?

grouchy grouch at gmail.com
Mon Sep 19 06:43:47 CEST 2005

```Hello again!

I've been playing with generating 1-D cellular automata, and this is the
fastest solution I've come up with so far.

In the makeArray() function below, I have a sliding window of three elements
starting at row[-1,0,1] and going to row[-2,-1,0], so that it wraps around
at either boundary. I use the results of 4*row[i-1] + row[i] + row[i+1] to
convert the three bits to an integer, and fetch a result from rule[index].
The inner loop is simple, and executes a million times, so shaving any time
off makes a big difference.

The biggest speedup from things I tried came from making an empty matrix
first, to put results into the next row by index, instead of creating new
rows on the fly with append. And binding the result[row] and result[row+1]
references before the loop. Those two things sped it up by a consistentt
40%. Thanks to Python in a Nutshell!

Does anybody see a faster approach, or a way to optimize the inner loop on
makeArray() further? With the current approach, the inner loop does the real
work, as it steps over each cell one at a time.

For anybody not familiar with cellular automatas, there is a pretty java
animation of how cells get calculated here:
http://math.hws.edu/xJava/CA/CA.html

And this is what CAs they look like when you plot them, with 1s as black and
0s as white, and some additional info:
http://mathworld.wolfram.com/Rule30.html

I've just hardcoded rule 30 into this for testing. I make a blank matrix
with makeMatrix(). Then makeArray(matrix) sets the middle element of the
first row in the matrix to 1 to seed the CA, and loops through it two rows
at a time, calculating the results of the first row, and putting them into
the next row.

Right now I get:

>>> timeArray(5)
Total time: 3.95693028

I am uneasy with the algorithm getting each element one at a time, throwing
them away, and getting two that overlap in the next step of the window, but
I couldn't come up with a more elegant solution. Also, it seems like a
kludgey way to convert the 3 bits into a binary number, but it was the
fastest way I stumbled on to.

w = 1000
h = 1000
rule = [0,1,1,1,1,0,0,0]

def makeMatrix(w,h):
result = *h
for i in range(h):
result[i] = *w
return result

def makeArray(matrix):
result = matrix
result[w/2] = 1

for row in range(h-1):
last = result[row]
next = result[row+1]
for i in range(w-1):
next[i] = rule[4*last[i-1]+2*last[i]+last[i+1]]
next[i+1] = rule[4*last[i]+2*last[i+1]+last]
return result

def timeArray(num):
import time
matrix = makeMatrix(w,h)
t1 = time.clock()
for i in range(num):
makeArray(matrix)
t2 = time.clock()
print "Total time:", t2 - t1

timeArray(5)
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mail.python.org/pipermail/tutor/attachments/20050918/bcf358ac/attachment.htm
```