On 2/13/07, Mark Janikas <mjanikas@esri.com> wrote:
Good call Stefan,
I decoupled the timing from the application (duh!) and got better results:
from numpy import *
import numpy.random as RAND
import time as TIME
x = RAND.random(1000)
xl = x.tolist()
t1 = TIME.clock()
xStringOut = [ str(i) for i in xl ]
xStringOut = " ".join(xStringOut)
f = file('blah.dat','w'); f.write(xStringOut); f.close()
t2 = TIME.clock()
total = t2 - t1
t1 = TIME.clock()
f = file('blah.bwt','wb')
xBinaryOut = x.tostring()
f.write(xBinaryOut); f.close()
t2 = TIME.clock()
total1 = t2 - t1
>>> total
0.00661
>>> total1
0.00229
Writing x directly as a string took REALLY long: f.write(str(x)) clocked in at 0.0258.
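For completeness, the binary route is also lossless on the way back in; a minimal round-trip sketch (using the modern tobytes/frombuffer spellings rather than the tostring/fromstring names current at the time):

```python
import numpy as np

x = np.random.random(1000)

# Write the raw float64 bytes to disk.
with open('blah.bwt', 'wb') as f:
    f.write(x.tobytes())

# Read them straight back into an array of the same dtype.
with open('blah.bwt', 'rb') as f:
    y = np.frombuffer(f.read(), dtype=np.float64)

assert np.array_equal(x, y)  # binary round-trip is exact
```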
The problem, therefore, must be in the way I am appending values to the empty arrays. I am currently using the append function:
myArray = append(myArray, newValue)
Or would it be faster to concatenate, or to append to a list and then convert?
I am going to guess that a list would be faster for appending. concatenate and, I suspect, append create a brand-new array on every call, rather like string concatenation in Python, so a loop of appends ends up doing O(n^2) copying. A Python list, on the other hand, is optimized for amortized constant-time appends, so you pay for the conversion to an array only once at the end. Another option might be PyTables with extensible arrays. In any case, a bit of timing should show the way if performance is that crucial to your application.
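Chuck's suggestion is easy to check directly; here is a rough timing sketch (the size N and variable names are my own) comparing repeated numpy.append against a plain list append followed by a single conversion:

```python
import time
import numpy as np

N = 10000

# Repeated numpy.append: allocates and copies a new array on every call.
t0 = time.perf_counter()
a = np.empty(0)
for i in range(N):
    a = np.append(a, float(i))
t_append = time.perf_counter() - t0

# List append, then one conversion to an array at the end.
t0 = time.perf_counter()
lst = []
for i in range(N):
    lst.append(float(i))
b = np.array(lst)
t_list = time.perf_counter() - t0

assert np.array_equal(a, b)    # both routes build the same array
print(t_append, t_list)        # the list route is typically far faster
```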
Chuck