Anyone happen to have optimization hints for this loop?
doug.farrell at gmail.com
Wed Jul 9 18:47:32 CEST 2008
On Jul 9, 12:04 pm, dp_pearce <dp_pea... at hotmail.com> wrote:
> I have some code that takes data from an Access database and processes
> it into text files for another application. At the moment, I am using
> a number of loops that are pretty slow. I am not a hugely experienced
> python user so I would like to know if I am doing anything
> particularly wrong or that can be hugely improved through the use of
> another method.
> Currently, all of the values that are to be written to file are pulled
> from the database and into a list called "domainVa". These values
> represent 3D data and need to be written to text files using line
> breaks to seperate 'layers'. I am currently looping through the list
> and appending a string, which I then write to file. This list can
> regularly contain upwards of half a million values...
> count = 0
> dmntString = ""
> for z in range(0, Z):
> for y in range(0, Y):
> for x in range(0, X):
> fraction = domainVa[count]
> dmntString += " "
> dmntString += fraction
> count = count + 1
> dmntString += "\n"
> dmntString += "\n"
> dmntString += "\n***\n
> dmntFile = open(dmntFilename, 'wt')
> I have found that it is currently taking ~3 seconds to build the
> string but ~1 second to write the string to file, which seems wrong (I
> would normally guess the CPU/Memory would out perform disc writing
> Can anyone see a way of speeding this loop up? Perhaps by changing the
> data format? Is it wrong to append a string and write once, or should
> hold a file open and write at each instance?
> Thank you in advance for your time,
Looking at the code sample you sent, you could do some clever stuff
making dmntString a list rather than a string and appending everywhere
you're doing a +=. Then at the end you build the string your write to
the file one time with a dmntFile.write(''.join(dmntList). But I think
the more straightforward thing would be to replace all the dmntString
+= ... lines in the loops with a dmntFile.write(whatever), you're just
constantly adding onto the file in various ways.
I think the slowdown you're seeing your code as written comes from
Python string being immutable. Every time you perform a dmntString
+= ... in the loops you're creating a new dmntString, copying in the
contents of the old, plus the appended content. And if your list can
reach a half a million items, well that's a TON of string create,
string copy operations.
Hope you find this helpful,
More information about the Python-list