Anyone happen to have optimization hints for this loop?

Diez B. Roggisch deets at nospam.web.de
Wed Jul 9 11:48:00 EDT 2008


dp_pearce wrote:

> I have some code that takes data from an Access database and processes
> it into text files for another application. At the moment, I am using
> a number of loops that are pretty slow. I am not a hugely experienced
> Python user, so I would like to know if I am doing anything
> particularly wrong, or anything that could be hugely improved through
> the use of another method.
> 
> Currently, all of the values that are to be written to file are pulled
> from the database and into a list called "domainVa". These values
> represent 3D data and need to be written to text files using line
> breaks to separate 'layers'. I am currently looping through the list
> and appending a string, which I then write to file. This list can
> regularly contain upwards of half a million values...
> 
> count = 0
> dmntString = ""
> for z in range(0, Z):
>     for y in range(0, Y):
>         for x in range(0, X):
>             fraction = domainVa[count]
>             dmntString += "  "
>             dmntString += fraction
>             count = count + 1
>         dmntString += "\n"
>     dmntString += "\n"
> dmntString += "\n***\n
> 
> dmntFile     = open(dmntFilename, 'wt')
> dmntFile.write(dmntString)
> dmntFile.close()
> 
> I have found that it is currently taking ~3 seconds to build the
> string but ~1 second to write the string to file, which seems wrong (I
> would normally expect the CPU/memory to outperform disk writing
> speeds).
> 
> Can anyone see a way of speeding this loop up? Perhaps by changing the
> data format? Is it wrong to append to a string and write once, or
> should I hold a file open and write at each instance?

Don't use in-place addition (+=) to concatenate strings. It can lead to
quadratic behavior: each += may copy the entire string built so far, so
the total work grows with the square of the final length.
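
A quick way to see the difference is a timing sketch like the following
(not from your code; exact numbers vary by Python version, since CPython
can sometimes optimize += on strings in place):

import time

N = 100000
parts = ["0.5"] * N          # stand-in for the half a million values

start = time.time()
s = ""
for p in parts:
    s += "  " + p            # may copy the whole string built so far
print("+=   took %.3fs" % (time.time() - start))

start = time.time()
s = "".join("  " + p for p in parts)   # one pass, one final allocation
print("join took %.3fs" % (time.time() - start))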

Use the "".join()-idiom instead:

dmntStrings = []
....
    dmntStrings.append("\n")
....
dmntFile.write("".join(dmntStrings))
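
Applied to your loop, a minimal sketch might look like this (assuming,
as in your original code, that the values in domainVa are already
strings; otherwise wrap each one in str()):

count = 0
dmntStrings = []
for z in range(Z):
    for y in range(Y):
        for x in range(X):
            # collect the pieces in a list; join them once at the end
            dmntStrings.append("  " + domainVa[count])
            count += 1
        dmntStrings.append("\n")
    dmntStrings.append("\n")
dmntStrings.append("\n***\n")

dmntFile = open(dmntFilename, 'wt')
dmntFile.write("".join(dmntStrings))
dmntFile.close()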

Diez


