Possible memory leak?

Steven D'Aprano steve at REMOVETHIScyber.com.au
Tue Jan 24 17:11:01 EST 2006


On Tue, 24 Jan 2006 13:51:36 -0800, Tuvas wrote:

> The purpose of this part of a program is to take a 14 bit numerical
> representation and convert it to an 8 bit representation. This will
> later be displayed as an image. However, I've noticed the following
> about this code. I was noticing when I took a small picture, it went
> really fast, but a larger picture took forever to run through. 

for i in range(10):
    for j in range(10):
        pass  # do stuff with i and j
        # this loops 100 times.

for i in range(1000):
    for j in range(1000):
        pass  # do stuff with i and j
        # this loops 1,000,000 times.

Of course the code gets slower as the image size increases: the work grows
with the number of pixels, which is width times height.

But the real killer is this one line:

row=row+chr(num/64)

Bad, bad BAD idea. Every time you add two strings together, Python has to
copy BOTH strings into a brand new string. As row gets huge, each addition
takes longer and longer, so the total time grows roughly with the square of
the length of row.

A rule of thumb I use is, never add more than two strings together. Maybe
three. Certainly not more than four. Or five. 

But absolutely not millions of strings, which is what you are doing.
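To put rough numbers on that, here is a small sketch that counts how many
characters get copied while building a row one character at a time. It
assumes the simple model described above, where every addition copies both
operands; an interpreter may optimise some cases, so treat it as an upper
bound rather than a measurement. The function names are made up for this
illustration.

```python
def chars_copied_by_repeated_concat(n):
    """Characters copied when an n-character string is built with
    row = row + c, assuming each addition copies both operands."""
    copied = 0
    length = 0  # current length of row
    for _ in range(n):
        copied += length + 1  # the old row plus the one new character
        length += 1
    return copied


def chars_copied_by_join(n):
    """Characters copied when n one-character pieces are joined once."""
    return n


print(chars_copied_by_repeated_concat(1000))  # 500500
print(chars_copied_by_join(1000))             # 1000
```

Under this model a row of 1000 characters already costs about 500 times
more copying than a single join, and the gap widens as the row grows.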

The best way to handle this is to turn row into a list, and then once, at
the very end, convert the list into a string.

Instead of this:

row = ""
# processing in a loop...
row = row + chr(num/64)
# finished processing
print row

do this instead:

row = []
# processing in a loop...
row.append(chr(num/64))
# finished processing
row = "".join(row)  # convert to a string in one hit
print row

You should find that runs much faster.
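For anyone who wants something to paste and run, here is a minimal sketch
of the list-and-join approach wrapped in a function. I have not seen the
rest of the original program, so convert_row and values are hypothetical
names, and the sketch is written for a current Python, where integer
division is spelled // and print is a function.

```python
def convert_row(values):
    """Scale 14-bit samples (0..16383) down to 8 bits (0..255) and
    return them as one string, one character per sample."""
    pieces = []
    for num in values:
        # Dividing by 64 drops the low 6 bits: 14 bits -> 8 bits.
        pieces.append(chr(num // 64))
    # Build the final string once, at the very end.
    return "".join(pieces)


row = convert_row([0, 64, 8192, 16383])
print([ord(c) for c in row])  # [0, 1, 128, 255]
```

If the result is meant to be raw image bytes rather than text, something
like bytes(num // 64 for num in values) may be a better fit in a current
Python, but that depends on what the display code expects.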

> I suspect if I somehow deallocate the row statement
> after it's done, that it will run faster, and the data variable when
> it's done, but I just don't know how to do so.

Once nothing refers to row and data any more (for example, when the
function that created them returns), Python automatically removes them and
reclaims the memory they used. You rarely need to worry about that.
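If you want to see that for yourself, here is a small sketch using a weak
reference as a probe. The Row class and the function name are hypothetical
stand-ins for a large buffer. In CPython the object normally goes away as
soon as its last reference is dropped; the gc.collect() call is only there
so the sketch also behaves on implementations that collect later.

```python
import gc
import weakref


class Row:
    """Hypothetical stand-in for a large row buffer."""

    def __init__(self, data):
        self.data = data


def reclaimed_after_release():
    row = Row("x" * 1000)
    probe = weakref.ref(row)  # does not keep row alive
    del row                   # drop the only real reference
    gc.collect()              # not needed in CPython, harmless elsewhere
    return probe() is None    # True once the object has been reclaimed


print(reclaimed_after_release())
```

You would normally not write del at all; letting the name go out of scope
has the same effect.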


-- 
Steven.



