Program very slow to finish

Roeland Rengelink r.b.rigilink at chello.nl
Sun Nov 4 11:47:44 CET 2001


Hi Fred,

I'm pretty sure this is a problem with the garbage collector. I did some
measurements showing bad (O(n**2)) performance when collecting large
numbers of objects at once (as happens when your dicts are deleted at
the end of your program), and I filed a bug report (#477963) on it. See:

http://sourceforge.net/tracker/index.php?func=detail&aid=477963&group_id=5470&atid=105470

for more details.
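
A rough sketch of how one might check for this kind of slowdown. This
is not the script from the bug report: the function name and the sizes
are made up for illustration, and it assumes a current Python 3. It
builds a dict with n distinct keys, then times how long it takes to
throw the dict away again.

```python
import gc
import time


def time_build_and_delete(n):
    """Build a dict with n distinct string keys, then time its deletion.

    Hypothetical helper for illustration only.
    Returns (build_seconds, delete_seconds).
    """
    start = time.perf_counter()
    counts = {}
    for i in range(n):
        counts[str(i)] = i
    build_seconds = time.perf_counter() - start

    start = time.perf_counter()
    del counts      # drop the only reference, so the dict is freed here
    gc.collect()    # also run the cyclic collector explicitly
    delete_seconds = time.perf_counter() - start
    return build_seconds, delete_seconds


if __name__ == "__main__":
    # Sizes are arbitrary; double n each time to see how the cost scales.
    for n in (100_000, 200_000, 400_000):
        build, delete = time_build_and_delete(n)
        print(f"{n:>8} entries: build {build:.3f}s, delete {delete:.3f}s")
```

If the delete time grows much faster than n when n doubles, you are
probably looking at the same effect. The actual numbers will depend on
the Python version and platform.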

Roeland

Fred wrote:
> 
> I'm dealing with about 100Gb of data that I first just need to
> characterize.  So, since the slowest part will be simply reading the
> data, I'm testing various languages and methods on a 10 line subset, a
> 32 Mb subset and a 750 Mb subset.
> 
> The following python program prints out the results in about 30 seconds,
> however it doesn't finish for another minute with the 32 Mb set of
> data!  At first I thought it was stuck and killed it; however I finally
> let it run on the smaller data set and all was well.
> 
> Is this a garbage collection issue?  Is there a better way to count the
> individual values than dictionaries?  I put the sys.exit call in while
> trying to figure out what was happening but it didn't make a difference.
> 

[snip code]

-- 
r.b.rigilink at chello.nl

"Half of what I say is nonsense. Unfortunately I don't know which half"


