Program very slow to finish

Paul Rubin phr-n2001d at nightsong.com
Sun Nov 4 16:38:44 EST 2001


Fred <fredNo at nospamco.com> writes:
>     It does look like a bug.  os._exit(0) ends the program right after the
> print statements, taking about 1/3 of the time.
> 
>     I see another respondent to this thread has done more tests and
> has opened a bug report.

Yes, that was interesting.  I hope it gets fixed.
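
For anyone following along, the workaround can be sketched roughly as
below.  I'm assuming the call meant is os._exit (with the leading
underscore), which leaves the process without running the interpreter's
normal cleanup, so the big dictionary never gets torn down entry by
entry.  The names tally and fast_exit are made up for illustration.

```python
import os
import sys


def tally(values):
    # Hypothetical stand-in for the real counting loop.
    nacc = {}
    for v in values:
        nacc[v] = nacc.get(v, 0) + 1
    return nacc


def fast_exit(code=0):
    # os._exit does not flush buffered output, so flush by hand first.
    sys.stdout.flush()
    sys.stderr.flush()
    # Leave immediately, skipping interpreter shutdown and object cleanup.
    os._exit(code)
```

You'd call fast_exit() as the very last statement, after the print
statements.  It also skips atexit handlers and anything else that
normally runs at shutdown, so it's only reasonable when nothing else
needs to be cleaned up.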

> > Also, counting the entries with len(nacc.keys()) is pretty horrendous
> > if the number of entries is very large (several million).  For 120,000
> > it's probably tolerable.
> >
> > If you expect to get a much larger number of distinct values (like
> > tens or hundreds of millions) in your 100 GB data set, you
> > probably don't want to use Python dictionaries to remember them.  The
> > overhead per value is rather large, like dozens of bytes.
> >
> 
>     Well, it looks like I'm going to top out at about 600K values for the
> three variables; so far the times look pretty good.  A first test had Python
> beating compiled Fortran, SAS and Perl.

600K values should be tolerable on a largish PC (e.g. 256 MB or more
of memory).  Memory is so cheap these days that it's probably easier
to add a GB of RAM if you need it than to mess with fancy algorithms.
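
If you want a rough feel for whether the table will fit, something like
the sketch below may help.  The helper names and the integer keys are
just for illustration, and sys.getsizeof only measures the dictionary's
own hash table, so the key and value objects come on top of that.  It
also counts entries with len() on the dictionary itself rather than
len(nacc.keys()); in current Python 3, keys() returns a view instead of
building a list, so the difference matters less there.

```python
import sys


def distinct_count(nacc):
    # len() on the dict itself never builds a list of keys.
    return len(nacc)


def table_bytes_per_entry(nacc):
    # Size of the dict's own hash table only; keys and values are extra.
    return sys.getsizeof(nacc) / max(len(nacc), 1)


# Made-up data: 600,000 distinct integer keys, each seen once.
nacc = {}
for i in range(600_000):
    nacc[i] = nacc.get(i, 0) + 1

print(distinct_count(nacc), "entries")
print(round(table_bytes_per_entry(nacc)), "bytes of table per entry")
```

The real per-entry cost depends on what the keys and values are, so
treat the number as a lower bound.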


