[Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

Leif Walsh leif.walsh at gmail.com
Sat Dec 20 22:01:59 CET 2008

(@Skip, Michael, Tim)

On Sat, Dec 20, 2008 at 3:26 PM,  <skip at pobox.com> wrote:
> Because useful side effects are sometimes performed as a result of this
> activity (flushing disk buffers, closing database connections, etc).

Of course they are.  But what about the case given above:

On Sat, Dec 20, 2008 at 5:55 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> I was originally one of the skeptics until I reproduced the original
> poster's problem. I generated a sample file of 8 million key/value pairs
> as a 370MB text file. Reading it into a dict took two and a half minutes
> on my relatively slow computer. But deleting the dict took more than 30
> minutes even with garbage collection switched off.

It might be a semantic change that I'm looking for here, but it seems
to me that if you turn off the garbage collector, you should be able
to expect either that it also won't run on exit, or that there is a
way to tell it not to run on exit.  If I'm running without a garbage
collector, that assumes I'm at least cocky enough to think I know when
I'm done with my objects, so I should know to delete the objects that
have __del__ methods I care about before I exit.  Well, maybe; I'm
sure one of you could drag out a programmer who would make that
mistake, but turning off the garbage collector seems to me to signal
at least a little experience.
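
To make the request concrete, here is a minimal sketch of the closest
thing available today, assuming it is acceptable to do the important
cleanup by hand and then leave via os._exit(), which ends the process
without normal interpreter teardown.  The names CHILD, Resource and
run_child are hypothetical and exist only for this illustration; the
child runs in a subprocess so the hard exit does not kill the caller.

```python
import subprocess
import sys
import textwrap

# Hypothetical child program: builds a large dict, cleans up by hand,
# then exits without letting the interpreter tear the dict down.
CHILD = textwrap.dedent("""
    import gc, os, sys

    class Resource(object):
        def close(self):
            sys.stdout.write("closed\\n")

    gc.disable()                  # turns off only the cyclic collector
    big = dict((i, str(i)) for i in range(100000))
    res = Resource()

    res.close()                   # the side effect we actually care about
    sys.stdout.flush()            # os._exit() does not flush stdio buffers
    os._exit(0)                   # leave without deallocating 'big'
""")

def run_child():
    """Run CHILD in a fresh interpreter; return (exit code, stdout text)."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdout=subprocess.PIPE)
    out, _ = proc.communicate()
    return proc.returncode, out.decode("ascii").strip()

if __name__ == "__main__":
    print(run_child())
```

The cost is exactly the one Skip points out: anything not closed or
flushed explicitly before os._exit() is simply dropped.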

Does the garbage collector run any differently when the process is
exiting?  It seems that it wouldn't need to do anything more than run
through all objects in the heap and delete them, which doesn't require
anything fancy, and it should be able to sort them by address to help
with caching.  If that is already what it does, then I guess it really
is the sheer number of function calls necessary that is causing such a
slowdown in the cases we've seen, but I find this hard to believe.
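
One point worth checking here: in CPython, gc.disable() only switches
off the cyclic collector, while reference counting still deallocates
each object as soon as its last reference goes away.  The sketch below
illustrates that, assuming CPython (other implementations may behave
differently); Tracked and count_freed are hypothetical names used only
for this example.

```python
import gc

class Tracked(object):
    """Counts how many instances have been finalized."""
    freed = 0

    def __del__(self):
        Tracked.freed += 1

def count_freed(n):
    """Build a dict of n Tracked objects with the cyclic GC disabled,
    delete the dict, and return how many finalizers ran."""
    Tracked.freed = 0
    was_enabled = gc.isenabled()
    gc.disable()                  # the cyclic collector is off from here on
    try:
        d = dict((i, Tracked()) for i in range(n))
        del d                     # refcounting frees every value right away
        return Tracked.freed
    finally:
        if was_enabled:
            gc.enable()

if __name__ == "__main__":
    print(count_freed(1000))
```

So deleting the dict still means one deallocation per object, with or
without the collector, which would be consistent with Steven's timing.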

