[Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

Tim Peters tim.peters at gmail.com
Sat Dec 20 22:11:30 CET 2008


[Leif Walsh]
> ...
> It might be a semantic change that I'm looking for here, but it seems
> to me that if you turn off the garbage collector, you should be able
> to expect that either it also won't run on exit,

It won't then, but "the garbage collector" is the gc module, and that
only performs /cyclic/ garbage collection.  There is no way to stop
refcount-based garbage collection.  Read my message again.
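
A minimal sketch of that distinction, assuming CPython 3.4 or later
(not the 2.5.2 under discussion, where a cycle containing __del__
would land in gc.garbage instead).  The Noisy class and the finalized
list are made up for illustration:

```python
import gc

finalized = []  # names of objects whose __del__ has run

class Noisy:
    def __init__(self, name):
        self.name = name

    def __del__(self):
        finalized.append(self.name)

gc.disable()                    # switches off the cyclic collector only

a = Noisy("acyclic")
del a                           # refcount drops to zero: __del__ runs right away
after_del = list(finalized)

b = Noisy("cyclic")
b.me = b                        # self-reference keeps the refcount above zero
del b
after_cycle = list(finalized)   # "cyclic" has not been finalized yet

gc.collect()                    # an explicit collection still works while disabled
after_collect = list(finalized)
gc.enable()
```

With gc disabled, the acyclic object is still finalized the moment its
last reference goes away; only the object caught in a cycle waits for
the explicit gc.collect().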


> or it should have a
> way of letting you tell it not to run on exit.  If I'm running without
> a garbage collector, that assumes I'm at least cocky enough to think I
> know when I'm done with my objects, so I should know to delete the
> objects that have __del__ functions I care about before I exit.  Well,
> maybe; I'm sure one of you could drag out a programmer that would make
> that mistake, but turning off the garbage collector to me seems to
> send the experience message, at least a little.

This probably isn't a problem with cyclic gc (reread my msg).


> Does the garbage collector run any differently when the process is
> exiting?

No.


> It seems that it wouldn't need to do anything more than run
> through all objects in the heap and delete them, which doesn't require
> anything fancy,

Reread my msg -- already explained the likely cause here (if "all the
objects in the heap" have in fact been swapped out to disk, it can
take an enormously long time to just "run through" them all).


> and should be able to sort by address to aid with
> caching.

That one isn't possible.  There is no list of "all objects" to /be/
sorted.  The only way to find all the objects is to traverse the
object graph from its roots, which is exactly what non-cyclic gc does
anyway.
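
One way to see this from Python, as a sketch that assumes CPython and
the gc.is_tracked() call (added in releases later than 2.5.2): the
closest thing to a list of objects is gc.get_objects(), and it only
covers containers tracked by the cyclic collector.  Plain strings are
reachable only through the containers that refer to them.  The
variable names are made up for illustration:

```python
import gc

values = [str(i) for i in range(1000)]      # one list holding 1000 strings

container_tracked = gc.is_tracked(values)   # the list is known to cyclic gc
string_tracked = gc.is_tracked(values[0])   # a plain string is not

# Collect the identities of everything the cyclic collector knows about.
tracked_ids = {id(o) for o in gc.get_objects()}

list_is_listed = id(values) in tracked_ids
strings_listed = sum(id(s) in tracked_ids for s in values)
```

Here the list shows up in gc.get_objects() but none of its 1000
strings do, so there is nothing complete to sort by address.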


>  If it's already this fast, then I guess it really is the
> sheer number of function calls necessary that are causing such a
> slowdown in the cases we've seen, but I find this hard to believe.

My guess remains that CPU usage is trivial here, and 99.99+% of the
wall-clock time is consumed waiting for disk reads.  Either that, or
that platform malloc is going nuts.
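
A rough way to check that guess is to compare CPU time with wall-clock
time around the teardown.  This is only a sketch, assuming Python 3.3
or later for time.perf_counter() and time.process_time(); the
time_teardown helper is hypothetical, and a dict this small will not
reproduce the swapping seen with 45G:

```python
import time

def time_teardown(make_container):
    """Return (wall_seconds, cpu_seconds) spent freeing a container."""
    obj = make_container()
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    del obj                     # last reference: refcounting frees it here
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

wall, cpu = time_teardown(lambda: {i: str(i) for i in range(200000)})
print("wall %.4fs, cpu %.4fs" % (wall, cpu))
```

If wall-clock time dwarfs CPU time on the real workload, the process
is mostly waiting (for example on disk reads) rather than computing.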

