[Python-Dev] last minute GC questions

Jeremy Hylton jeremy@beopen.com
Fri, 30 Jun 2000 16:57:44 -0400 (EDT)


I've got two last minute questions.

Does it look to you like I checked in all of the changes that you and
Vladimir discussed?

Might we change the strategy for deciding when to collect?

There are two parts of the strategy that could probably change.  The
first is what kind of allocation events we count to determine when to
collect.  

Right now, the gc is counting the net effect of allocations and
deallocations.  This isn't effective for at least a couple of cases.
If we allocate N objects and don't deallocate anything, then no
garbage is going to be created, yet the count still rises and can
trigger a collection.  If we have many objects currently allocated and
then dealloc N objects without allocating any, we could create
collectible garbage, but the collector won't run because there
haven't been any allocations.

It seems to me that counting deallocations only would be more
effective.  It is only the deallocations that cause a live object to
become garbage.
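
To make the difference concrete, here is a toy model of the two
counting rules.  It is only a sketch, not the actual gc code: the class
names, the run() helper, and the ">= threshold" test are all made up
for illustration, and I'm assuming the net counter is only checked on
allocation.

```python
# Toy model only; NetCounter, DeallocCounter and run() are hypothetical
# names, not part of the real collector.

class NetCounter:
    """Counts allocations minus deallocations; checked on allocation."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.collections = 0

    def alloc(self):
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0
            self.collections += 1  # stand-in for running the collector

    def dealloc(self):
        self.count -= 1  # never triggers a collection by itself


class DeallocCounter:
    """Counts deallocations only; checked on deallocation."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count = 0
        self.collections = 0

    def alloc(self):
        pass  # allocations are ignored under this rule

    def dealloc(self):
        self.count += 1
        if self.count >= self.threshold:
            self.count = 0
            self.collections += 1  # stand-in for running the collector


def run(counter, events):
    """Feed a sequence of "alloc"/"dealloc" events to a counter."""
    for event in events:
        if event == "alloc":
            counter.alloc()
        else:
            counter.dealloc()
    return counter.collections


only_allocs = ["alloc"] * 500      # no garbage can be created here
only_deallocs = ["dealloc"] * 500  # garbage may be created here

print(run(NetCounter(100), only_allocs))        # net rule collects anyway
print(run(NetCounter(100), only_deallocs))      # net rule never collects
print(run(DeallocCounter(100), only_allocs))    # dealloc rule stays idle
print(run(DeallocCounter(100), only_deallocs))  # dealloc rule collects
```

In this model the net rule collects in exactly the case where nothing
can have become garbage and stays idle in the case where something
might have.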

The other part of the strategy that might be changed is the collection
frequency.  Right now, the threshold is 100 net allocations and
deallocations.  On the compiler benchmark, this leads to some 2600
collections, which seems like a lot.  (I have no idea why it seems
like a lot, but it does.)

I experimented with a policy that runs the collector every N
deallocations (not counting deallocations that occur during a
collection).  I set N == 1000 and got 1600 collections on the compiler
benchmark.  
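
Roughly, the policy looks like the sketch below.  This is a Python
model for discussion, not the code I actually ran: DeallocTrigger,
on_dealloc(), and fake_collect() are hypothetical names, and the
"collector" here just pretends to free 50 objects.

```python
# Hypothetical model of "collect every N deallocations, ignoring
# deallocations that happen while a collection is in progress".

class DeallocTrigger:
    def __init__(self, n, collect):
        self.n = n                # deallocations between collections
        self.collect = collect    # callback standing in for the collector
        self.pending = 0          # deallocations seen since last collection
        self.collecting = False   # True while the collector is running
        self.collections = 0

    def on_dealloc(self):
        if self.collecting:
            # Freed by the collector itself; do not count it.
            return
        self.pending += 1
        if self.pending >= self.n:
            self.pending = 0
            self.collections += 1
            self.collecting = True
            try:
                self.collect(self)
            finally:
                self.collecting = False


def fake_collect(trigger):
    # Pretend the collector frees 50 objects; each one reports a
    # deallocation, which the trigger ignores while collecting.
    for _ in range(50):
        trigger.on_dealloc()


trigger = DeallocTrigger(1000, fake_collect)
for _ in range(3000):
    trigger.on_dealloc()
print(trigger.collections, trigger.pending)
```

Without the collecting flag, objects freed by the collector would
count toward the next threshold and bring the next collection closer.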

There is only a small speedup (just a few percent), so maybe this
change doesn't have a big effect.  I don't recall much about the
cost/complexity of various GC approaches.

Jeremy