A couple garbage collector questions
Douglas Alan
nessus at mit.edu
Tue Apr 3 00:33:07 EDT 2001
kragen at dnaco.net (Kragen Sitaker) writes:
> Douglas Alan <nessus at mit.edu> wrote:
> >The fact that Python uses referencing counting for its front-line GC
> >has some very desirable properties. Two come immediately to mind:
> > (1) Predictable destructor semantics. This is in contrast to Java,
> > in which destructors are useless because you don't know if and
> > when they will ever be called.
> I mostly disagree. If the timing of destructor calls was really
> predictable, you'd just call them yourself instead of depending on
> the GC to do it for you.
No I wouldn't. I don't like writing code when the language can do
things for me automatically and save me the work.
> If you know you're done with a file, for example,
> you can close it;
I could. But I'd prefer the language to close it for me.
> if no part of your program knows when to close it (because more than
> one part of your program is using it, and they stop using it in an
> indeterminate order) then you don't really know when the destructor
> will be called, either.
I know that the destructor will be called as soon as the object is no
longer needed. That is all the predictability that I require.
> It *is* more deterministic than a real GC, but it's still a long way
> from "predictable" by my lights.
It's predictable enough to do the job the way that I want it to be
done, while with a "real GC", it isn't.
> > (2) No long GC pauses.
> Again, I mostly disagree. You get long GC pauses when the last
> reference to any large, complex data structure (or a data structure
> with slow destructors) gets overwritten or goes out of scope.
When such pauses occur is much more predictable than the pauses that
occur with a non-realtime GC, and such pauses are typically much
shorter. Realtime GC's are cool, but my impression is that they are
fairly expensive and hairy to implement.
> >You might be able to make a "real garbage collector" that has these
> >properties, but it would, I believe, be pretty complicated.
> You can't even make a reference-counting GC that has these
> properties --- you need C++-style (Limbo-style?) auto_ptrs.
auto_ptrs are not remotely like GC, so I don't see how this is
relevant. Reference counting gives you most of the benefits of a real
GC, while preserving the usefulness of destructors. auto_ptrs are
merely a shorthand so that you don't have to explicitly free storage.
> > Since referencing-counting typically gets rid of 99.9% of all garbage,
> Reference-counting typically gets rid of 100% of all garbage.
Let's not quibble over .1%?
> But sometimes it gets rid of 0% of all garbage, or (more often)
> somewhere in between. Which garbage it will actually collect is a
> global property of your program and cannot always be deduced from
> looking at the code that handles that garbage.
The operative words are "cannot *always* be deduced". This is
consistent with "can *often* be deduced". The vast majority of useful
programs do not require a real GC. When you need one, however, it is
*very* nice to have a real GC. Consequently, I am very happy that
they finally added one to Python.
> > backing it up with a simple mark & sweep GC seems adequate for
> > most purposes.
> Reference-counting exacts very heavy performance costs, no matter
> what you back it up with.
What's so heavy about its performance cost? Will a mark & sweep GC
perform better on average? I'm skeptical. A compacting, copying
garbage collector probably will, but it can double the space cost of
your program.
|>oug
More information about the Python-list
mailing list