On Wed, Jun 25, 2008 at 4:55 PM, "Martin v. Löwis" <martin@v.loewis.de> wrote:
> It seems to me that the root problem is allocation spikes of legitimate,
> useful data. Perhaps then we need some sort of "test" to determine if
> those are legitimate. Perhaps checking every nth (with n decreasing as
> allocation bytes increases) object allocated during a "spike" could be
> useful. Then delay garbage collection until x consecutive objects are
> found to be garbage?
>
> It seems like we should be attacking the root cause rather than finding
> some convoluted math that attempts to work for all scenarios.

I think exactly the other way 'round. The timing of things should not
matter at all, only the exact sequence of allocations and deallocations.

I trust provable maths much more than I trust ad-hoc heuristics, even
if you think the math is convoluted.

I probably chose my wording poorly (particularly for a newcomer/outsider). What I meant was that the numbers currently used in the GC appear arbitrary, as does the idea of three "groups" (youngest, middle and oldest). Would it not be better to tear that system apart and create a "sliding" scale? If the timing method is undesirable, then make it slide based on the difference between allocations and deallocations. That way, the current breakpoints and number of groups (all of which are arbitrary and fixed) could be replaced by a single coefficient. Yes, I recognize that it would also be arbitrary, but it would be one tweakable number rather than several.
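
For anyone following along, the fixed numbers in question can be inspected and changed at runtime through the standard gc module. The snippet below is only a sketch of where those knobs live, not a proposed fix; the value 10000 is an arbitrary number picked for illustration, and the defaults printed will depend on the interpreter version.

```python
import gc

# The per-generation thresholds: the "arbitrary numbers" under discussion.
original = gc.get_threshold()
print("thresholds:", original)

# Current collection counts for each generation.
print("counts:", gc.get_count())

# Raise the youngest-generation threshold (10000 is an illustrative value,
# not a recommendation), so collections are triggered less often.
gc.set_threshold(10000, original[1], original[2])
raised = gc.get_threshold()
print("raised:", raised)

# Restore the original settings.
gc.set_threshold(*original)
restored = gc.get_threshold()
```

An application with a known allocation spike could raise the first threshold before the spike and restore it afterwards, which is a manual version of the tuning being debated here.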

My gut tells me that your current fix is going to work just fine for now but we're going to end up tweaking it (or at least discussing tweaking it) every 6-12 months.


> On a side note, the information about not GCing on string objects is
> interesting. Is there a way to override this behavior?

I think you misunderstand. Python releases unused string objects just
fine, and automatically. It doesn't even need GC for that.

I took the statement, "Current GC only takes into account container objects, which, most significantly, ignores string objects (of which most applications create plenty)" to mean that strings were ignored for deciding when to do garbage collection. I mistakenly thought that was because they were assumed to be small. It sounds like they're ignored because they're automatically collected and so they SHOULD be ignored for object garbage collection. Thanks for the clarification... Back to the drawing board on my other problem ;)
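
The distinction can be checked directly. This is a small sketch assuming a Python recent enough to have gc.is_tracked (it did not exist in the releases current when this thread was written): strings cannot contain references to other objects, so the cyclic collector does not track them, and they are freed by reference counting alone.

```python
import gc

# Build a string at runtime so it is an ordinary heap object.
s = "".join(["spam", "eggs"])

# Strings cannot take part in reference cycles, so the cyclic GC
# does not track them.
string_tracked = gc.is_tracked(s)

# A list is a container and is tracked by the cyclic GC.
list_tracked = gc.is_tracked([s])

print("string tracked by GC:", string_tracked)
print("list tracked by GC:", list_tracked)
```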

--
Haikus are easy
Most make very little sense
Refrigerator