[New-bugs-announce] [issue39143] Implementing sub-generation steps in the gc
Pablo Galindo Salgado
report at bugs.python.org
Fri Dec 27 16:35:04 EST 2019
New submission from Pablo Galindo Salgado <pablogsal at gmail.com>:
While I was re-reading The Garbage Collection Handbook [Moss, Hosking, Jones], I have been doing some benchmarks of different aspects of our garbage collection and taking statistics using the pyperformance suite as long as several "production" applications that heavily creates objects. I have observed that our
collector presents a form of "generational nepotism" in some typical scenarios. The most common reason is
that when promotions of the youngest generations happen, some very young objects that just arrived in the generation are promoted because we have reached a threshold, and its death will be delayed. These objects
will likely die immediately if the promotion was delayed a bit (think of a burst of temporary objects being created at once) or if the promotion could distinguish "age" with more granularity.
The book proposes serval solutions to this problem, and one of the simpler ones taking the
architecture of our collector is to use sub-generation steps:
> Promotion can be delayed by structuring a generation into two or more aging spaces. This
allows objects to be copied between the fromspace and tospace an arbitrary number of
times within the generation before they are promoted. In Lieberman and Hewitt's original
generational collector , a generation is collected several times before all survivors
are eventually promoted en masse. In terms of the aging semispaces of Figure 9.2b, ei
ther all live objects in fromspace are evacuated to tospace within this generation or all
are promoted to the next generation, depending on the age of the generation as a whole.
Basically, the differences between steps and generations are that both segregate objects by age,
but different generations are collected at different frequencies whereas all the steps of a
generation are collected at the same time. By using steps in the youngest generation (where
most mutation occurs), and by reducing premature collection, the load on the write barrier
can be reduced while also controlling promotion, without need for per-object age records.
What do you think about implementing sub-generation steps? Maybe only on the youngest generation?
A "lazy" way of implementing this (although without the semantic of "steps") is adding more intermediate generations with the same threshold (but this likely won't yield the same benefits).
components: Interpreter Core
nosy: nascheme, pablogsal, tim.peters
title: Implementing sub-generation steps in the gc
versions: Python 3.9
Python tracker <report at bugs.python.org>
More information about the New-bugs-announce