[pypy-commit] pypy concurrent-marksweep: Add a textual introduction. Needs to think out more precisely
arigo
noreply at buildbot.pypy.org
Sun Dec 25 20:37:54 CET 2011
Author: Armin Rigo <arigo at tunes.org>
Branch: concurrent-marksweep
Changeset: r50863:b0bb363299dd
Date: 2011-12-25 20:37 +0100
http://bitbucket.org/pypy/pypy/changeset/b0bb363299dd/
Log: Add a textual introduction. Needs to think out more precisely what
it implies for the rest of the document.
diff --git a/pypy/rpython/memory/gc/concurrentgen.txt b/pypy/rpython/memory/gc/concurrentgen.txt
--- a/pypy/rpython/memory/gc/concurrentgen.txt
+++ b/pypy/rpython/memory/gc/concurrentgen.txt
@@ -1,14 +1,41 @@
+============================================================
+ Overview of the "concurrentgen" collector
+============================================================
+
+Goal: reduce the total real time by moving a part of the GC to its own
+thread that can run in parallel with the main execution thread.
+
+On current modern hardware with at least two cores, the two cores can
+read the same area of memory concurrently. If one of the cores writes
+to this area, then I believe that the core doing the writing works at
+full speed, whereas the core doing the reading suffers from waiting for
+the data to move to it; but it's still ok because the data usually moves
+in a cache-to-cache bus, not via the main memory. Also, if an area of
+memory is written to by one core, and then read and written to by the
+other core only, then performance is fine. The bad case is the one in
+which both cores continuously read and write the same area of memory.
+
+So, assuming that the main thread reads and writes to random objects all
+the time, it means that the GC thread should *only read* from the
+objects. Conversely, the data structures built by the GC thread should
+only be *read* from the main thread. In particular: when the GC thread
+does marking, it should use off-object bits; and sweeping should be
+done by adding free objects to lists that are not linked lists. In
+this way the GC thread never writes to the object's memory. Similarly,
+for the same reason, the GC thread should not reset areas of memory to
+zero in the background.
+
+
************************************************************
Minor collection cycles of the "concurrentgen" collector
************************************************************
-
Objects mark byte:
cym in 'mK': young objs (and all flagged objs)
cam in 'Km': aging objs
- '#' : old objs
- 'S' : static prebuilt objs with no heap pointer
+ '#' '/' : old objs
+ '5' : static prebuilt objs with no heap pointer
cym = current_young_marker
cam = current_aging_marker
@@ -29,7 +56,7 @@
Write barrier: change "old obj" to "flagged obj"
(if mark != cym:
- mark = cym (used to be '#' or 'S')
+ mark = cym (used to be '#' or '5')
record the object in the "flagged" list)
- note that we consider that flagged old objs are again young objects
@@ -72,7 +99,7 @@
trace and add to gray objs)
- also flag old-or-aging objs that point to new young objs
(if mark != cym:
- mark = cym (used to be '#' or 'S')
+ mark = cym (used to be '#' or '5')
record the object in the "flagged" list)
Threading issues:
@@ -99,7 +126,7 @@
if obj is "black": (i.e. if mark != cam)
make the obj old ( nothing to do here, mark already ok)
else:
- clear the object space and return it to the available list
+ return the object to the available list
after this there are no more aging objects
Write barrier:
@@ -107,4 +134,10 @@
- flag old objs that point to new young objs
(should not see any 'cam' object any more here)
-------------------------------------------------------------
+
+
+************************************************************
+ MAJOR collection cycles of the "concurrentgen" collector
+************************************************************
+
+NotImplementedError
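
The write barrier and the sweep step described in the patch above can be
sketched in plain Python. This is only an illustration under stated
assumptions, not code from the PyPy tree: the names Obj, MiniGC,
write_barrier and sweep are hypothetical, the sketch is single-threaded
and ignores the threading issues the document mentions, and it stores
the mark byte on the object for brevity, which the new introduction
suggests the GC thread itself should avoid writing to.

```python
# Hypothetical, single-threaded sketch of two steps from concurrentgen.txt.
# None of these names come from the PyPy source; they are illustrative only.

class Obj:
    """A heap object carrying one mark byte (kept on the object for brevity)."""
    def __init__(self, mark):
        self.mark = mark


class MiniGC:
    def __init__(self):
        self.cym = 'm'        # current_young_marker
        self.cam = 'K'        # current_aging_marker
        self.flagged = []     # old objects turned into "flagged" objects
        self.available = []   # freed objects, kept in a separate list
                              # instead of being chained through their memory

    def write_barrier(self, obj):
        # "if mark != cym: mark = cym; record the object in the flagged list"
        if obj.mark != self.cym:
            obj.mark = self.cym
            self.flagged.append(obj)

    def sweep(self, aging_objs):
        # An aging object is "black" if its mark is no longer cam: it becomes
        # old and nothing needs to be done.  Otherwise it is returned to the
        # available list, without clearing its memory.
        survivors = []
        for obj in aging_objs:
            if obj.mark != self.cam:
                survivors.append(obj)
            else:
                self.available.append(obj)
        return survivors


if __name__ == "__main__":
    gc = MiniGC()
    old = Obj('#')
    gc.write_barrier(old)          # old object becomes a flagged object
    dead, live = Obj('K'), Obj('#')
    survivors = gc.sweep([dead, live])
    print(len(gc.flagged), len(survivors), len(gc.available))
```

Keeping the available list outside the objects means the sweep never
writes into freed memory, which is the property the introduction asks
for; a real implementation would also have to decide which thread owns
each list.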