[pypy-commit] pypy default: removing one of the minimark sections, because there are two identical minimark sections in this file.
noreply at buildbot.pypy.org
Sat Jun 18 11:17:02 CEST 2011
Author: David Schneider <david.schneider at picle.org>
Date: 2011-06-18 11:20 +0200
Log: removing one of the minimark sections, because there are two
identical minimark sections in this file.
diff --git a/pypy/doc/garbage_collection.rst b/pypy/doc/garbage_collection.rst
@@ -212,90 +212,4 @@
becomes free garbage, to be collected at the next major collection.
-This is a simplification and rewrite of the ideas from the Hybrid GC.
-It uses a nursery for the young objects, and mark-and-sweep for the old
-objects. This is a moving GC, but objects may only move once (from
-the nursery to the old stage).
-The main difference with the Hybrid GC is that the mark-and-sweep
-objects (the "old stage") are directly handled by the GC's custom
-allocator, instead of being handled by malloc() calls. The gain is that
-it is then possible, during a major collection, to walk through all old
-generation objects without needing to store a list of pointers to them.
-So as a first approximation, when compared to the Hybrid GC, the
-Minimark GC saves one word of memory per old object.
-There are a number of environment variables that can be tweaked to
-influence the GC. (Their default value should be ok for most usages.)
-You can read more about them at the start of
-In more details:
-- The small newly malloced objects are allocated in the nursery (case 1).
- All objects living in the nursery are "young".
-- The big objects are always handled directly by the system malloc().
- But the big newly malloced objects are still "young" when they are
- allocated (case 2), even though they don't live in the nursery.
-- When the nursery is full, we do a minor collection, i.e. we find
- which "young" objects are still alive (from cases 1 and 2). The
- "young" flag is then removed. The surviving case 1 objects are moved
- to the old stage. The dying case 2 objects are immediately freed.
-- The old stage is an area of memory containing old (small) objects. It
- is handled by `pypy/rpython/memory/gc/minimarkpage.py`_. It is organized
- as "arenas" of 256KB or 512KB, subdivided into "pages" of 4KB or 8KB.
- Each page can either be free, or contain small objects of all the same
- size. Furthermore at any point in time each object location can be
- either allocated or freed. The basic design comes from ``obmalloc.c``
- from CPython (which itself comes from the same source as the Linux
- system malloc()).
-- New objects are added to the old stage at every minor collection.
- Immediately after a minor collection, when we reach some threshold, we
- trigger a major collection. This is the mark-and-sweep step. It walks
- over *all* objects (mark), and then frees some fraction of them (sweep).
- This means that the only time when we want to free objects is while
- walking over all of them; we never ask to free an object given just its
- address. This allows some simplifications and memory savings when
- compared to ``obmalloc.c``.
-- As with all generational collectors, this GC needs a write barrier to
- record which old objects have a reference to young objects.
-- Additionally, we found out that it is useful to handle the case of
- big arrays specially: when we allocate a big array (with the system
- malloc()), we reserve a small number of bytes before. When the array
- grows old, we use the extra bytes as a set of bits. Each bit
- represents 128 entries in the array. Whenever the write barrier is
- called to record a reference from the Nth entry of the array to some
- young object, we set the bit number ``(N/128)`` to 1. This can
- considerably speed up minor collections, because we then only have to
- scan 128 entries of the array instead of all of them.
-- As usual, we need special care about weak references, and objects with
- finalizers. Weak references are allocated in the nursery, and if they
- survive they move to the old stage, as usual for all objects; the
- difference is that the reference they contain must either follow the
- object, or be set to NULL if the object dies. And the objects with
- finalizers, considered rare enough, are immediately allocated old to
- simplify the design. In particular their ``__del__`` method can only
- be called just after a major collection.
-- The objects move once only, so we can use a trick to implement id()
- and hash(). If the object is not in the nursery, it won't move any
- more, so its id() and hash() are the object's address, cast to an
- integer. If the object is in the nursery, and we ask for its id()
- or its hash(), then we pre-reserve a location in the old stage, and
- return the address of that location. If the object survives the
- next minor collection, we move it there, and so its id() and hash()
- are preserved. If the object dies then the pre-reserved location
- becomes free garbage, to be collected at the next major collection.
.. include:: _ref.txt
More information about the pypy-commit