[Python-3000] Delayed reference counting idea

Marcin 'Qrczak' Kowalczyk qrczak at knm.org.pl
Tue Sep 19 18:37:55 CEST 2006


Barry Warsaw <barry at python.org> writes:

> It's not external resources I'm concerned about, it's the physical
> memory consumed in process by objects which are unreachable but not
> reclaimed.

The rate of garbage collection depends on the rate of allocation.
While the objects are not freed in the earliest possible moment,
they are freed if the memory is needed for other objects.


Brian Quinlan <brian at sweetapp.com> writes:

> Do somehow know that tracing GC would be more efficient for typical 
> python programs or are you just speculating?

I'm mostly speculating. It's hard to measure the difference between
garbage collection schemes because most language runtimes are tied
to a particular GC implementation, and thus you can't substitute a
different GC leaving everything else the same.

I've done some experiments with C++-based GCs incl. reference counting,
but they were inconclusive. The effects strongly depend on the kind of
the program and the amount of memory it uses, and various GC schemes
are better or worse in different scenarios.

>> It involves operations every time an object is merely passed around,
>> as references to the object are created or destroyed.
>
> But if the lifetime of most objects is confined to a single function 
> call, isn't reference counting going to be quite efficient?

Even if an object begins and ends its lifetime within a particular
function call, it's usually passed down to other functions in the
meantime.

Every time a Python function returns an object, the reference count
on the result is incremented, and it's decremented at some time by
its caller. Every time a function implemented in Python is called,
reference counts of its parameters are incremented, and they are
decremented when it returns. Every time a None is stored in a data
structure or returned from a function, its reference count is
incremented. Every time a list is freed, reference counts of objects
it refers to is decremented. Every time two ints are added, the
reference count of the result is incremented even if that integer
was preallocated. Every time a field is assigned to, two reference
counts are manipulated.

>> It doesn't move objects in memory, and thus free memory is fragmented.
>
> OK. Have you had memory fragmentation problems with Python?

Indirectly: memory allocation can't be as fast as in some GC schemes.

>> Memory allocation can't just chop from from a single area of free memory.
>> It can't allocate several objects with the cost of one allocation either.
>
> I'm not sure what you mean here.

There are various GCs (including OCaml, and probably Sun's Java and
Microsoft's .NET implementations, and my language implementation,
and surely others) where the fast path of memory allocation looks
like stack allocation with overflow checking.

Moreover, if several objects are to be allocated at once (which I
admit is more likely in compiled code), the cost is still the same
as allocation of one object of the size being the sum of sizes of the
objects (not counting filling the objects with contents). There are no
per-allocated-object data to fill besides a header which points to a
static structure which describes the layout.

-- 
   __("<         Marcin Kowalczyk
   \__/       qrczak at knm.org.pl
    ^^     http://qrnik.knm.org.pl/~qrczak/


More information about the Python-3000 mailing list