Python's garbage collection was Re: Python reliability
twic at urchin.earth.li
Thu Oct 13 01:38:41 CEST 2005
On Wed, 12 Oct 2005, Jorgen Grahn wrote:
> On Mon, 10 Oct 2005 20:37:03 +0100, Tom Anderson <twic at urchin.earth.li> wrote:
>> On Mon, 10 Oct 2005, it was written:
>>> There is no way you can avoid making garbage. Python conses everything,
>>> even integers (small positive ones are cached).
>> So python doesn't use the old SmallTalk 80 SmallInteger hack, or similar?
> If the SmallInteger hack is something like this, it does:
>>>> a = 42
>>>> b = 42
>>>> a is b
>>>> a = 42000
>>>> b = 42000
>>>> a is b
> ... which I guess is what if referred to above as "small positive
> ones are cached".
That's not what i meant.
In both smalltalk and python, every single variable contains a reference
to an object - there isn't the object/primitive distinction you find in
less advanced languages like java.
Except that in smalltalk, this isn't true: in ST, every variable *appears*
to contain a reference to an object, but implementations may not actually
work like that. In particular, SmallTalk 80 (and some earlier smalltalks,
and all subsequent smalltalks, i think) handles small integers (those that
fit in wordsize-1 bits) differently: all variables contain a word, whose
bottom bit is a tag bit; if it's one, the word is a genuine reference, and
if it's zero, the top bits of the word contain a signed integer. The
innards of the VM know about this (where it matters), and do the right
thing. All this means that small (well, smallish - up to a billion!)
integers can be handled with zero heap space and much reduced instruction
counts. Of course, it means that references are more expensive, since they
have to be checked for integerness before dereferencing, but since this is
a few instructions at most, and since small integers account for a huge
fraction of the variables in most programs (as loop counters, array
indices, truth values, etc), this is a net win.
See the section 'Representation of Small Integers' in:
The precise implementation is sneaky - the tag bit for an integer is zero,
so in many cases you can do arithmetic directly on the word, with a few
judicious shifts here and there; the tag bit for a pointer is one, and the
pointer is stored in two's-complement form *with the bottom bit in the
same place as the tag bit*, so you can recover a full-length pointer from
the word by complementing the whole thing, rather than having to shift.
Since pointers are word-aligned, the bottom bit is always a zero, so in
the complement it's always a one, so it can also be the status bit!
I think this came from LISP initially (most things do) and was probably
invented by Guy Steele (most things were).
That's no moon!
More information about the Python-list