[Python-Dev] stackable ints [stupid idea (ignore) :v]

Guido van Rossum guido at CNRI.Reston.VA.US
Thu Jun 10 16:11:23 CEST 1999


[Aaron]
> I've always considered it a major shame that Python ints and floats
> and chars and stuff have anything to do with dynamic allocation, and
> I always suspected it might be a major speed performance boost if
> there was some way they could be manipulated without the need for
> dynamic memory management.

What you're describing is very close to what I recall I once read
about the runtime organization of Icon.  Perl may also use a variant
on this (it has fixed-length object headers).  On the other hand, I
believe Smalltalks typically uses something like the following ABC
trick:

In ABC, we used a variation: objects were represented by pointers as
in Python, except when the low bit was 1, in which case the remaining
31 bits were a "small int".  My experience with this approach was that
it probably saved some memory, but perhaps not time (since almost all
operations on objects were slowed down by the check "is it an int?"
before the pointer could be accessed); and that because of this it was
a major hassle in keeping the implementation code correct.  There was
always the temptation to make a check early in a piece of code and
then skip the check later on, which sometimes didn't work when objects
switched places.  Plus in general the checks made the code less
readable, and it was just one more thing to remember to do.

The Icon approach (i.e. yours) seems to require a complete rethinking
of all object implementations and all APIs at the C level -- perhaps
we could think about it for Python 2.0.  Some ramifications:

- Uses more memory for highly shared objects (there are as many copies 
of the type pointer as there are references).

- Thus, lists take double the memory assuming they reference objects
that also exist elsewhere.  This affects the performance of slices
etc.

- On the other hand, a list of ints takes half the memory (given that
most of those ints are not shared).

- *Homogeneous* lists (where all elements have the same type --
i.e. arrays) can be represented more efficiently by having only one
copy of the type pointer.  This was an idea for ABC (whose type system 
required all container types to be homogenous) that was never
implemented (because in practice the type check wasn't always applied, 
and the top-level namespace used by the interactive command
interpreter violated all the rules).

- Reference count manipulations could be done by a macro (or C++
behind-the-scense magic using copy constructors and destructors) that
calls a function in the type object -- i.e. each object could decide
on its own reference counting implementation :-)

--Guido van Rossum (home page: http://www.python.org/~guido/)




More information about the Python-Dev mailing list