[Python-Dev] stackable ints [stupid idea (ignore) :v]

Guido van Rossum guido at CNRI.Reston.VA.US
Mon Jun 14 17:12:33 CEST 1999


[me]
> > - Thus, lists take double the memory assuming they reference objects
> > that also exist elsewhere.  This affects the performance of slices
> > etc.
> >
> > - On the other hand, a list of ints takes half the memory (given that
> > most of those ints are not shared).

[Tim]
> Isn't this 2/3 rather than 1/2?  I'm picturing a list element today as
> essentially a pointer to a type object pointer + int (3 units in all), and a
> type object pointer + int (2 units in all) "tomorrow".  Throw in refcounts
> too and the ratio likely gets closer to 1.

An int is currently 3 units: type, refcnt, value.  (The sepcial int
allocator means that there's no malloc overhead.)  A list item is one
unit.  So a list of N ints is 4N units (+ overhead).  In the proposed
scheme, there would be 2 units.  That makes a factor of 1/2 for me...

> Well, Python already has homogeneous int lists (array.array), and while they
> save space they suffer in speed due to needing to wrap raw ints "in an
> object" upon reference and unwrap them upon storage.

Which would become faster with the proposed scheme since it would not
require any heap allocation (presuming 2-unit structs can be passed
around as function results).

> > - Reference count manipulations could be done by a macro (or C++
> > behind-the-scense magic using copy constructors and destructors) that
> > calls a function in the type object -- i.e. each object could decide
> > on its own reference counting implementation :-)
> 
> You don't need to switch representations to get that, though, right?  That
> is, I don't see anything stopping today's type objects from growing
> __incref__ and __decref__ slots -- except for common sense <wink>.

Eh, indeed <blush>.

> An apparent ramification I don't see above that may actually be worth
> something <wink>:
> 
> - In "i = j + k", the eval stack could contain the ints directly, instead of
> pointers to the ints.  So fetching the value of i takes two loads (get the
> type pointer + the variant) from adjacent stack locations, instead of
> today's load-the-pointer + follow-the-pointer (to some other part of
> memory); similarly for fetching the value of j.  Then the sum can be stored
> *directly* into the stack too, without today's need for allocating and
> wrapping it in "an int object" first.

I though this was assumed all the time?  I mentioned "no heap
allocation" above before I read this.  I think this is the reason why
it was proposed at all: things for which the value fits in a unit
don't live on the heap at all, *without* playing tricks with pointer
representations.

> Possibly happy variant:  on top of the above, *don't* exempt ints from
> refcounting.  Let 'em incref and decref like everything else.  Give them an
> intial refcount of max_count/2, and in the exceedingly unlikely event a
> decref on an int ever sees zero, the int "destructor" simply resets the
> refcount to max_count/2 and is otherwise a nop.

Don't get this -- there's no object on the heap to hold the refcnt.

--Guido van Rossum (home page: http://www.python.org/~guido/)




More information about the Python-Dev mailing list