Why do custom objects take so much memory?

Colin J. Williams cjw at sympatico.ca
Wed Dec 19 20:01:26 EST 2007


Hrvoje Niksic wrote:
> Steven D'Aprano <steven at REMOVE.THIS.cybersource.com.au> writes:
> 
>> On Tue, 18 Dec 2007 21:13:14 +0100, Hrvoje Niksic wrote:
>>
>>> Each object takes 36 bytes itself: 4 bytes refcount + 4 bytes type ptr +
>>> 4 bytes dict ptr + 4 bytes weakptr + 12 bytes gc overhead.  That's not
>>> counting malloc overhead, which should be low since objects aren't
>>> malloced individually.  Each object requires a dict, which consumes
>>> additional 52 bytes of memory (40 bytes for the dict struct plus 12 for
>>> gc).  That's 88 bytes per object, not counting malloc overhead.
>> And let's not forget that if you're running on a 64-bit system, you
>> can double the size of every pointer.
> 
> And of Py_ssize_t's, longs, ints with padding (placed between two
> pointers).  Also note the price of 8-byte struct alignment.
> 
>> Is there a canonical list of how much memory Python objects take up?
>> Or a canonical algorithm?
>>
>> Or failing either of those, a good heuristic?
> 
> For built-in types, you need to look at the code of each individual
> object.  For user types, you can approximate by calculations such as
> the above.
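
The per-object arithmetic above can be checked empirically with sys.getsizeof (available since Python 2.6). A minimal sketch, with the caveat that the exact figures depend on the Python version, pointer width, and build, so they will not necessarily match the 88-byte estimate quoted above; getsizeof also reports only the object struct itself, not the objects it references:

```python
import sys

class Point(object):
    """A minimal user-defined class; each instance carries a __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# The instance struct (refcount, type ptr, dict ptr, weakref ptr, ...)
instance_size = sys.getsizeof(p)
# The per-instance attribute dict is a separate allocation.
dict_size = sys.getsizeof(p.__dict__)

print("instance:         %d bytes" % instance_size)
print("instance __dict__: %d bytes" % dict_size)
print("approx total:      %d bytes" % (instance_size + dict_size))
```

Note that gc and malloc overhead are invisible to getsizeof, so the true footprint is somewhat larger than the sum printed here.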

It would be helpful if there were a tabulation of the memory cost for each built-in type.

Colin W.
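> 
> In the absence of a canonical table, a rough one can be generated with
> sys.getsizeof.  A sketch along those lines, with the numbers being
> illustrative rather than canonical, since they vary across Python
> versions and 32- vs 64-bit builds:

```python
import sys

# Sample instances of some built-in types; getsizeof reports the
# object struct only, not any referenced objects.
samples = [
    ('int',           0),
    ('float',         0.0),
    ('str (empty)',   ''),
    ('tuple (empty)', ()),
    ('list (empty)',  []),
    ('dict (empty)',  {}),
]

for name, obj in samples:
    print('%-14s %d bytes' % (name, sys.getsizeof(obj)))
```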
> 
>>> Then there's string allocation: your average string is 6 chars
>>> long; add to that one additional char for the terminating zero.
>> Are you sure about that? If Python strings are zero terminated, how
>> does Python deal with this?
>>
>>>>> 'a\0string'[1]
>> '\x00'
> 
> Python strings are zero-terminated so the pointer to string's data can
> be passed to the various C APIs (this is standard practice; C++
> strings do it too).  Python doesn't rely on zero termination to
> calculate string length.  So len('a\0string') will do the right thing,
> but the string will internally store 'a\0string\0'.
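> 
> The point is easy to demonstrate: len() reads the stored length
> field, so an embedded NUL byte does not truncate the string the way
> it would for a C strlen():

```python
# The string stores its length explicitly; the trailing NUL exists
# only for the benefit of C APIs and is not counted by len().
s = 'a\0string'
print(len(s))      # 8 -- the embedded NUL is an ordinary character
print(repr(s[1]))  # '\x00'
```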



