Why do custom objects take so much memory?
Colin J. Williams
cjw at sympatico.ca
Wed Dec 19 20:01:26 EST 2007
Hrvoje Niksic wrote:
> Steven D'Aprano <steven at REMOVE.THIS.cybersource.com.au> writes:
>
>> On Tue, 18 Dec 2007 21:13:14 +0100, Hrvoje Niksic wrote:
>>
>>> Each object takes 36 bytes itself: 4 bytes refcount + 4 bytes type ptr +
>>> 4 bytes dict ptr + 4 bytes weakptr + 12 bytes gc overhead. That's not
>>> counting malloc overhead, which should be low since objects aren't
>>> malloced individually. Each object requires a dict, which consumes
>>> additional 52 bytes of memory (40 bytes for the dict struct plus 12 for
>>> gc). That's 88 bytes per object, not counting malloc overhead.
>> And let's not forget that if you're running on a 64-bit system, you
>> can double the size of every pointer.
>
> And of Py_ssize_t's, longs, ints with padding (placed between two
> pointers). Also note the price of 8-byte struct alignment.
>
>> Is there a canonical list of how much memory Python objects take up?
>> Or a canonical algorithm?
>>
>> Or failing either of those, a good heuristic?
>
> For built-in types, you need to look at the code of each individual
> object. For user types, you can approximate by calculations such as
> the above.
It would be helpful if there were a tabulation of the memory cost for each built-in type.
Colin W.
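
For a rough empirical check of the per-instance figures above, a sketch like the following reports the shallow size of an instance and of its attribute dict using sys.getsizeof (available since Python 2.6; exact numbers vary by Python version, build, and platform, so treat the output as illustrative only — the class name Point is just an example):

```python
import sys

class Point(object):
    """A minimal user-defined class; each instance carries a __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)

# Shallow size of the instance struct itself
# (refcount, type pointer, dict pointer, weakref pointer, ...).
instance_bytes = sys.getsizeof(p)

# The per-instance attribute dict is a separate allocation,
# not included in the instance's own size.
dict_bytes = sys.getsizeof(p.__dict__)

print("instance:", instance_bytes)
print("dict:    ", dict_bytes)
print("total:   ", instance_bytes + dict_bytes)
```

Note that sys.getsizeof is shallow: it does not include the sizes of the attribute values themselves, nor malloc or GC bookkeeping overhead.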
>
>>> Then there's string allocation: your average string is 6 chars
>>> long; add to that one additional char for the terminating zero.
>> Are you sure about that? If Python strings are zero terminated, how
>> does Python deal with this?
>>
>>>>> 'a\0string'[1]
>> '\x00'
>
> Python strings are zero-terminated so that the pointer to the string's
> data can be passed to the various C APIs (this is standard practice;
> C++ strings do it too). Python doesn't rely on zero termination to
> calculate string length. So len('a\0string') will do the right thing,
> but the string will internally store 'a\0string\0'.
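
The point that the length comes from a stored length field rather than the terminating NUL can be checked directly; a small sketch:

```python
s = 'a\0string'

# len() reads the stored length field, so the embedded zero
# byte does not truncate the string.
assert len(s) == 8

# The NUL is an ordinary character at index 1.
assert s[1] == '\x00'

# Indexing and slicing past the embedded NUL work normally.
assert s[2:] == 'string'

print('all checks passed')
```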