[Tutor] Why is an instance smaller than the sum of its components?

Peter Otten __peter__ at web.de
Tue Feb 3 23:40:03 CET 2015


Jugurtha Hadjar wrote:

> Hello,
> 
> I was writing something and thought: Since the class had some
> 'constants', and multiple instances would be created, I assume that each
> instance would have its own data. So this would mean duplication of the
> same constants? If so, I thought why not put the constants in memory
> once, for every instance to access (to reduce memory usage).

CPython already does this for many common values, e.g. small integers and 
variable names:


>>> a = 42
>>> b = 42
>>> a is b
True
>>> a = 300
>>> b = 300
>>> a is b
False

It also merges equal constants within a single compilation unit:

>>> a = 300; b = 300; a is b
True

> Correct me if I'm wrong in my assumptions (i.e: If instances share stuff).
> 
> So I investigated further..
> 
>  >>> import sys
>  >>> sys.getsizeof(5)
> 12
> 
> 
> So an integer on my machine is 12 bytes.
> 
> Now:
> 
>  >>> class foo(object):
> ...	def __init__(self):
> ...		pass
> 
>  >>> sys.getsizeof(foo)
> 448
> 
>  >>> sys.getsizeof(foo())
> 28
> 
>  >>> foo
> <class '__main__.foo'>
>  >>> foo()
> <__main__.foo object at 0xXXXXXXX>

To know its class, the foo instance only needs a reference to the class 
object, not a complete copy of the class. This is typically provided by 
putting a pointer to the class into the instance, and that pointer consumes 
only 8 bytes on 64-bit systems.
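
As a small sketch of what that means for the 'constants' in the original 
question: an attribute defined in the class body lives once, in the class, 
and every instance finds it there through its class reference. The names 
Circle and PI are made up for illustration.

```python
class Circle:
    # Hypothetical class-level "constant": stored once, in the class namespace.
    PI = 3.14159

    def __init__(self, radius):
        self.radius = radius  # per-instance data, stored in the instance


c1 = Circle(1.0)
c2 = Circle(2.0)

# Both instances refer to the very same class object ...
same_class = type(c1) is type(c2)
# ... and look PI up there instead of holding copies of their own.
same_constant = c1.PI is c2.PI is Circle.PI
# PI does not appear among the instance's own attributes.
pi_in_instance = "PI" in vars(c1)

print(same_class, same_constant, pi_in_instance)
```

So there is no need to do anything special to share such values; putting 
them in the class body is enough.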


> - Second weird thing:
> 
>  >>> class bar(object):
> ...	def __init__(self):
> ...		self.w = 5
> ...		self.x = 6
> ...		self.y = 7
> ...		self.z = 8
> 
>  >>> sys.getsizeof(bar)
> 448
>  >>> sys.getsizeof(foo)
> 448
>  >>> sys.getsizeof(bar())
> 28
>  >>> sys.getsizeof(foo())
> 28
> 
>  >>> sys.getsizeof(bar().x)
> 12
>  >>> sys.getsizeof(bar().y)
> 12

Instances of most user-defined classes use a dict to hold references to 
their attributes. sys.getsizeof() reports only the instance's own memory, 
which includes the pointer to that dict but neither the dict itself nor the 
values it references. If all referenced values were included, what would you 
expect the size of `a` below to be?

>>> class A: pass
... 
>>> a = A()
>>> a.a = a
>>> sys.getsizeof(a)
56
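
If you do want a deep figure, you have to walk the references yourself and 
remember what you have already counted, otherwise a cycle like the one above 
never ends. The following is only a rough sketch: total_size is a made-up 
helper, it follows just __dict__ and the common built-in containers, and the 
numbers it gives depend on the Python version and platform.

```python
import sys


def total_size(obj, seen=None):
    """Rough size of obj plus what it references via __dict__ and containers."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        # Already counted; this check is also what breaks reference cycles.
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    if hasattr(obj, "__dict__"):
        size += total_size(vars(obj), seen)
    return size


class A:
    pass


a = A()
a.a = a  # self-reference, as in the session above
print(sys.getsizeof(a), total_size(a))
```

Shared objects such as small integers or interned strings are counted once 
per call, so the result is an upper-bound style estimate rather than memory 
you would get back by deleting the object.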


> Summary questions:
> 
> 1 - Why are foo's and bar's class sizes the same? (foo's just a nop)

Assigning attributes in the initializer only affects the instance; Python 
doesn't scan the code to reserve slots for these attributes. When the 
instance is created, the initializer is executed and every

self.x = value

basically results in 

self.__dict__["x"] = value
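
A quick way to convince yourself of this, sketched with a made-up class Bar 
modelled on the one in the question:

```python
class Bar:
    def __init__(self):
        self.w = 5
        self.x = 6


b = Bar()

# The attributes set in __init__ ended up in the instance's dict.
print(vars(b))

# Writing to the dict directly has the same effect as attribute assignment.
b.__dict__["y"] = 7
print(b.y)

# The class object itself knows nothing about w, x or y.
print("w" in vars(Bar))
```

This is why foo and bar report the same class size: the extra attributes 
never touch the class.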

> 2 - Why are foo() and bar() the same size, even with bar()'s 4 integers?

To get a more realistic idea of the objects' size you can include the size 
of __dict__. That grows in bursts:

>>> class A: pass
... 
>>> a = A()
>>> old_size = sys.getsizeof(a.__dict__)
>>> for i in range(1000):
...     setattr(a, "x" + str(i), i)
...     new_size = sys.getsizeof(a.__dict__)
...     if new_size != old_size:
...         print(i, "a.__dict__ grows from", old_size, "to", new_size)
...         old_size = new_size
... 
3 a.__dict__ grows from 96 to 192
11 a.__dict__ grows from 192 to 320
21 a.__dict__ grows from 320 to 576
43 a.__dict__ grows from 576 to 1088
85 a.__dict__ grows from 1088 to 2112
171 a.__dict__ grows from 2112 to 4160
341 a.__dict__ grows from 4160 to 8256
683 a.__dict__ grows from 8256 to 16448

> 3 - Why's bar()'s size smaller than the sum of the sizes of 4 integers?

From the above it follows that this is not a meaningful question for a 
standard user-defined class. They all use __dict__ to store the interesting 
stuff.

But there is a way to reserve space for attributes in the instance:

>>> class A:
...     __slots__ = ()
... 
>>> class B:
...     __slots__ = ("a",)
... 
>>> class C:
...     __slots__ = ("a", "b")
... 
>>> class D:
...     __slots__ = tuple("a" + str(i) for i in range(100))
... 
>>> for K in A, B, C, D:
...     print(K.__name__, sys.getsizeof(K()))
... 
A 16
B 48
C 56
D 840
>>> b = B()
>>> sys.getsizeof(b)
48
>>> b.a = 42
>>> sys.getsizeof(b)
48

>>> (840-48)/99
8.0

And that looks very much like one 64-bit pointer per attribute.
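
To compare the two layouts side by side, here is a sketch with two 
hypothetical classes, WithDict and WithSlots, that hold the same attributes. 
The byte counts are left unprinted in the text because they vary between 
Python versions and platforms.

```python
import sys


class WithDict:
    def __init__(self):
        self.a = 1
        self.b = 2


class WithSlots:
    __slots__ = ("a", "b")

    def __init__(self):
        self.a = 1
        self.b = 2


d = WithDict()
s = WithSlots()

# A slotted instance has no per-instance dict at all.
has_dict = hasattr(s, "__dict__")

# The price: attributes not listed in __slots__ cannot be added later.
try:
    s.c = 3
except AttributeError:
    rejected = True
else:
    rejected = False

print("dict-based: ", sys.getsizeof(d) + sys.getsizeof(vars(d)))
print("slots-based:", sys.getsizeof(s))
print(has_dict, rejected)
```

The saving only matters when you create a great many instances; for 
ordinary classes the flexibility of __dict__ is usually worth more.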



