[Tutor] Why is an instance smaller than the sum of its components?
Peter Otten
__peter__ at web.de
Tue Feb 3 23:40:03 CET 2015
Jugurtha Hadjar wrote:
> Hello,
>
> I was writing something and thought: Since the class had some
> 'constants', and multiple instances would be created, I assume that each
> instance would have its own data. So this would mean duplication of the
> same constants? If so, I thought why not put the constants in memory
> once, for every instance to access (to reduce memory usage).
CPython already does this for many common values, e.g. small integers and
variable names:
>>> a = 42
>>> b = 42
>>> a is b
True
>>> a = 300
>>> b = 300
>>> a is b
False
It also reuses equal constants within a single compilation unit:
>>> a = 300; b = 300; a is b
True
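As for the original idea of keeping the constants in memory once: a value
assigned in the class body is stored on the class, and every instance finds
it there. A minimal sketch, where the class `Config` and its attribute names
are made up for illustration:

```python
class Config:
    # Class attribute: stored once, in the namespace of the class.
    LIMIT = 300

    def __init__(self, name):
        # Instance attribute: stored separately for every instance.
        self.name = name


first = Config("first")
second = Config("second")

# Attribute lookup falls back to the class, so all three names
# refer to the very same object.
shared = first.LIMIT is second.LIMIT is Config.LIMIT

# The instance itself holds no entry for LIMIT.
in_instance_dict = "LIMIT" in vars(first)

print(shared, in_instance_dict)
```

Here `shared` is True and `in_instance_dict` is False.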
> Correct me if I'm wrong in my assumptions (i.e: If instances share stuff).
>
> So I investigated further..
>
> >>> import sys
> >>> sys.getsizeof(5)
> 12
>
>
> So an integer on my machine is 12 bytes.
>
> Now:
>
> >>> class foo(object):
> ...     def __init__(self):
> ...         pass
>
> >>> sys.getsizeof(foo)
> 448
>
> >>> sys.getsizeof(foo())
> 28
>
> >>> foo
> <class '__main__.foo'>
> >>> foo()
> <__main__.foo object at 0xXXXXXXX>
To know its class the foo instance only needs a reference to the class
object, not a complete copy of the class. This is typically provided by
putting a pointer into the instance, and this consumes only 8 bytes on
64-bit systems.
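You can see that reference from Python. A small sketch (the class name `Foo`
is arbitrary; the sizes printed depend on the Python version and platform):

```python
import sys


class Foo:
    pass


obj = Foo()

# The instance does not contain its class, it only refers to it.
same_class = type(obj) is Foo and obj.__class__ is Foo

print(same_class)
print("class:", sys.getsizeof(Foo), "instance:", sys.getsizeof(obj))
```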
> - Second weird thing:
>
> >>> class bar(object):
> ...     def __init__(self):
> ...         self.w = 5
> ...         self.x = 6
> ...         self.y = 7
> ...         self.z = 8
>
> >>> sys.getsizeof(bar)
> 448
> >>> sys.getsizeof(foo)
> 448
> >>> sys.getsizeof(bar())
> 28
> >>> sys.getsizeof(foo())
> 28
>
> >>> sys.getsizeof(bar().x)
> 12
> >>> sys.getsizeof(bar().y)
> 12
The instances of most user-defined classes use a dict to hold references to
the attributes, and only the memory consumed by the pointer to that dict is
reported. If all referenced values were included, what would you expect the
size of `a` below to be?
>>> class A: pass
...
>>> a = A()
>>> a.a = a
>>> sys.getsizeof(a)
56
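If you do want a number that follows references you have to walk the object
graph yourself and remember what you have already counted, otherwise a cycle
like the one above never terminates. A rough sketch; the helper `total_size`
is not part of the standard library, it only looks into `__dict__` and a few
builtin containers, and the result is an estimate:

```python
import sys


def total_size(obj, seen=None):
    """Estimate the size of obj plus everything reachable from it.

    Every object is counted at most once, identified by id().
    """
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # already counted, this also breaks cycles
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    if hasattr(obj, "__dict__"):
        size += total_size(vars(obj), seen)
    return size


class A:
    pass


a = A()
a.a = a  # the instance refers to itself

total = total_size(a)
print(sys.getsizeof(a), total)
```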
> Summary questions:
>
> 1 - Why are foo's and bar's class sizes the same? (foo's just a nop)
Assigning attributes in the initializer only affects the instance; Python
doesn't scan the code to reserve slots for these attributes. When the
instance is created the initializer is executed and every
    self.x = value
basically results in
    self.__dict__["x"] = value
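A short sketch to convince yourself (the class name `Bar` and the attribute
values are just examples):

```python
class Bar:
    def __init__(self):
        self.w = 5
        self.x = 6


b = Bar()

# The attributes set in __init__ ended up in the instance's __dict__ ...
attrs = dict(vars(b))

# ... and writing to __dict__ directly creates an ordinary attribute.
b.__dict__["y"] = 7
y = b.y

# The class itself knows nothing about w.
class_has_w = "w" in vars(Bar)

print(attrs, y, class_has_w)
```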
> 2 - Why are foo() and bar() the same size, even with bar()'s 4 integers?
To get a more realistic idea of the objects' size you can include the size
of __dict__. That grows in bursts:
>>> class A: pass
...
>>> a = A()
>>> old_size = sys.getsizeof(a.__dict__)
>>> for i in range(1000):
...     setattr(a, "x" + str(i), i)
...     new_size = sys.getsizeof(a.__dict__)
...     if new_size != old_size:
...         print(i, "a.__dict__ grows from", old_size, "to", new_size)
...         old_size = new_size
...
3 a.__dict__ grows from 96 to 192
11 a.__dict__ grows from 192 to 320
21 a.__dict__ grows from 320 to 576
43 a.__dict__ grows from 576 to 1088
85 a.__dict__ grows from 1088 to 2112
171 a.__dict__ grows from 2112 to 4160
341 a.__dict__ grows from 4160 to 8256
683 a.__dict__ grows from 8256 to 16448
> 3 - Why's bar()'s size smaller than the sum of the sizes of 4 integers?
From the above it follows that this is not a meaningful question for a
standard user-defined class. They all use __dict__ to store the interesting
stuff. But there is a way to reserve space for attributes in the instance:
>>> class A:
...     __slots__ = ()
...
>>> class B:
...     __slots__ = ("a",)
...
>>> class C:
...     __slots__ = ("a", "b")
...
>>> class D:
...     __slots__ = tuple("a" + str(i) for i in range(100))
...
>>> for K in A, B, C, D:
...     print(K.__name__, sys.getsizeof(K()))
...
A 16
B 48
C 56
D 840
>>> b = B()
>>> sys.getsizeof(b)
48
>>> b.a = 42
>>> sys.getsizeof(b)
48
>>> (840-48)/99
8.0
And that looks very much like one 64-bit pointer per attribute.
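The price for the compact layout is flexibility: an instance of a class
whose `__slots__` does not include `"__dict__"` (and whose base classes
define `__slots__` too) has no `__dict__`, so you cannot add attributes that
were not declared. A small sketch with a made-up class `Point`:

```python
class Point:
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)

# No per-instance dict is created.
has_dict = hasattr(p, "__dict__")

# Attributes outside __slots__ are rejected.
try:
    p.z = 3
except AttributeError:
    rejected = True
else:
    rejected = False

print(has_dict, rejected)
```

This prints `False True`.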