[issue5186] Reduce hash collisions for objects with no __hash__ method

Antoine Pitrou report at bugs.python.org
Tue Feb 10 22:18:33 CET 2009


Antoine Pitrou <pitrou at free.fr> added the comment:

Some tests on py3k (32-bit build):

>>> l = [object() for i in range(20)]
>>> [id(l[i+1]) - id(l[i]) for i in range(len(l)-1)]
[16, -96, 104, 8, 8, 8, 8, 8, -749528, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8]
>>> class C(object):
...    __slots__ = ()
... 
>>> l = [C() for i in range(20)]
>>> [id(l[i+1]) - id(l[i]) for i in range(len(l)-1)]
[-104, 24, 8, 8, 8, 8, 8, 8, 8, 8, 8, 16, 8, 8, 8, 8, 16, -8, 16]
>>> class C(object):
...    __slots__ = ('x',)
... 
>>> l = [C() for i in range(20)]
>>> [id(l[i+1]) - id(l[i]) for i in range(len(l)-1)]
[432, 24, -384, 408, 24, 24, -480, 528, 24, 24, 24, 24, 48, -360, 504,
24, -480, 552, 24]

So, as soon as a user-defined type isn't totally trivial, its instances
are allocated in memory units of at least 24 bytes. Shifting the address
right by 4 bits shouldn't be detrimental performance-wise, unless you
allocate lots of purely empty object() instances...
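
To make the effect of the shift concrete, here is a minimal sketch under
simplifying assumptions: the hash is taken directly from the object's
address, the table picks its first slot by masking the low bits, and
objects sit exactly 24 bytes apart as in the measurements above. The
helper distinct_slots and the base address 0x1000 are hypothetical names
and values chosen for illustration; the model ignores how a real dict
resolves collisions by further probing.

```python
def distinct_slots(addresses, table_size, shift=0):
    """Count how many different first slots the addresses map to.

    table_size must be a power of two; the slot is the low bits of
    the (optionally right-shifted) address.
    """
    mask = table_size - 1
    return len({(addr >> shift) & mask for addr in addresses})


# 32 simulated objects, 24 bytes apart (assumed layout, not real ids).
addresses = [0x1000 + 24 * i for i in range(32)]

print(distinct_slots(addresses, 64))           # 8
print(distinct_slots(addresses, 64, shift=4))  # 32
```

In this model the unshifted addresses only ever reach 8 of the 64 slots,
because the low 3 bits are always zero, while the shifted ones land in 32
different slots. With a stride of 8 (the bare object() case from the first
session) the same shift maps each pair of neighbours to the same value,
which is the caveat about purely empty instances.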

Note: a 64-bit build shows an even greater allocation unit:

>>> class C(object):
...    __slots__ = ('x',)
... 
>>> l = [C() for i in range(20)]
>>> [id(l[i+1]) - id(l[i]) for i in range(len(l)-1)]
[56, -112, 168, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56, 56,
56, 56]

I wonder why the allocation unit is 56 and not 48 (2*24).
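
One way to look into this is to compare the observed stride with what
sys.getsizeof() reports, since on CPython that figure includes the
garbage-collector header for objects managed by the collector. The sketch
below is CPython-specific (it relies on id() being the memory address),
the helper most_common_stride is a hypothetical name, and the numbers it
prints depend on the interpreter version and build, so it is a measuring
aid rather than an answer.

```python
import gc
import sys
from collections import Counter


class C(object):
    __slots__ = ('x',)


def most_common_stride(objs):
    """Return the most frequent address difference between neighbours."""
    diffs = [id(b) - id(a) for a, b in zip(objs, objs[1:])]
    return Counter(diffs).most_common(1)[0][0]


objs = [C() for _ in range(1000)]
stride = most_common_stride(objs)

print("most common stride:", stride)
print("sys.getsizeof:     ", sys.getsizeof(objs[0]))
print("GC-tracked:        ", gc.is_tracked(objs[0]))
```

If the stride is larger than the reported size, the difference would
point at allocator rounding rather than at the object layout itself.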

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5186>
_______________________________________

