Planning a Python Course for Beginners
Peter Otten
__peter__ at web.de
Thu Aug 10 16:35:00 EDT 2017
Marko Rauhamaa wrote:
> Peter Otten <__peter__ at web.de>:
>
>> Steve D'Aprano wrote:
>>> The C code says:
>>>
>>>>/* bottom 3 or 4 bits are likely to be 0; rotate y by 4 to avoid
>>>>excessive hash collisions for dicts and sets */
>>>
>>> which I think agrees with my comment: using the id() itself would put
>>> too many objects in the same bucket (i.e. too many collisions).
>>
>> There's a subtle diffence: you expected objects with nearby memory
>> addresses to end up in the same "bucket" while actually all addresses
>> (are likely to) have the same low bits, and creation time does not
>> matter.
>
> I see no point in CPython's rotation magic.
Let's see:
$ cat hashperf.py
class A(object):
__slots__ = ["_hash"]
def __hash__(self):
return self._hash
def no_magic():
a = A()
a._hash = id(a)
return a
def magic():
a = A()
a._hash = id(a) >> 4
return a
$ python3 -m timeit -s 'from hashperf import magic, no_magic; s =
{no_magic() for _ in range(10**5)}' 'for x in s: x in s'
10 loops, best of 3: 70.7 msec per loop
$ python3 -m timeit -s 'from hashperf import magic, no_magic; s = {magic()
for _ in range(10**5)}' 'for x in s: x in s'
10 loops, best of 3: 52.8 msec per loop
"magic" wins this makeshift test. Other than that you're right ;)
More information about the Python-list
mailing list