[Python-ideas] checking for identity before comparing built-in objects
Max Moroz
maxmoroz at gmail.com
Thu Oct 4 13:48:03 CEST 2012
It seems that built-in classes do not short-circuit the `__eq__` method
when the two objects are identical, at least in CPython:
f = frozenset(range(200000000))
f1 = f
f1 == f # this operation will take about 1 sec on my machine
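A rough way to measure this is sketched below. The set is much smaller than the one above so that it runs quickly, and the element count and repetition count are arbitrary choices for illustration, not values from the original measurement. The timing will depend on the machine and the interpreter version.

```python
import timeit

# Smaller than the 200000000 elements used above so the sketch runs quickly.
f = frozenset(range(1_000_000))
f1 = f  # a second name for the same object, not a copy

# Time 10 equality comparisons of the set against itself.
elapsed = timeit.timeit(lambda: f1 == f, number=10)
print(f"f1 is f: {f1 is f}; 10 comparisons took {elapsed:.4f} s")
```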
Is there any disadvantage to checking whether the equality comparison was
called with the same object and, if so, returning `True` right away? I
noticed this when trying to memoize a function that has large
frozenset arguments. While hashing of a large argument is very fast
after it's done once (hash value is presumably cached), the equality
comparison is always slow even against itself. So when the same large
argument is provided over and over, memoization is slow.
Of course, there's a workaround: subclass frozenset and redefine
`__eq__` to check `id()` first. Arguably, for this particular use
case, I should redefine both `__hash__` and `__eq__` to look
exclusively at `id()`, since it's not worth wasting memoizer time
comparing two non-identical large arguments that are highly
unlikely to compare equal anyway. So if there's a reason for the
current implementation, I don't have a strong argument against it.
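A minimal sketch of the two workarounds described above, assuming Python 3. The class names `IdentityFirstFrozenSet` and `IdentityOnlyFrozenSet` are hypothetical, chosen only for this illustration. The first variant falls back to the normal content comparison when the objects differ; the second treats two sets as equal only if they are the same object.

```python
class IdentityFirstFrozenSet(frozenset):
    """Hypothetical subclass: check identity, then fall back to frozenset."""

    def __eq__(self, other):
        if self is other:
            return True  # same object, skip the element-wise comparison
        return frozenset.__eq__(self, other)

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    # Defining __eq__ sets __hash__ to None in Python 3, so restore it.
    __hash__ = frozenset.__hash__


class IdentityOnlyFrozenSet(frozenset):
    """Hypothetical subclass: hash and compare by id() only."""

    def __eq__(self, other):
        return self is other

    def __ne__(self, other):
        return self is not other

    def __hash__(self):
        return id(self)
```

With the second variant, two sets holding the same elements act as different dictionary keys, so it only suits a memoizer that is called repeatedly with the very same argument object.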