Re: [Python-Dev] Why not using the hash when comparing strings?

19 Oct 2012


      On 19/10/12 12:03, Victor Stinner wrote:
...
Hi,
I would like to know if there is a reason for not using the hash of
(bytes or unicode) strings when comparing two objects whose hashes
have already been computed. Using the hash would speed up comparison
of long strings when the two strings are different.
Assuming the hash has already been computed, then I imagine it would be
faster.
...
Something like:
if ((op == Py_EQ || op == Py_NE)
    && a->ob_shash != -1
    && b->ob_shash != -1
    && a->ob_shash != b->ob_shash) {
    /* strings are not equal */
}
There are hash collisions, so a->ob_shash == b->ob_shash doesn't mean
that the two strings are equal. But if the two hashes are different,
the two strings are different. Aren't they?
I would certainly hope so :)
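
For illustration only, here is a minimal standalone sketch of the shortcut being discussed. It is not CPython's code: the toy_str type, toy_hash() and toy_equal() are hypothetical names, and the hash is a simple FNV-1a-style function chosen just so the example compiles on its own. As in the snippet above, -1 is assumed to mean "hash not computed yet".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a string object with a cached hash.
   hash == -1 means "not computed yet", as in the ob_shash test above. */
typedef struct {
    const char *data;
    size_t len;
    long hash;
} toy_str;

/* Toy FNV-1a-style hash (an assumption, not CPython's algorithm).
   The result is masked to be non-negative, so -1 stays reserved. */
static long toy_hash(toy_str *s)
{
    unsigned long h = 2166136261UL;
    size_t i;

    if (s->hash != -1)
        return s->hash;              /* already cached */
    for (i = 0; i < s->len; i++) {
        h ^= (unsigned char)s->data[i];
        h *= 16777619UL;
    }
    s->hash = (long)(h & 0x7fffffffUL);
    return s->hash;
}

/* Equality test with the proposed shortcut. */
static int toy_equal(const toy_str *a, const toy_str *b)
{
    if (a->len != b->len)
        return 0;                    /* different lengths: not equal */
    if (a->hash != -1 && b->hash != -1 && a->hash != b->hash)
        return 0;                    /* both cached and different: not equal */
    /* Equal or missing hashes prove nothing: compare the contents. */
    return memcmp(a->data, b->data, a->len) == 0;
}
```

The shortcut can only ever answer "not equal". When the cached hashes match, or when either one is missing, the full comparison still runs, so a hash collision cannot produce a wrong result.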


-- 
Steven

Steven D'Aprano