[Tutor] Hashing

Kalle Svensson kalle@gnupung.net
Sat, 11 Aug 2001 02:39:04 +0200


[Allan Crooks]
> Hi,
> 
> Does anyone know exactly how the hash method derives a hash 
> for tuples? Or strings?
> 
> >>> a, b, c, d = (1,), (2,), (3,), (4,)
> >>> map (hash, [a, b, c, d])
> [-1660579480, -1660579477, -1660579478, -1660579475]
> 
> The above code does seem slightly bewildering to me, anyone got 
> any ideas?

>From the Python source code:

static long
tuplehash(PyTupleObject *v)
{
    register long x, y;
    register int len = v->ob_size;
    register PyObject **p;
    x = 0x345678L;
    p = v->ob_item;
    while (--len >= 0) {
        y = PyObject_Hash(*p++);
	if (y == -1)
	    return -1;
	x = (1000003*x) ^ y;
    }
    x ^= v->ob_size;
    if (x == -1)
        x = -2;
    return x;
}

long
PyObject_Hash(PyObject *v)
{
    PyTypeObject *tp = v->ob_type;
    if (tp->tp_hash != NULL)
        return (*tp->tp_hash)(v);
    if (tp->tp_compare == NULL && RICHCOMPARE(tp) == NULL) {
        return _Py_HashPointer(v); /* Use address as hash value */
    }
    /* If there's a cmp but no hash defined, the object can't be hashed */
    PyErr_SetString(PyExc_TypeError, "unhashable type");
    return -1;
}

This basically means that a tuple hash is a function of the individual
objects' hashes.  Individual objects are hashed in different ways depending
on the tp_hash field of the type (tp_hash of PyTuple_Type is tuplehash).
Check out different objects and their hash functions in the Objects
subdirectory of the source code distribution.  Web access to current CVS
source code is at
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/python/python/dist/src/

Peace,
  Kalle
-- 
[  kalle@gnupung.net  ][ Thought control, brought to you by the WIPO! ]
[ http://gnupung.net/ ][ http://anti-dmca.org/ http://eurorights.org/ ]