my small hashlib - using pythons hash-functions

Mathias Panzenboeck e0427417 at student.tuwien.ac.at
Sat Nov 25 17:17:09 EST 2006


Fredrik Lundh wrote:
> Mathias Panzenboeck wrote:
> 
>> But the question is: *IS* this derived work? I mean, it's not copied 
>> code.
>> It's the same hashing-logic, which I learned by watching pythons code.
> 
> given that it's only a few lines of code, and there's hardly any other 
> way to write those lines if you want to implement the same algorithm, 
> I'd say it falls under "fair use".  crediting the Python developers in 
> the source code should be good enough.
> 
> </F>
> 

ic. I'll write a mail to the python developers then.

The code in python looks like this (see code underneath), just if someone what's to know.
I only used the logic of the parts I understand. ;)

Objects/stringobject.c:

static long
string_hash(PyStringObject *a)
{
	register Py_ssize_t len;
	register unsigned char *p;
	register long x;

	if (a->ob_shash != -1)
		return a->ob_shash;
	len = a->ob_size;
	p = (unsigned char *) a->ob_sval;
	x = *p << 7;
	while (--len >= 0)
		x = (1000003*x) ^ *p++;
	x ^= a->ob_size;
	if (x == -1)
		x = -2;
	a->ob_shash = x;
	return x;
}

Objects/dictobject.c:

static dictentry *
lookdict_string(dictobject *mp, PyObject *key, register long hash)
{
	register size_t i;
	register size_t perturb;
	register dictentry *freeslot;
	register size_t mask = (size_t)mp->ma_mask;
	dictentry *ep0 = mp->ma_table;
	register dictentry *ep;

	/* Make sure this function doesn't have to handle non-string keys,
	   including subclasses of str; e.g., one reason to subclass
	   strings is to override __eq__, and for speed we don't cater to
	   that here. */
	if (!PyString_CheckExact(key)) {
#ifdef SHOW_CONVERSION_COUNTS
		++converted;
#endif
		mp->ma_lookup = lookdict;
		return lookdict(mp, key, hash);
	}
	i = hash & mask;
	ep = &ep0[i];
	if (ep->me_key == NULL || ep->me_key == key)
		return ep;
	if (ep->me_key == dummy)
		freeslot = ep;
	else {
		if (ep->me_hash == hash && _PyString_Eq(ep->me_key, key))
			return ep;
		freeslot = NULL;
	}

	/* In the loop, me_key == dummy is by far (factor of 100s) the
	   least likely outcome, so test for that last. */
	for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
		i = (i << 2) + i + perturb + 1;
		ep = &ep0[i & mask];
		if (ep->me_key == NULL)
			return freeslot == NULL ? ep : freeslot;
		if (ep->me_key == key
		    || (ep->me_hash == hash
		        && ep->me_key != dummy
			&& _PyString_Eq(ep->me_key, key)))
			return ep;
		if (ep->me_key == dummy && freeslot == NULL)
			freeslot = ep;
	}
}



More information about the Python-list mailing list