don't need dictionary's keys - hash table?
Terry Hancock
hancock at anansispaceworks.com
Thu Jul 13 00:29:34 EDT 2006
Nick Vatamaniuc wrote:
> The original post just said that he wanted to index some values by
> their urls and didn't really care about the urls themselves, so I
> suggested that he just use the hash of the key as the key. The
> dictionary will then take a hash of that [note that:
> hash(string)=hash(hash(string)) ] then the dictionary will not keep
> the reference to the urls and GC will collect them. So instead of
> dic[urlstring']=value he will do dic[hash(urlstring)]=value. But then
> to retrieve he will have to do old_value=dic[hash(urlstring]].
Note that it is trivial to catch collisions on entry and correct them:
key = hash(url_string)
i = 0
if d.has_key(key):
while d.has_key(key):
i += 1
d[key + repr(i)] = val
else:
d[key] = val
or something along those lines. No need to keep the original string.
Obviously this is only efficient if collisions are rare.
I am a little surprised that hash(hash(s)) == hash(s), is that actually
true?
Cheers,
Terry
--
Terry Hancock (hancock at AnansiSpaceworks.com)
Anansi Spaceworks http://www.AnansiSpaceworks.com
More information about the Python-list
mailing list