External Hashing [was Re: matching strings in a large set of strings]
Tim Chase
python.list at tim.thechases.com
Fri Apr 30 15:45:17 EDT 2010
On 04/30/2010 12:51 PM, Helmut Jarausch wrote:
> I think one could apply an external hashing technique which would require only
> very few disk accesses per lookup.
> Unfortunately, I'm not aware of an implementation in Python.
> Does anybody know about a Python implementation of external hashing?
While you don't detail what you're hashing, Stephan Behnel
already suggested (in the parent thread) using one of Python's
native dbm modules (I just use anydbm and let it choose). The
underlying implementations should be fairly efficient (assuming
you don't use the dumbdbm last-resort fallback). With the anydbm
interface, you can implement dict/set semantics as long as you
take care that everything is marshalled into and out of strings
for keys/values.
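
A minimal sketch of the set-semantics idea, not a definitive
implementation. The DbmSet class, its method names, and the file name
are hypothetical, chosen only for illustration. It assumes members are
text strings that can be encoded as UTF-8, and it falls back to the
dbm module on Python 3, where anydbm was renamed.

```python
import os
import tempfile

try:
    import anydbm  # Python 2 name, as used in this post
except ImportError:
    import dbm as anydbm  # Python 3 renamed anydbm to dbm


class DbmSet(object):
    """Hypothetical set-like wrapper around a dbm file (illustration only)."""

    def __init__(self, path):
        # 'c' opens the database read/write and creates it if it is missing.
        self._db = anydbm.open(path, 'c')

    @staticmethod
    def _key(item):
        # dbm stores only strings/bytes, so marshal every key explicitly.
        return item.encode('utf-8')

    def add(self, item):
        # The value is a throwaway placeholder; only the key matters for a set.
        self._db[self._key(item)] = b'1'

    def __contains__(self, item):
        return self._key(item) in self._db

    def close(self):
        self._db.close()


if __name__ == '__main__':
    # Usage: build the on-disk set in a scratch directory and test membership.
    path = os.path.join(tempfile.mkdtemp(), 'strings.db')
    strings = DbmSet(path)
    strings.add('spam')
    print('spam' in strings)
    print('eggs' in strings)
    strings.close()
```

For dict semantics the same pattern applies, with the values marshalled
to strings on the way in and unmarshalled on the way out.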
-tkc