Persistent sets
Skip Montanaro
skip at pobox.com
Thu Jun 5 23:33:30 EDT 2003
Related to the trigraph thread, I thought it would be nice if the Set
output by the three-character n-gram generator was persistent because it
takes awhile to generate. I was pleasantly surprised to see how easy it is:
import sets
import shelve
class PersistentSet(sets.Set):
def __init__(self, file, iterable=None):
self._data = shelve.open(file)
if iterable is not None:
self._update(iterable)
I haven't tried it for anything other than the little generator, and while
obviously slower to generate than if I had used a regular set, it saves a
lot of startup time when you have a large mostly read-only set.
>>> s = psets.PersistentSet("trigraphs")
>>> "abc" in s
True
>>> "xzz" in s
False
>>> "aaa" in s
False
The above trigraph set contains 8106 elements and was generated from
/usr/share/dict/words, which on my current machine contains 234937 words.
Skip
More information about the Python-list
mailing list