Map lots of words to lots of integers
bjorn at roguewave.com
Thu May 4 13:58:33 EDT 2000
The shelve module should be most useful to you:
import shelve

db = shelve.open('foobar.db')
db['key'] = [1,2,3]
for k in db.keys():
    print(k, db[k])
db.close()
If your lists get _really_ long, just add a layer of indirection: keep
a list of sublist keys under the word, and each sublist under its own
key (something along the lines of):
db['word'] = ['@word at 1', '@word at 2'] # list of keys to sublists
db['@word at 1'] = [1,2,3] # first sublist
db['@word at 2'] = [4,5,6] # second sublist
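
For what it's worth, here is a rough sketch of how that indirection
could be wrapped up in a couple of helpers. The names add, locate,
sublist_key and CHUNK_SIZE are made up for illustration (they are not
part of shelve), and the sublist size is an arbitrary assumption you
would want to tune. Note that a shelf only writes a value back when you
assign to its key, hence the explicit reassignment after append.

```python
import os
import shelve
import tempfile

CHUNK_SIZE = 1000  # hypothetical limit on integers per sublist


def sublist_key(word, n):
    # Same naming pattern as the keys above; any unique string would do.
    return '@%s at %d' % (word, n)


def add(db, word, number, chunk_size=CHUNK_SIZE):
    """Append number to word's last sublist, starting a new one when full."""
    keys = db.get(word, [])
    if keys:
        last = db[keys[-1]]
        if len(last) < chunk_size:
            last.append(number)
            db[keys[-1]] = last  # reassign so the shelf stores the change
            return
    new_key = sublist_key(word, len(keys) + 1)
    db[new_key] = [number]
    keys.append(new_key)
    db[word] = keys


def locate(db, word):
    """Yield every integer stored for word, one sublist at a time."""
    for key in db.get(word, []):
        for number in db[key]:
            yield number


def demo():
    # Throwaway shelf in a temporary directory, tiny chunks to show the split.
    path = os.path.join(tempfile.mkdtemp(), 'foobar.db')
    db = shelve.open(path)
    try:
        for i in range(7):
            add(db, 'python', i, chunk_size=3)
        return db['python'], list(locate(db, 'python'))
    finally:
        db.close()
```

Since locate is a generator, only one sublist has to be in memory at a
time, which is the point of splitting the list up. One thing this
sketch does not handle: a real word that happens to look like a sublist
key would collide with it, so pick a key pattern your words cannot
contain.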
> I need a fast way of mapping words to integers. A single word must be
> able to point to many, *many*, integers. Tried stuff like a dict,
> words as keys, pointing to a list of integers. This is all fine and
> nice if the thing is located in memory. I want to (or need to) store
> all of this on disk. And the method must be fast. Thought I could use
> a Berkeley DB file using words as keys, but what should they point to?
> The number of words can of course be thousands and the integers they
> point to even more. Do Zope's internals like ZODB etc. offer anything
> I could use?
> What I've tried so far is to make a general indexing-module, where you
> do something like
> x = Indexer('data_file.db')
> # extract words from documents etc.
> x.add(word2index, id)
> etc. etc.
> print x.locate('python')
> [432,6363,326,65464,6544,456465465,65433,76] # of course this would be
> # HUGE and may not fit into a list
> What I'd really need is to store several integers as one key/id, ex.
> as a tuple, but I'll settle for less if somebody just could give me
> some pointers.
> NOTE! The number of words is as many as there are eh ... words, and
> integers, well, how far can a human count?