Berkeley DB: how to iterate over a large number of keys "quickly"

lazy arunmail at gmail.com
Thu Aug 2 21:43:58 CEST 2007


I have a Berkeley DB database and I'm using the bsddb module to access
it. The database is quite large (anywhere from 2 to 30 GB). I want to
iterate over the keys serially.
I tried something basic like

for key in db.keys():

but this takes a lot of time. I guess Python is trying to build the list
of all keys first and is probably keeping it in memory. Is there a way to
avoid this, since I just want to access the keys serially? I mean, is
there a way I can tell Python not to load all the keys, but to fetch them
as the loop progresses (like in a linked list)? I could not find any
accessor methods on bsddb to do this in my initial search.
I am guessing a BTree might be a good choice here, but since the
databases were opened with hashopen when they were written, I'm not able
to use btopen when I want to iterate over them.
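
A minimal sketch of the "fetch keys as the loop progresses" idea, assuming
the handle exposes the cursor-style first() and next() methods that the
bsddb documentation describes, each returning a (key, value) pair. The
names iter_keys, end_errors and FakeDB are invented for this illustration,
and FakeDB is only an in-memory stand-in so the example runs without a
real database. Which exception a real handle raises at the end of the data
is an assumption to check against your bsddb version.

```python
def iter_keys(db, end_errors=(KeyError,)):
    """Yield keys one at a time by stepping a first()/next() cursor.

    `db` is assumed to expose first() and next(), each returning a
    (key, value) pair. `end_errors` (hypothetical parameter) lists the
    exceptions treated as "no more records".
    """
    step = db.first              # the first call positions the cursor
    while True:
        try:
            key, _value = step()
        except end_errors:
            return               # empty database or end of data
        yield key
        step = db.next           # later calls advance the cursor


class FakeDB(object):
    """Hypothetical in-memory stand-in so the sketch runs without bsddb."""

    def __init__(self, items):
        self._items = sorted(items.items())
        self._pos = 0

    def first(self):
        self._pos = 0
        return self._current()

    def next(self):
        self._pos += 1
        return self._current()

    def _current(self):
        if self._pos >= len(self._items):
            raise KeyError("no more records")
        return self._items[self._pos]


demo = FakeDB({"a": "1", "b": "2", "c": "3"})
for key in iter_keys(demo):
    print(key)
```

With a real database, the handle returned by bsddb.hashopen would take the
place of FakeDB, possibly with end_errors=(bsddb.error,), though that
choice is an assumption and a broad exception class could hide other
database errors. Only one record is held at a time, so the full key list
is never built. For a hash database the keys come back in no particular
order.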



