very large dictionaries

William Park opengeometry at
Wed Jun 16 21:40:49 CEST 2004

robin <escalation746 at> wrote:
> I need to do a search through about 50 million records, each of which
> is less than 100 bytes wide. A database is actually too slow for
> this, so I thought of optimising the data and putting it all in
> memory.
> There is a single key field, so a dictionary is an obvious choice for
> a structure, since Python optimises these nicely.
> But is there a better choice? Is it worth building some sort of tree?

50M x 100 bytes = 5000M bytes = 5 GB.  Do you have 5 GB of memory?
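
The arithmetic above can be checked with a rough sketch like the one below. It assumes CPython 3 and uses a made-up 100-byte value as a stand-in for a real record; the figures cover the values only, so keys and the dictionary's own hash table would come on top of that.

```python
import sys

RECORDS = 50_000_000   # about 50 million records, as in the question
RECORD_BYTES = 100     # upper bound on record width, as in the question

# Raw payload: the 5 GB figure quoted above.
raw_bytes = RECORDS * RECORD_BYTES

# Hypothetical sample record; a real record would hold real data.
sample_value = b"x" * RECORD_BYTES

# In CPython, every bytes object carries a header on top of its payload.
per_object_bytes = sys.getsizeof(sample_value)

# Estimate for the values alone (no keys, no dict table).
in_memory_bytes = RECORDS * per_object_bytes

print(f"raw data:         {raw_bytes / 10**9:.1f} GB")
print(f"values in memory: {in_memory_bytes / 10**9:.1f} GB")
```

So the real in-memory footprint of a plain dictionary is likely to be noticeably more than the raw 5 GB.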

Since you are talking about key/value records, you can choose from GDBM
(gdbm), Berkeley DB (dbhash), or a disk-based dictionary front-end
(shelve).   You can now access GDBM databases from the Bash shell. :-)
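
As an illustration of the disk-based dictionary idea, here is a minimal sketch using shelve. It assumes Python 3, where shelve picks whatever dbm backend is available; the helper names (build_store, lookup), the file name "records", and the sample records are all hypothetical.

```python
import os
import shelve
import tempfile


def build_store(path, records):
    """Write (key, value) pairs into a new on-disk shelf at `path`."""
    # flag="n" always creates a fresh, empty database.
    with shelve.open(path, flag="n") as db:
        for key, value in records:
            db[key] = value  # shelve keys must be strings


def lookup(path, key):
    """Return the value stored for `key`, or None if the key is absent."""
    # flag="r" opens the existing database read-only.
    with shelve.open(path, flag="r") as db:
        return db.get(key)


# Hypothetical sample data standing in for the 50 million real records.
sample_records = [
    ("alice", b"record for alice"),
    ("bob", b"record for bob"),
]

store_path = os.path.join(tempfile.mkdtemp(), "records")
build_store(store_path, sample_records)

print(lookup(store_path, "alice"))
print(lookup(store_path, "missing"))
```

The data stays on disk and is accessed through dictionary syntax, so only the records actually looked up need to pass through memory.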

William Park, Open Geometry Consulting, <opengeometry at>
No, I will not fix your computer!  I'll reformat your harddisk, though.
