Optimizing size of very large dictionaries
python at bdurham.com
Wed Jul 30 20:29:39 EDT 2008
Are there any techniques I can use to strip a dictionary data
structure down to the smallest memory overhead possible?
I'm working on a project where my available RAM is limited to 2 GB,
and I would like to use very large dictionaries instead of a
traditional database.
Background: I'm trying to identify duplicate records in very
large text-based transaction logs. I detect duplicates by
computing a SHA1 checksum of each record and using that checksum
as a dictionary key. This works well except for several files so
large that their checksum dictionaries exceed my workstation's
2 GB of RAM.
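For reference, here is a minimal sketch of the approach as I've
described it; the one-record-per-line format and the function name
are assumptions on my part:

    import hashlib

    def find_duplicates(path):
        """Report duplicate records in a transaction log, keyed by
        the SHA1 digest of each line (one record per line assumed)."""
        seen = {}          # binary digest -> line number of first occurrence
        duplicates = []
        with open(path, "rb") as log:
            for lineno, record in enumerate(log, 1):
                digest = hashlib.sha1(record).digest()  # 20-byte binary key
                if digest in seen:
                    duplicates.append((lineno, seen[digest]))
                else:
                    seen[digest] = lineno
        return duplicates

Two details in the sketch already trim memory: .digest() returns a
20-byte binary string rather than the 40-character hex string from
.hexdigest(), roughly halving the key size, and if you only need
membership tests (not the first-occurrence line numbers), a plain
set of digests carries less overhead than a dictionary.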
Thank you,
Malcolm