dict vs kjBuckets vs ???
tim_one at email.msn.com
Thu Jun 10 22:23:15 EDT 1999
> In some book on algorithms I've read that after inserting a limited
> number of items, the performance of operations on hash tables
> drops dramatically.
Depends on the details. What you read is true of some kinds of hash tables.
Python's dicts dynamically expand to keep the "load factor" under 2/3, so
what you read isn't applicable to Python in normal use.
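
To make the "load factor" point concrete, here is a minimal sketch of a table that doubles itself whenever more than 2/3 of its slots are in use. It is a toy and not CPython's actual dict implementation: the class name ToyTable, the starting size of 8, the use of linear probing, and the doubling policy are all assumptions chosen for illustration. Only the 2/3 threshold comes from the text above.

```python
class ToyTable:
    """Toy open-addressing hash table (illustrative only, not CPython's dict)."""

    MAX_LOAD = 2 / 3  # resize threshold, taken from the 2/3 figure in the post

    def __init__(self, size=8):  # starting size is an arbitrary assumption
        self.slots = [None] * size  # each slot is None or a (key, value) pair
        self.used = 0

    def _find(self, key):
        # Linear probing: start at the hashed slot and walk forward until we
        # reach either the key itself or an empty slot.
        i = hash(key) % len(self.slots)
        while self.slots[i] is not None and self.slots[i][0] != key:
            i = (i + 1) % len(self.slots)
        return i

    def __setitem__(self, key, value):
        i = self._find(key)
        if self.slots[i] is None:
            self.used += 1
        self.slots[i] = (key, value)
        # Expand as soon as the load factor would exceed the threshold.
        if self.used > self.MAX_LOAD * len(self.slots):
            self._grow()

    def __getitem__(self, key):
        entry = self.slots[self._find(key)]
        if entry is None:
            raise KeyError(key)
        return entry[1]

    def _grow(self):
        # Double the slot array and reinsert every existing entry.
        old = [e for e in self.slots if e is not None]
        self.slots = [None] * (2 * len(self.slots))
        for k, v in old:
            self.slots[self._find(k)] = (k, v)

    def load_factor(self):
        return self.used / len(self.slots)


table = ToyTable()
for n in range(10000):
    table[n] = n * n
print(len(table.slots), table.load_factor())
```

Because the table grows before it fills up, probe sequences stay short no matter how many items go in, which is why the slowdown described in the book does not show up in normal use.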
> I plan to write a program that would store lots (in range of 10M or even
> more) of relatively small objects (a few hundred bytes at most), so what
> do you think I should use?
Let's do a little math <wink>: 10M * 100 = ?, a lower bound on what you're
contemplating. Do you have gigabytes of RAM?
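
Spelled out, the arithmetic looks like this. The sketch counts only the raw payload at 100 bytes per object; per-object and per-dict-entry overhead is deliberately left out (it varies by Python version and platform), so the true in-memory footprint would be higher. The helper name raw_payload_bytes is made up for the example.

```python
def raw_payload_bytes(n_objects, bytes_per_object):
    """Raw payload size only; ignores all interpreter and container overhead."""
    return n_objects * bytes_per_object


low = raw_payload_bytes(10_000_000, 100)
print(low)           # total bytes of payload
print(low / 2**30)   # the same figure expressed in GiB
```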
> I thought about dictionaries, kjBuckets, or maybe even a library called
> Metakit for Python (http://www.equi4.com/metakit/info/README-Python.html).
> what-do-you-think-ly y'rs
You don't really want to know <wink>. Memory-based data structures aren't
going to work for the size of thing you have in mind. If you can make it
fly at all, you'll likely require a powerful database, so of those choices
Metakit is the only approach that's not dead on arrival.
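
For a feel of what a disk-backed store looks like from Python, here is a rough sketch. Note that it does not use Metakit: the standard-library sqlite3 module is swapped in purely as a stand-in, and the table name, column names, and the open_store/put/get helpers are hypothetical. The point is only that the objects live in a file rather than in RAM, and are fetched by key on demand.

```python
import os
import sqlite3
import tempfile


def open_store(path):
    """Open (or create) a file-backed key/payload table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS objects (key TEXT PRIMARY KEY, payload BLOB)"
    )
    return conn


def put(conn, key, payload):
    # INSERT OR REPLACE gives dict-like "assign or overwrite" behaviour.
    conn.execute(
        "INSERT OR REPLACE INTO objects (key, payload) VALUES (?, ?)",
        (key, payload),
    )


def get(conn, key):
    row = conn.execute(
        "SELECT payload FROM objects WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        raise KeyError(key)
    return row[0]


# Small demonstration: 1000 objects of 200 bytes each, stored in a temp file.
db_path = os.path.join(tempfile.mkdtemp(), "objects.db")
conn = open_store(db_path)
for i in range(1000):
    put(conn, f"obj{i}", b"x" * 200)
conn.commit()  # one commit for the whole batch, not one per insert
sample = get(conn, "obj42")
conn.close()
```

Batching many inserts into a single commit matters at the 10M scale, since committing after every object would be far slower.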
better-still-write-it-in-perl<wink>-ly y'rs - tim