Caching Strategies/Database question

Ype Kingma ykingma at accessforall.nl
Thu Apr 26 16:56:24 EDT 2001


Brian Kelley wrote:
> 
> I am using a dictionary to hold
> 
> {string:[list of integers]}
> 
> key: values
> 
> The dictionary is randomly accessed to add an integer to the list
> referenced by the string.
> 
> I have been exploring several differnent caching strategies ranging from
> bsddb3 to the metakit toolkit for caching/retrieving these to disk.  One
> of the problems that I am facing is that because the dictionary is
> randomly accessed after the database gets to a certain size performance
> goes WAY down with every strategy that I have tried.  (Note that
> integers are never removed if that helps)  I have had success speed-wise
> with MySQL but I simply cannot deliver that in my application, it's way
> too costly to administer.
>

Adding RAM will help, but probably only temporarily.
Since you're not deleting anything you might change the physical record
format to:
(string, nextInteger)
and simply append these records, ie. using a heap like structure.
Eventually replace the string by a smaller key and keep a separate
table of (string, smaller key).

Needless to say that although this might solve your integer adding
problem, retrieval is now not as easy as it was.
To solve that you might add backpointers and a pointer to the
last added nextInteger record in (string, smaller key) table.

No free lunch, sorry. Locality of reference is the only
real solution. You might try to build up in RAM, then
sort, then write to disk.

> Has anyone examined a similar problem?  Thanks in advance.
> 
> Brian Kelley

Good luck,
Ype
-- 
email at xs4all.nl



More information about the Python-list mailing list