[sapug] Large dictionaries
chris at inetd.com.au
Thu May 11 15:36:51 CEST 2006
On Thu, 11 May 2006, Michael Cohen wrote:
> Hi Chris,
> You probably need to balance the hash table. Normally a dict works by hashing
> the key into an array of a certain size. The array is indexed by the hash
> value, and a linked list of values is attached to each slot in the array.
> If you load the table too heavily, there will be many hash collisions, which
> will cause the linked lists to get very long. This slows down access time
> dramatically, since lookups degrade into linear scans of the chains.
> Apart from not using a tuple as the dict key (I'm not sure how efficient
> Python's algorithm for hashing a tuple struct is), I would recommend
> splitting the single dict into a dict of dicts. This has the effect of
> spreading the load on the hash table:
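(A minimal sketch of the dict-of-dicts idea he's describing, assuming the
keys are (a, b) tuples -- the actual key layout isn't shown here:)

    # Flat dict keyed by (a, b) tuples -- all entries share one hash table:
    flat = {}
    flat[(12, 34)] = 'value'

    # Dict of dicts: outer dict keyed on the first element, inner dicts
    # keyed on the second, so entries spread across many smaller tables:
    nested = {}

    def put(a, b, value):
        nested.setdefault(a, {})[b] = value

    def lookup(a, b):
        return nested[a][b]

    put(12, 34, 'value')
    assert lookup(12, 34) == 'value'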
Good thinking, but alas, the overhead of creating the second-level
dictionaries was too much:
             dictionary   metakit   sqlite [1]   hash-in-hash
             ----------   -------   ----------   ------------
1M numbers         8.8s     22.2s        89.6s          21.9s
2M numbers        24.0s     43.7s       190.0s          56.8s
5M numbers       115.3s    105.4s          N/A     > 185s [2]

[1] pysqlite V1 & sqlite V3.
[2] I had to kill the process because it chewed up all 1 GB of RAM :-(
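(The post doesn't include the timing code; a rough sketch of the kind of
loop that would produce the dict and hash-in-hash columns, with hypothetical
random integer keys standing in for the real test data:)

    import random
    import time

    def bench(n):
        # Generate n hypothetical (a, b) integer keys; the real data
        # used in the thread isn't shown in the post.
        keys = [(random.randrange(10000), random.randrange(10000))
                for _ in range(n)]

        start = time.time()
        flat = {}
        for a, b in keys:
            flat[(a, b)] = 1                  # plain dict keyed by tuples
        flat_time = time.time() - start

        start = time.time()
        nested = {}
        for a, b in keys:
            nested.setdefault(a, {})[b] = 1   # hash-in-hash
        nested_time = time.time() - start

        print('%dM numbers: dict %.1fs, hash-in-hash %.1fs'
              % (n // 1000000, flat_time, nested_time))

    bench(1000000)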
Thanks for the suggestion, but no go.
Chris Foote <chris at inetd.com.au>
Inetd Pty Ltd T/A HostExpress
Phone: (08) 8410 4566