[sapug] Large dictionaries

Daryl Tester Daryl.Tester at iocane.com.au
Fri May 12 01:06:26 CEST 2006


Chris Foote wrote:

>> Is this a startup operation, or are you expecting to insert that
>> many records that often?

> At startup, and on receiving a signal every few hours.

Yeah, can't trust those dictionaries to hold onto anything these
days.  ;-)  (I'm presuming the dictionary isn't being dynamically
updated).

>> I was going to suggest using cPickle to
>> load a precomputed dictionary, but a quick test shows the performance
>> is probably worse.

> You'd run out of RAM pretty quickly parsing it as well :-)

The parsing is pretty good in that regard - very little state is
required to reconstruct a dictionary.  But on subsequent rethink
it's going to suffer the same insert problems that you're
experiencing, so as crap solutions go, that idea is right up
there with 'em.
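
For anyone who wants to check that rather than take my word for it,
here is a minimal sketch. It assumes a current Python 3, where cPickle
has been folded into pickle, and the helper names (build_dict,
time_call) and the record count are made up for illustration. It only
compares rebuilding the dictionary insert by insert against loading a
pickled copy; the numbers will vary by machine and version.

```python
import pickle
import time


def build_dict(n):
    """Build the dictionary the slow way: one insert per key."""
    d = {}
    for i in range(n):
        d[i] = i
    return d


def time_call(func, *args):
    """Return (result, elapsed seconds) for a single call to func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000  # hypothetical record count, not from the thread
    original, build_secs = time_call(build_dict, n)
    blob = pickle.dumps(original, protocol=pickle.HIGHEST_PROTOCOL)
    restored, load_secs = time_call(pickle.loads, blob)
    assert restored == original
    print("rebuild: %.3fs  unpickle: %.3fs" % (build_secs, load_secs))
```

If the two timings come out in the same ballpark, that is consistent
with the unpickler paying the same per-insert cost as the original
build.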

> Yes, that sounds like the way to go, but I can't believe that someone
> hasn't written some already.

I can see other hash table itches that people have scratched, but
not that one.  Looking at the 1.5.2 code for dictobject (it's the
only one I have conveniently unpacked) it would be straightforward
to add a resize method, possibly even to the constructor, but
then you'd wind up with a non-standard Python.

Of course, all this is assuming that it's the resize that's killing
your performance.  Remember the words of the Great Dilbert:

PHB: "Measure twice ... cut twice ..."
Wally: "And give the ruler a bad performance review?"
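
In that spirit, a rough sketch of the measuring part. It assumes
CPython on a current Python 3 and leans on sys.getsizeof() changing
whenever the dictionary's table is reallocated, which is an
implementation detail rather than a guarantee. The function names are
made up for illustration. Counting and timing are kept in separate
passes so the getsizeof() calls don't pollute the timing.

```python
import sys
import time


def count_resizes(n):
    """Count how often the dict's reported size changes over n inserts."""
    d = {}
    last_size = sys.getsizeof(d)
    resizes = 0
    for i in range(n):
        d[i] = None
        size = sys.getsizeof(d)
        if size != last_size:
            # The underlying table was reallocated (CPython detail).
            resizes += 1
            last_size = size
    return resizes


def time_inserts(n):
    """Time n plain inserts with no instrumentation in the loop."""
    d = {}
    start = time.perf_counter()
    for i in range(n):
        d[i] = None
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 1_000_000  # hypothetical record count, not from the thread
    print("resizes:", count_resizes(n))
    print("insert time: %.3fs" % time_inserts(n))
```

If the resize count is small next to the number of inserts, the time
is probably going somewhere else (hashing, collisions, or memory
pressure), and a presizing patch wouldn't buy much.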



-- 
Regards,
  Daryl Tester, IOCANE Pty. Ltd.

