[Python-ideas] Fast global cacheless lookup

Neil Toronto ntoronto at cs.byu.edu
Sun Nov 25 03:18:25 CET 2007

Jim Jewett wrote:
> On 11/22/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
>> ... What if a frame could maintain an
>> array of pointers right into a dictionary's entry table?
>> The dict notifies its observers on delitem, pop, popitem, resize and
>> clear. Nothing else is necessary - nothing else will change the address
>> of or invalidate an entry.
> I think this isn't quite true, because of DUMMY entries.
> Insert key1.
> Insert key2 that wants the same slot.
> Register an observer that cares about key2 but not key1.
> Delete key1.   The key1 entry is replaced with DUMMY, but the entry
> for key2 is not affected.
> Look up key2 (by some other code which hasn't already taken this
> shortcut) and the lookdict function (as a side effect) moves key2 to
> the better location that key1 no longer occupies.  As described, I
> think this breaks your cache.

Good grief old chap, you freaked me out.

Turns out it all still works. Whether the lookdict functions used to 
move entries around I don't know, but they don't now. It's probably 
because deletions are so rare compared to other operations that it's not 
worth the extra logic in those tight little loops.

Mind if I keep rambling, just to make sure I've got it right? :)

It's the dummy entries that make lookup work at all. The lookdict 
functions use them as flags so that they know to keep skipping around 
the table looking for an open entry or an entry with the right key. It's 
basically: "If ep->me_key == dummy or ep->me_key != key, I need to keep 
trying different ep's. If I reach an empty ep, return the first dummy I 
found, or that ep if I didn't find one. If I reach an ep with the right 
key, return that."
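That loop can be sketched in Python. This is a toy model of the C 
lookdict logic, not CPython's actual implementation: the sentinel 
objects, the explicit probe_order argument, and the function name are 
assumptions made for illustration.

```python
DUMMY = object()   # sentinel marking a deleted entry
EMPTY = None       # sentinel marking a never-used entry

def lookdict(table, key, probe_order):
    """Return the index of `key`'s live entry, or, if `key` is absent,
    the slot where it should be inserted: the first dummy seen on the
    probe sequence, else the empty slot that ended the probe."""
    freeslot = None
    for i in probe_order:
        entry = table[i]
        if entry is EMPTY:
            # End of the probe chain: prefer a dummy we passed earlier.
            return i if freeslot is None else freeslot
        if entry is DUMMY:
            if freeslot is None:
                freeslot = i   # remember the first dummy, keep probing
            continue
        if entry == key:
            return i           # found the live entry, returned in place
    raise RuntimeError("table full")
```

Note that a live entry is returned wherever it sits; passing a dummy on 
the way does not move it.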

I wasn't completely satisfied by static analysis, so I traced the case 
you brought up through both lookdict and lookdict_string. Here it is:

Assume hash(key1) == hash(key2).
Assume (without loss of generality) that for this hash, entries are 
traversed in order 0, 1, 2...

Insert key1:

0: key1
1: ...

Insert key2:

0: key1
1: key2
2: ...

Delete key1:

0: dummy
1: key2
2: ...

Look up key2 (trace):

("freeslot" keeps track of the first dummy found during the traversal; 
it is NULL if none is found)

   ep = 0
   freeslot = 0  [ep->me_key == dummy]
   ep = 1
   return ep     [ep->me_key == key2]
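The whole trace can be replayed with a small self-contained simulation 
(a toy Python model, assuming a three-slot table probed in the fixed 
order 0, 1, 2; the sentinels and helper names are illustrative):

```python
DUMMY, EMPTY = object(), None   # deleted vs. never-used slots
table = [EMPTY, EMPTY, EMPTY]   # three slots, probed in order 0, 1, 2

def insert(key):
    # Claim the first dummy or empty slot along the probe sequence.
    i = next(i for i, e in enumerate(table) if e is EMPTY or e is DUMMY)
    table[i] = key

def delete(key):
    table[table.index(key)] = DUMMY   # deletion leaves a dummy behind

def lookup(key):
    freeslot = None   # first dummy seen, if any
    for i, entry in enumerate(table):
        if entry is EMPTY:
            return i if freeslot is None else freeslot
        if entry is DUMMY:
            if freeslot is None:
                freeslot = i
        elif entry == key:
            return i   # found: entry is returned where it sits

insert("key1")
insert("key2")            # table: [key1, key2, empty]
delete("key1")            # table: [dummy, key2, empty]
assert lookup("key2") == 1   # key2 stays in slot 1; no entry moves
```

The final assertion is the point of the trace: the lookup records the 
dummy in slot 0 as freeslot but still returns key2's entry in slot 1 
untouched.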

That last step is where the entry-moving code would have gone, 
something like this:

if (freeslot != NULL) {
     /* refcount-neutral move into the first dummy slot */
     *freeslot = *ep;
     ep->me_key = dummy;
     ep->me_value = NULL;
     return freeslot;
}
return ep;

It might be a speed improvement if you assume that the key is very 
likely to be looked up again. But it's extra complexity in a 
speed-critical code path and you never know whether you lengthened the 
traversal for other lookups.

As long as it's a wash in the end, it might as well be left alone, at 
least for the fast globals. :D
