[Python-ideas] Fast global cacheless lookup
Neil Toronto
ntoronto at cs.byu.edu
Sun Nov 25 03:18:25 CET 2007
Jim Jewett wrote:
> On 11/22/07, Neil Toronto <ntoronto at cs.byu.edu> wrote:
>
>> ... What if a frame could maintain an
>> array of pointers right into a dictionary's entry table?
>
>> The dict notifies its observers on delitem, pop, popitem, resize and
>> clear. Nothing else is necessary - nothing else will change the address
>> of or invalidate an entry.
>
> I think this isn't quite true, because of DUMMY entries.
>
> Insert key1.
> Insert key2 that wants the same slot.
> Register an observer that cares about key2 but not key1.
>
> Delete key1. The key1 entry is replaced with DUMMY, but the entry
> for key2 is not affected.
>
> Look up key2 (by some other code which hasn't already taken this
> shortcut) and the lookdict function (as a side effect) moves key2 to
> the better location that key1 no longer occupies. As described, I
> think this breaks your cache.
Good grief old chap, you freaked me out.
Turns out it all still works. Whether the lookdict functions used to
move entries around I don't know, but now they don't. It's probably
because deletions are so rare compared to other operations that it's not
worth the extra logic in those tight little loops.
Mind if I keep rambling, just to make sure I've got it right? :)
It's the dummy entries that make lookup work at all. The lookdict
functions use them as flags so that they know to keep skipping around
the table, looking for an open entry or an entry with the right key.
It's basically: "If ep->me_key != key or ep->me_key == dummy, keep
trying different ep's. If I reach an empty ep, return the first dummy I
found, or that ep if I didn't find one. If I reach an ep with the right
key, return that."
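To make that concrete, here's a toy sketch of that probing rule in C. This is not CPython's actual lookdict - there's no real hashing or perturbation, just linear probing over a tiny fixed table, and the names entry, me_key and DUMMY are only meant to echo dictobject.c:

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8

/* Unique sentinel pointer standing in for CPython's dummy key. */
static const char *DUMMY = "<dummy>";

typedef struct {
    const char *me_key;   /* NULL = never used, DUMMY = deleted, else live */
    int me_value;
} entry;

/* Probe starting from `hash`, skipping mismatches and remembering the
 * first dummy seen (freeslot).  Stop at an empty slot (returning the
 * freeslot if we saw one) or at a matching key. */
static entry *lookup(entry *table, const char *key, size_t hash)
{
    entry *freeslot = NULL;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        entry *ep = &table[(hash + i) % TABLE_SIZE];
        if (ep->me_key == NULL)
            return freeslot != NULL ? freeslot : ep;
        if (ep->me_key == DUMMY) {
            if (freeslot == NULL)
                freeslot = ep;
        } else if (strcmp(ep->me_key, key) == 0) {
            return ep;
        }
    }
    return freeslot;  /* table exhausted: hand back the first dummy */
}
```

Note that a hit returns the entry where the key already lives - the dummy is only remembered, never swapped into.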
I wasn't completely satisfied by static analysis, so I traced the case
you brought up through both lookdict and lookdict_string. Here it is:
Assume hash(key1) == hash(key2).
Assume (without loss of generality) that for this hash, entries are
traversed in order 0, 1, 2...
Insert key1:

    0: key1
    1: NULL
    2: ...

Insert key2:

    0: key1
    1: key2
    2: ...

Delete key1:

    0: dummy
    1: key2
    2: ...
Look up key2 (trace):

("freeslot" keeps track of the first dummy found during the traversal;
it's NULL if none is found)

    start:
        ep = 0
        freeslot = 0    [ep->me_key == dummy]
    loop:
        ep = 1
        return ep       [ep->me_key == key2]
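The whole trace can be run end to end with a toy table. Again a hedged sketch, not dictobject.c: every key is pretended to hash to slot 0, probing is linear, and insert/del are only the minimal operations the trace needs. The point it demonstrates is that after the delete, looking up key2 returns the same entry address - nothing gets moved:

```c
#include <stddef.h>
#include <string.h>

#define SIZE 8
static const char *DUMMY = "<dummy>";   /* sentinel for deleted slots */

typedef struct { const char *me_key; int me_value; } entry;

/* Pretend every key hashes to slot 0 and probe linearly: 0, 1, 2, ... */
static entry *probe(entry *t, const char *key)
{
    entry *freeslot = NULL;
    for (size_t i = 0; i < SIZE; i++) {
        entry *ep = &t[i];
        if (ep->me_key == NULL)
            return freeslot != NULL ? freeslot : ep;
        if (ep->me_key == DUMMY) {
            if (freeslot == NULL)
                freeslot = ep;
        } else if (strcmp(ep->me_key, key) == 0) {
            return ep;
        }
    }
    return freeslot;
}

static void insert(entry *t, const char *key, int value)
{
    entry *ep = probe(t, key);
    ep->me_key = key;
    ep->me_value = value;
}

static void del(entry *t, const char *key)
{
    entry *ep = probe(t, key);
    ep->me_key = DUMMY;   /* leave a dummy; other entries stay put */
    ep->me_value = 0;
}
```

Running the trace - insert key1, insert key2, delete key1 - and then probing for key2 hands back &t[1], the same slot it was inserted into, with t[0] still holding the dummy.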
That last bit is where the entry-moving code would have gone, something
like this:

    if (freeslot != NULL) {
        /* refcount-neutral */
        *freeslot = *ep;
        ep->me_key = dummy;
        ep->me_value = NULL;
        return freeslot;
    }
    else
        return ep;
It might be a speed improvement if you assume the key is very likely to
be looked up again. But it's extra complexity in a speed-critical code
path, and you never know whether you've lengthened the traversal for
other lookups.
As long as it's a wash in the end, it might as well be left alone, at
least for the fast globals. :D
Neil