On 11/22/07, Neil Toronto
... What if a frame could maintain an array of pointers right into a dictionary's entry table?
The dict notifies its observers on delitem, pop, popitem, resize and clear. Nothing else is necessary - nothing else will change the address of or invalidate an entry.
I think this isn't quite true, because of DUMMY entries. Insert key1. Insert key2 that wants the same slot. Register an observer that cares about key2 but not key1. Delete key1. The key1 entry is replaced with DUMMY, but the entry for key2 is not affected. Look up key2 (by some other code which hasn't already taken this shortcut) and the lookdict function (as a side effect) moves key2 to the better location that key1 no longer occupies. As described, I think this breaks your cache. Of course you can get around this by just not moving things without a resize, but that is likely to be horrible for the (non-namespace?) dictionaries that do see frequent deletions. Another way around it is to also notify the observers whenever lookdict moves an entry; I'm not sure how that would affect normal lookup performance. A more radical change is to stop exposing the internal structure at all. For example, a typical namespace might instead be represented as an array of values, plus a dict mapping names to indices. The cost would be an extra pointer for each key ever in the dictionary (since you wouldn't reuse positional slots), and the savings would be that most lookups could just grab namespace[i] without having to even check that they got the right key, let alone following a trail of collision resolutions.
To speed up globals access, an auxiliary object to functions and frames registers itself as an observer to func_globals and __builtins__.
Note that func_globals probably *will* be updated again in the future, if only to register this very function with its module. You could wait to "seal" a namespace until you think all its names are known, or you could adapt the timestamp solution suggested in http://bugs.python.org/issue1616125 -jJ