[Python-Dev] Non-string keys in namespace dicts

Neil Toronto ntoronto at cs.byu.edu
Mon Dec 3 23:26:09 CET 2007

Phillip J. Eby wrote:
> At 12:27 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Guido van Rossum wrote:
>> > How about subclasses of str? These have all the same issues...
>> Yeah. I ended up having it, per class, permanently revert to uncached
>> lookups when it detects that a class dict in the MRO has non-string
>> keys. That's flagged by lookdict_string, which uses PyString_CheckExact.
> I'm a bit confused here.  Isn't the simplest way to cache attribute 
> lookups to just have a cache dictionary in the type, and update that 
> dictionary whenever a change is made to a superclass?  That's 
> essentially how __slotted__ attribute changes on base classes work now, 
> isn't it?  Why do we need to mess around with the dictionary entries 
> themselves in order to do that?

The nice thing about caching pointers to dict entries is that they don't 
change as often as values do. There are fewer ways to invalidate an 
entry pointer: inserting set, resize, clear, and delete. If you cache 
values, non-inserting set could invalidate as well.

Because inserting into namespace dicts should be very rare, caching 
entries rather than values should reduce the number of times cache 
entries are invalidated to near zero. Updating is expensive, so that's 
good for performance.

Rare updating also means it's okay to invalidate the entire cache rather 
than single entries, so the footprint of the caching mechanism in the 
dict can be very small. For example, I've got a single 64-bit counter in 
each dict that gets incremented after every potentially invalidating 
operation. That comes down to 8 bytes of storage and two extra machine 
instructions (currently) per invalidating operation. The cache checks it 
against its own counter, and updating ensures that it's synced.

Some version of the non-string keys problem would exist with any caching 
mechanism, though. An evil rich compare can always monkey about with 
class dicts in the MRO. If a caching scheme caches values and doesn't 
account for that, it could return stale values. If it caches entries and 
doesn't account for that, it could segfault. I suppose you could argue 
that returning stale values is fitting punishment for using an evil rich 
compare, though the punishee isn't always the same person as the punisher.


More information about the Python-Dev mailing list