[Python-Dev] Non-string keys in namespace dicts
ntoronto at cs.byu.edu
Mon Dec 3 23:26:09 CET 2007
Phillip J. Eby wrote:
> At 12:27 PM 12/3/2007 -0700, Neil Toronto wrote:
>> Guido van Rossum wrote:
>> > How about subclasses of str? These have all the same issues...
>> Yeah. I ended up having it, per class, permanently revert to uncached
>> lookups when it detects that a class dict in the MRO has non-string
>> keys. That's flagged by lookdict_string, which uses PyString_CheckExact.
> I'm a bit confused here. Isn't the simplest way to cache attribute
> lookups to just have a cache dictionary in the type, and update that
> dictionary whenever a change is made to a superclass? That's
> essentially how __slotted__ attribute changes on base classes work now,
> isn't it? Why do we need to mess around with the dictionary entries
> themselves in order to do that?
The nice thing about caching pointers to dict entries is that they don't
change as often as values do. There are fewer ways to invalidate an
entry pointer: inserting set, resize, clear, and delete. If you cache
values, non-inserting set could invalidate as well.
Because inserting into namespace dicts should be very rare, caching
entries rather than values should reduce the number of times cache
entries are invalidated to near zero. Updating is expensive, so that's
good for performance.
Rare updating also means it's okay to invalidate the entire cache rather
than single entries, so the footprint of the caching mechanism in the
dict can be very small. For example, I've got a single 64-bit counter in
each dict that gets incremented after every potentially invalidating
operation. That comes down to 8 bytes of storage and two extra machine
instructions (currently) per invalidating operation. The cache checks it
against its own counter, and updating ensures that it's synced.
Some version of the non-string keys problem would exist with any caching
mechanism, though. An evil rich compare can always monkey about with
class dicts in the MRO. If a caching scheme caches values and doesn't
account for that, it could return stale values. If it caches entries and
doesn't account for that, it could segfault. I suppose you could argue
that returning stale values is fitting punishment for using an evil rich
compare, though the punishee isn't always the same person as the punisher.
More information about the Python-Dev