[Python-Dev] Non-string keys in namespace dicts

Neil Toronto ntoronto at cs.byu.edu
Sun Dec 2 21:49:35 CET 2007

Nicko van Someren wrote:
> On 2 Dec 2007, at 03:09, Neil Toronto wrote:
>> Are there any use-cases for allowing namespace dicts (such as globals,
>> builtins and classes) to have non-string keys? I'm asking because I'm
>> planning on accelerating method lookups next, and the possibility of a
>> key compare changing the underlying dict could be a major pain. (It was
>> a minor pain for globals.)
> The only plausible use case I can think of might be wanting to use ints 
> or longs as keys, though I've never seen it done.  Of course this would 
> be trivial to code around and it seems very much a fringe case, so I'd 
> be in favour of deprecating non-string namespace keys if it's going to 
> make look-ups go faster.

If you insert non-string keys into a namespace dict it'll slow down 
lookups already. :) The dict will switch to the more general lookdict 
from lookdict_string. Looks like it's just a bad idea all around.

It turned out not *that* hard to code around for attribute caching, and 
the extra cruft only gets invoked on a cache miss. The biggest problem 
isn't speed - it's that it's possible (though extremely unlikely), while 
testing keys for equality, that a rich compare alters the underlying 
dict. This causes the caching lookup to have to try to get an entry 
pointer again, which could invoke the rich compare, which might alter 
the underlying dict...

This problem already exists for general dicts with non-string keys (it 
can blow the C stack) and attribute caching makes it a bit more likely 
(the compare only has to insert or delete an item rather than cause a 
resize), so it'd be nice if it didn't apply to identifiers.

As far as I know, though, the only way to get non-string keys into a 
class dict is by using a metaclass.

Anyway, report: I've got an initial working attribute cache, using the 
conveniently-named-and-left-NULL tp_cache. It's a nice speedup - except 
on everything the standard benchmarks test, because their class 
hierarchies are very shallow. :p  If an MRO has more than two classes in 
it, every kind of lookup (class method, object method, object attribute) 
is faster. Having more than four or five makes things like self.variable 
take less than half the time.

It'd be nice to have a benchmark with a deep class hierarchy. Does 
anybody know of one?

I'm working on making it as fast as the original when the MRO is short. 
Question for Guido: should I roll this into the fastglobals patch?


More information about the Python-Dev mailing list