[Python-Dev] Non-string keys in namespace dicts
Neil Toronto
ntoronto at cs.byu.edu
Sun Dec 2 21:49:35 CET 2007
Nicko van Someren wrote:
> On 2 Dec 2007, at 03:09, Neil Toronto wrote:
>
>> Are there any use-cases for allowing namespace dicts (such as globals,
>> builtins and classes) to have non-string keys? I'm asking because I'm
>> planning on accelerating method lookups next, and the possibility of a
>> key compare changing the underlying dict could be a major pain. (It was
>> a minor pain for globals.)
>
> The only plausible use case I can think of might be wanting to use ints
> or longs as keys, though I've never seen it done. Of course this would
> be trivial to code around and it seems very much a fringe case, so I'd
> be in favour of deprecating non-string namespace keys if it's going to
> make look-ups go faster.
If you insert non-string keys into a namespace dict it'll slow down
lookups already. :) The dict will switch to the more general lookdict
from lookdict_string. Looks like it's just a bad idea all around.
It turned out not *that* hard to code around for attribute caching, and
the extra cruft only gets invoked on a cache miss. The biggest problem
isn't speed - it's that it's possible (though extremely unlikely), while
testing keys for equality, that a rich compare alters the underlying
dict. This causes the caching lookup to have to try to get an entry
pointer again, which could invoke the rich compare, which might alter
the underlying dict...
This problem already exists for general dicts with non-string keys (it
can blow the C stack) and attribute caching makes it a bit more likely
(the compare only has to insert or delete an item rather than cause a
resize), so it'd be nice if it didn't apply to identifiers.
As far as I know, though, the only way to get non-string keys into a
class dict is by using a metaclass.
Anyway, report: I've got an initial working attribute cache, using the
conveniently-named-and-left-NULL tp_cache. It's a nice speedup - except
on everything the standard benchmarks test, because their class
hierarchies are very shallow. :p If an MRO has more than two classes in
it, every kind of lookup (class method, object method, object attribute)
is faster. Having more than four or five makes things like self.variable
take less than half the time.
It'd be nice to have a benchmark with a deep class hierarchy. Does
anybody know of one?
I'm working on making it as fast as the original when the MRO is short.
Question for Guido: should I roll this into the fastglobals patch?
Neil
More information about the Python-Dev
mailing list