[Python-Dev] Python 3000: Special type for object attributes & map keys
Neil Toronto
ntoronto at cs.byu.edu
Wed Mar 19 19:54:49 CET 2008
Greg Ewing wrote:
> Neal Norwitz wrote:
>> Part of this is done, but very differently in that all strings used in
>> code objects are interned (stored in a dictionary
>
> And since two interned strings can be compared
> by pointer identity, I don't see how this differs
> significantly from the "unique integer" idea.
>
> If the integers were used to directly index an
> array instead of being used as dict keys, it
> might make a difference. The cost would be that
> every namespace would need to be as big as
> the number of names in existence, with most
> of them being extremely sparse.
And we already have a data structure for sparse arrays... it's called a
"dict". :)
If every attribute name were guaranteed to be an interned string (not
currently the case - attribute names can be any kind of object), it
might be possible to add another dict specialization for interned string
lookup. The wins would be that lookup could assume the string's hash is
valid, and equality comparison could be done via pointer comparison.
HOWEVER...
I think that the attribute cache patches that went into 2.6 and 3.0
mostly take care of lookup speed issues. They both assume strings are
interned already. A cache hit consists of calculating a cache index from
the string hash, ensuring that the attribute name at that index is
identical (via pointer comparison) to the attribute name to be looked
up, and returning the associated value. Lookup with an attribute that's
not a string or not interned is automatically a cache miss, and it
happens very rarely.
Specializing dicts for interned strings would optimize the cache miss.
(When I was making the 3.0 patch, I found that happened rarely on the
benchmarks and regression tests. It was somewhere around 5%.) The cache
miss isn't expensive just because of the dict lookup. The attribute has
to be searched for in every type and super-type of the object. The
occasional string equality test probably doesn't register.
I'd be happy to be shown to be wrong, though.
Neil
More information about the Python-Dev
mailing list