"Eyal Lotem"
I believe that it is possible to add a useful feature to dicts, that will enable some interesting performance features in the future.
Dict lookups are in general supposed to be O(1) and they are indeed very fast. However, there are several drawbacks: A. O(1) is not always achieved, collisions may still occur
Note that O(1) is constant, not necessarily 1. Assuming that the hash function that Python uses is decent (it seems to work well), then as per the load factor of 2/3, then you get an expected number of probes = 1 + (2/3)^2 + (2/3)^3 + (2/3)^4 + ..., which sums to 3. Now, if you have contents that are specifically designed to screw the hash function, then you are going to get poor performance. But this is the case for any particular hash function; there exists inputs that force it to perform poorly. Given the above load factor, if 3 expected probes is too many, you can use d.update(d) to double the size of the dictionary, forcing the load factor to be 1/3 or less, for an expected number of probes = 1.5 .
B. Some lookups use static "literal" keys, and could benefit from accessing a dict item directly (very fast O(1) _worst_ case, without even a function call when done from the C side).
[snip] What you are proposing is to add a level of indirection between some pointer in the dictionary to some special PyDictItem object. This will slow Python execution when such a thing is used. Why? The extra level of indirection requires another pointer following, as well as a necessary check on the bitmap you are proposing, nevermind the additional memory overhead of having the item (generally doubling the size of dictionary objects that use such 'exported' items). You don't mention what algorithm/structure will allow for the accessing of your dictionary items in O(1) time, only that after you have this bitmap and dictioanry item thingy that it will be O(1) time (where dictionaries weren't already fast enough natively). I don't believe you have fully thought through this, but feel free to post C or Python source that describes your algorithm in detail to prove me wrong. You should note that Python's dictionary implementation has been tuned to work *quite well* for the object attribute/namespace case, and I would be quite surprised if anyone managed to improve upon Raymond's work (without writing platform-specific assembly). - Josiah