[Python-ideas] dict changes [was: Ordered storage of keyword arguments]

Jim Jewett jimjjewett at gmail.com
Thu Oct 28 20:44:59 CEST 2010


On 10/28/10, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Thu, 28 Oct 2010 19:58:59 +0200
> spir <denis.spir at gmail.com> wrote:

>> What does the current implementation use as buckets?

> It uses an open addressing strategy. Each dict entry holds three
> pointer-sized fields: key object, value object, and cached hash value
> of the key.
> (set entries have only two fields, since they don't hold a value object)

Has anyone benchmarked not storing the hash value here?

For a string dict, that hash should already be available on the string
object itself, so it is redundant.  Keeping it obviously improves
cache locality, but ... it also makes the dict objects 50% larger, and
there is a chance that the strings themselves would already be in
cache anyhow.  And if strings were reliably interned, the comparison
check should normally just be a pointer compare -- possibly fast
enough that the "different hash" shortcut doesn't buy anything.
[caveats about still needing to go to the slower dict implementation
for string subclasses]

-jJ



More information about the Python-ideas mailing list