Integer and String Dictionary Keys for Fast Access

Jeff Epler jepler at unpythonic.net
Thu Oct 3 20:12:38 EDT 2002


On Thursday 03 October 2002 16:18, Lance wrote:
> > The advantage is that key lookups by integer are no doubt much faster than
> > by strings.

On Thu, Oct 03, 2002 at 04:38:42PM -0700, Sean 'Shaleh' Perry wrote:
> test it out in python, you may be surprised.  I don't think the type really 
> matters for dictionary look ups.

Timings:
0.286319971085	# d[1]
0.23636496067	# d[1000]
0.231885075569	# d['a']
0.233772993088	# d['a' * 10000]
0.216398954391	# l[1]
0.232298016548  # l[1000]

so not only is it about the same speed to get a string or int key from a
dict, it's about the same speed to get an int index from a list.  This is
independant of the length of the key.

of course, within functions, locals are accessed by simple indexing of a C
array with C ints which should be a bit faster still.

It's possible that in the future this could be extended to methods of
new-style classes which use __slots__ for instance variables.  When no
assignment to the first local is seen, then LOAD_FAST self / LOAD_ATTR a
could be optimized into LOAD_SLOT a.  Besides needing whole-function
analysis (but isn't this already provided to know eg that exec is not
called?) the problem here is that the slot number for the attribute 'a'
depends on the class, but you are free to write
    class C:
	pass
    def f(self, ...): pass
    C.f = f
There are probably other wrinkles that I've not considered that make this
harder than it sounds...

Jeff


import time

def test(d, k):
    for i in range(100000):
	d[k]

keys = (1, 1000, 'a', 'a' * 10000)
d = dict([(k, None) for k in keys])

for key in keys:
    t0 = time.time()
    test(d, key)
    t1 = time.time()

    print t1-t0

l = [None] * 1001

for key in (1, 1000):
    t0 = time.time()
    test(l, key)
    t1 = time.time()

    print t1-t0




More information about the Python-list mailing list