Speeding dictionary lookup of instances

Andrew Dalke dalke at dalkescientific.com
Fri Dec 21 19:32:30 CET 2001

Brian Kelley:
>This speeds up dictionary lookup by a factor of 20.  Not that
>implementing a __hash__ method to return the handle won't speed up the

Are you saying the same thing as method calls are expensive
in Python?  Or that attribute lookups are expensive?

>It turns out this the speed of this is identical to the speed of using
>the handle.  I like using the handle because the dictionary is picklable
>  with the instance.  id(f) changes during different sessions.

You also need to pickle the id generator state as otherwise
you may have duplicate values created during different sessions.

>So if you want faster lookups using instances as keys, use a handle or
>the id.
>Attached is some simple timing code showing these three methods.

I ran it on a Linux/Alpha box with python 2.2b2+.  I'm not
getting a factor of 20 performance difference.

[dalke at pw600a ~/src]$ python spam.py
time for lookups of instances 0.0959539413452
time for lookups of integers 0.0619969367981
time for lookups of ids 0.0673110485077
[dalke at pw600a ~/src]$

I also tried this class, which reduces one level of lookup.

class FastHandledNode:
    def __init__(self, data, HandleGenerator=IndexGenerator()):
        self.data = data
        handle = HandleGenerator.next()
        def _hash(handle = handle):
            return handle
        self.__hash__ = _hash

    def __repr__(self):
        return "Node(%s)"%`self.data`

That gives me

time for lookups of instances 0.0949679613113
time for lookups of integers 0.0595859289169
time for lookups of ids 0.0635080337524
time for fast lookups 0.0655270814896

so it looks like __hash__ lookup in class namespace is
taking up a good chunk of time.

By the way, I also tried this on Python 1.5.2 (needed to
replace a += of yours)

time for lookups of instances 0.0599880218506
time for lookups of integers 0.0540360212326
time for lookups of ids 0.0583939552307
time for fast lookups 0.0580099821091

Looks like Python is now slower for the code you use.

                    dalke at dalkescientific.com

More information about the Python-list mailing list