[Tutor] Performance difference, ``in'' vs ``has_key()''

Bill Campbell bill at celestial.net
Mon Jul 18 18:55:26 CEST 2005

On Sun, Jul 17, 2005, Danny Yoo wrote:
>> A related question is where's the trade-off between using ``in'' with a
>> list, and a dictionary?  I presume that using it with small hashes will
>> be faster than dictionries since it doesn't have to calculate the
>> hashes.
>Hi Bill,
>Scanning for an elements in a list is a "linear" operation, in the sense
>that the time it takes to search is proportional to the size of the list.
>(Big list == big search time.)

I just noticed that I said it backwards in my first post, ``using it with
small hashes'' should have been ``using it with small lists'' (and my perl
background leaks out referring to dictionaries as hashes :-).

>This doesn't mean that dictionaries are always faster than lists: as you
>know, calculating hash values can take time.  But the cost of hashing is
>usually negligible, since many of Python primitive data types (like
>strings) automatically cache their hash values too!

This would say that it's better to create the dictionary with string keys
rather than tuples, but that seems pretty obvious in any case.

The problem I'm working on involves going through a large list of invoices
that are now zero balance, to purge those before a certain date that have
no payment applications after that date.  I have a dictionary of zero-
balance invoices containing invoice objects, and each invoice object
contains a list of invoice keys applied to it.  This internal list may well
contain keys that refer to invoices that are either non-zero or have a date
after the cutoff date.

# begin code snippet
global invoices # dictionary of all zero balance invoices with date <= cutoff

deleted = True
while deleted:
    deleted = False
    keys = invoices.keys()
    for key in keys:
        # use try/except since the instance may be deleted
        try: invoice = invoices[keys]
        except KeyError: continue
        for appKey in invoice.appKeys:
            if not appKey in invoices:
                deleted = True
				del invoices[key] # this invoice can't be purged
                for appKey in invice.appKeys:
                    try: del invoices[appKey]
                    except KeyError: pass

# finish processing invoices

>A good book in algorithms will almost always cover the performance
>characteristics of those two strategies; there are also a lot of good
>resources on the web about them.  NIST has two articles on those two:
>    http://www.nist.gov/dads/HTML/linearSearch.html
>    http://www.nist.gov/dads/HTML/hashtab.html

Thanks for the references (occassionaly there's something that
government does that's actually useful :-).

INTERNET:   bill at Celestial.COM  Bill Campbell; Celestial Software LLC
UUCP:               camco!bill  PO Box 820; 6641 E. Mercer Way
FAX:            (206) 232-9186  Mercer Island, WA 98040-0820; (206) 236-1676
URL: http://www.celestial.com/

When the customer has beaten upon you long enough, give him what he asks
for, instead of what he needs.  This is very strong medicine, and is
normally only required once.
    -- The Consultant's Curse:

More information about the Tutor mailing list