Speed revisited

Steven Bethard steven.bethard at gmail.com
Sat Jan 8 19:57:30 EST 2005


Bulba! wrote:
> Following advice of two posters here (thanks) I have written two
> versions of  the same program, and both of them work, but the
> difference in speed is drastic, about 6 seconds vs 190 seconds
> for about 15000 of processed records, taken from 2 lists of
> dictionaries.
> 
> I've read "Python Performance Tips" at
> 
> http://manatee.mojam.com/~skip/python/fastpython.html
> 
> ..but still don't understand why the difference is so big. 
> 
[snip]
> 
> # snippet 1, this runs in about 6 seconds
> !def prepend(l):
> !    map = {}
> !    for d in l:
> !        key = d['English']
> !        map[key] = d
> !    return map
> !
> !old_map = prepend(oldl)
> !new_map = prepend(newl)
> !
> !for engphrase in old_map:
> !     if engphrase in new_map:
> !         o = old_map[engphrase]
> !         n = new_map[engphrase]
> !         cm.writerow(matchpol(o,n))
> 
> 
> # snippet 2, this needs 190 seconds
> !while 1:
> !    if len(oldl) == 0 or len(newl) == 0:
> !        break
> !    if oldl[o]['English'] == newl[n]['English']:
> !        cm.writerow(matchpol(oldl[o], newl[n]))
> !        del oldl[o]
> !        del newl[n]
> !        o, n = 0, 0
> !        continue
> !    elif cmp(oldl[o]['English'], newl[n]['English']) < 0:
> !        if o == len(oldl):
> !            cm.writerow(newl[0])
> !            del(newl[0])
> !            o, n = 0, 0
> !            continue
> !        o+=1
> !    elif cmp(oldl[o]['English'], newl[n]['English']) > 0:
> !        if n == len(newl):
> !            cm.writerow(newl[0])
> !            del(oldl[0])
> !            o, n = 0, 0
> !            continue
> !        n+=1    

I believe you're running into the fact that deleting from anywhere but 
the end of a list in Python is O(n), where n is the number of items in 
the list.  Consider:

---------- test.py ----------
def delfromstart(lst):
     while lst:
         del lst[0]

def delfromend(lst):
     for i in range(len(lst)-1, -1, -1):
         del lst[i]
-----------------------------

[D:\Steve]$ python -m timeit -s "import test" 
"test.delfromstart(range(1000))"
1000 loops, best of 3: 1.09 msec per loop

[D:\Steve]$ python -m timeit -s "import test" "test.delfromend(range(1000))"
1000 loops, best of 3: 301 usec per loop

Note that Python lists are implemented basically as arrays, which means 
that deleting an item from anywhere but the end of the list is O(n) 
because all items in the list must be moved down to fill the hole.

Repeated deletes from a list are generally not the way to go, as your 
example shows. =)

Steve



More information about the Python-list mailing list