dictionary.update() enhancement?

Mark J maj64 at hotmail.com
Fri Oct 12 19:06:07 EDT 2001


Mark J wrote:
>> Because it's not as straight-forward as saying d.update(other,
>> collision_handler) and it's about 10 times slower.

Erik Max Francis <max at alcyone.com> wrote:
> Ten times slower?  What in the world gives you that idea?

Try it:

import time

dictsize=1000
numloops=10000

d={}
other={}
for i in range(dictsize):  #populate the dicts
    d[i]=None
    other[i]=None

def withupdate(other):
    for i in range(numloops):
        d.update(other)

def withloop(other):
    for i in range(numloops):
        for k, v in other.items():
            if d.has_key(k):
                #this "collision function" will merely duplicate
                #update's current behavior;
                #i.e. overwrite the existing value.
                d[k]=v
            else: #new key
                d[k]=v

print "Dictionary size: %s  Number of loops: %s" % (dictsize,
numloops)
start = time.clock()
withupdate(other)
finish = time.clock()
print "with update:  %5.2fs"%(finish-start)
start = time.clock()
withloop(other)
finish = time.clock()
print "with loop:    %5.2fs"%(finish-start)

******  Results of run on RedHat 7.1, Dell OptiPlex GX150, 512MB RAM
Dictionary size: 1000  Number of loops: 10000
with update:   0.81s
with loop:    19.36s
>>> 19.36/0.81
23.901234567901231

Ignoring the other benefits to an unhanced update function, that's a
big speedup over hand-iteration.  Different test runs using different
parameters were roughly the same.

The above code merely preserves the semantics of the existing update
function since that's the fairest test until dictionary.update() is
changed.  (Note:  removing the conditional in withloop() still keeps
it about 10x slower.)

Of course, the performance difference will decrease as the complexity
of the collision function increases, but many useful collision
functions are about as simple as or even simpler than the one
currently used by dictionary.update().

By claiming only 10x faster in my last message, I was trying to
(reasonably and fairly?) account for the extra overhead of running the
collision function within update().

Please correct me if I'm missing something.  I'd also be interesting
in hearing any big discrepancies on the performance factor on
different platforms.

Mark



More information about the Python-list mailing list