[Python-ideas] set.add(x) slower than if x in set:set.add(x)
tjreedy at udel.edu
Mon Sep 14 08:53:43 CEST 2009
Yuvgoog Greenle wrote:
> So this pattern is a valid python optimization?
That exactly not how I interpreted Raymond
> On Sun, Sep 13, 2009 at 10:37 PM, Raymond Hettinger
> <python at rcn.com
> <mailto:python at rcn.com>> wrote:
> On Sep 13, 2009, at 11:10 AM, Gerald Britton wrote:
> Here are some results:
> $ python -m timeit -n 1000000 -s 'with
> as f: s = set(w.strip("\n") for w in f)' 's.add("mother")'
> 1000000 loops, best of 3: 0.292 usec per loop
Try looking up and binding the method just once, any any experienced
Python programmer might do if doing repeated 'additions'.
-s'...: sadd=set(w.strip("\n") for w in f).add' 'sadd("mother")
> britton at TheBrittons:~$ python -m timeit -n 1000000 -s 'with
> open("/usr/share/dict/words") as f: s = set(w.strip("\n") for w
> in f)'
> 'if "mother" not in s:s.add("mother")'
> 1000000 loops, best of 3: 0.185 usec per loop
Add 'sadd = s.add' at end of setup, followed by
'if "mother" not in s: sadd("mother")
I doubt second will still be faster.
> the second example beats the first by about 36%
> Is the timing difference just the cost of the method lookup for
> or is something else happening that I'm not seeing?
> It is the something else you're not seeing ;-)
> On the first pass of the 1000000 loops, "mother" gets added.
> On the remaining passes the 'if "mother" not in set' test fails
> and the set.add() never gets executed. That latter operation
> is a bit more expensive than the contains-test because it
> includes the time to lookup and bind the add method.
My suggested alteration removes the repeated lookup and bind.
More information about the Python-ideas