[Python-ideas] set.add(x) slower than if x in set:set.add(x)

Gerald Britton gerald.britton at gmail.com
Sun Sep 13 20:10:45 CEST 2009


Hi -- This is maybe the wrong list for this question.  If so, would
someone please redirect me?

I stumbled across a performance anomaly wrt the set.add method.  My
idea was that if I try to add something via set.add, the method has to
first check if the new item is already in the set, since set items are
supposed to be unique.  Then, on a whim, I stuck an "if x not in set"
condition in front of it.  I was surprised to learn that this latter
approach runs faster!  Here are some results:

$ python -m timeit -n 1000000 \
    -s 'with open("/usr/share/dict/words") as f: s = set(w.strip("\n") for w in f)' \
    's.add("mother")'
1000000 loops, best of 3: 0.292 usec per loop

$ python -m timeit -n 1000000 \
    -s 'with open("/usr/share/dict/words") as f: s = set(w.strip("\n") for w in f)' \
    'if "mother" not in s:s.add("mother")'
1000000 loops, best of 3: 0.185 usec per loop

The second example beats the first by about 37% (0.185 vs. 0.292 usec
per loop).

Is the timing difference just the cost of the method lookup for s.add,
or is something else happening that I'm not seeing?
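
One way to probe this is to time the pieces separately.  The sketch
below is only an illustration, not a definitive benchmark: it assumes
Python 3.5 or later (for the globals argument of timeit.repeat), and it
uses a synthetic set in place of /usr/share/dict/words.  The names
words, namespace, statements and measure are made up for this example.
Since "mother" is presumably already in the set, the guarded statement
never reaches s.add at all, so it is compared against a pre-bound add
(no attribute lookup inside the loop) and a bare membership test.

```python
import timeit

# Assumption: a synthetic set stands in for /usr/share/dict/words.
words = {"word%d" % i for i in range(100_000)}
words.add("mother")  # the key is already present before timing starts

# Names visible to the timed statements; "add" is the method bound once.
namespace = {"s": words, "add": words.add}

statements = {
    "s.add": 's.add("mother")',
    "guarded s.add": 'if "mother" not in s: s.add("mother")',
    "pre-bound add": 'add("mother")',
    "membership only": '"mother" in s',
}


def measure(stmt, number=1_000_000, repeat=3):
    """Return the best per-loop time in microseconds for one statement."""
    best = min(timeit.repeat(stmt, globals=namespace,
                             number=number, repeat=repeat))
    return best / number * 1e6


results = {label: measure(stmt) for label, stmt in statements.items()}
for label, usec in results.items():
    print("%-16s %.3f usec per loop" % (label, usec))
```

If "pre-bound add" lands close to "guarded s.add", the attribute lookup
would explain most of the gap; if it stays close to plain "s.add", the
cost is more likely in the method call itself.  Actual numbers depend on
the interpreter version and the machine.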

-- 
Gerald Britton


