[Python-ideas] set.add(x) slower than if x in set:set.add(x)
ncoghlan at gmail.com
Mon Sep 14 15:20:27 CEST 2009
Yuvgoog Greenle wrote:
> So this pattern is a valid python optimization? Funky...
> Sadly, there's no way around it unless the interpreter somehow did it
> magically for you.
The interpreter has no way of knowing a priori that the branch won't be
taken 999,999 times out of 1,000,000.
Think about what you are actually comparing here:
Average speed of 1 million calls to s.add(x)
Average speed of 1 million calls to "x in s" + one call to s.add(x)
Now think about the fact that s.add(x) includes a containment test plus
function call and name lookup overhead even in the case that the item is
already present in the set.
All overhead included:
$ python -m timeit -s "s = set()" "s.add(1)"
1000000 loops, best of 3: 0.197 usec per loop
Lose the attribute lookup:
$ python -m timeit -s "s = set()" -s "sadd = s.add" "sadd(1)"
10000000 loops, best of 3: 0.146 usec per loop
Skip the function call altogether most of the time:
$ python -m timeit -s "s = set()" "if 1 not in s: s.add(1)"
10000000 loops, best of 3: 0.101 usec per loop
Just do the containment test:
$ python -m timeit -s "s = set()" "1 not in s"
10000000 loops, best of 3: 0.1 usec per loop
Now then, lets also look at the absolute numbers we're discussing here
(on my machine, anyway). Is the fastest version twice as fast as the
slowest version? Yes it is. But that difference is only 97
*nano*seconds. And relative to the recommended approach of caching the
attribute looking, we're only saving 45 nanoseconds.
And in more realistic use cases where some items are already in the set
and some aren't, the "I'll check first" implementation can become a
pessimisation instead of an optimisation. To emphasise the point, we'll
go to the other extreme where the item is added to the set every time:
All the overhead:
$ python -m timeit -s "s = set()" "s.add(1)" "s.clear()"
1000000 loops, best of 3: 0.374 usec per loop
The "optimised" approach:
$ python -m timeit -s "s = set()" "if 1 not in s: s.add(1)" "s.clear()"
1000000 loops, best of 3: 0.444 usec per loop
Oops, looks like the approach that saves us 45 nanoseconds when the item
is already in the set may cost us up to *70* nanoseconds when it turns
out we need to add the item after all.
Caching attribute lookups before time critical loops is a good
optimisation technique that most experienced Python programmers learn.
Pre-checking a condition that a called function is just going to check
again and bail out quickly in less than 50% of cases? Usually a bad idea
- the extra checks made when the function is invoked anyway will often
cancel out any gains you make from avoiding the function call overhead.
That said, as with any micro-optimisation: time it on a range of typical
and end-case data sets. If you avoid the function call overhead often
enough, doing a pre-check may be a net win for some inner loops.
Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
More information about the Python-ideas