[Python-ideas] set.add() return value
Brandon Mintern
bmintern at gmail.com
Tue Feb 17 09:37:06 CET 2009
On Tue, Feb 17, 2009 at 2:36 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> if x in set:
> print "already there"
> else:
> set.add(x)
>
> versus:
>
> if set.add(x): # or another method name if you prefer
> print "already there"
These are truly different in terms of runtime performance, though. In
the first example, let's examine both cases. If x isn't in the set, we
hash x, perform a worst-case set lookup (because we have to scan the
entire list for whatever bucket that x hashes to), hash x again, and
then perform that same lookup again before adding x to the set. If x
is in the set, we perform one hash and one lookup.
In the second example, we always hash and lookup x exactly once,
adding it if necessary. This could turn out to be a non-trivial
difference in runtime if the set is large enough or x is an object
with a non-trivial hash. Sure the difference is a constant factor, but
it's a significant one, somewhere between 1.25-2 depending on the
percentage of "add"s that are unique items (if every item is unique,
we're looking at a 2x speedup for set operations).
But maybe that's just my premature optimization coming through,
Brandon
More information about the Python-ideas
mailing list