Now what is the fastest idiom equivalent to:
obj in list
when I'm interested in identity (is) and not equality?
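To make the distinction concrete, here is a minimal sketch written for present-day Python (an assumption; the attached script targets the Python of its time). The helper name `contains_identical` is hypothetical and not part of any API. It illustrates the semantics only and makes no claim about speed.

```python
def contains_identical(obj, seq):
    """Return True if some element of seq is the very same object as obj."""
    return any(item is obj for item in seq)


a = [1, 2]
b = [1, 2]            # equal to a, but a distinct object
items = [a, "spam"]

print(b in items)                     # True: `in` accepts an equal element
print(contains_identical(b, items))   # False: b itself is not stored
print(contains_identical(a, items))   # True: a itself is stored
```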
It appears that a plain for loop is fastest; see the attached script below. On my system, it gives
m1         0   0.00      5000   0.29      9999   0.60       1.0   0.61
m2         0   0.60      5000   0.61      9999   0.62       1.0   0.62
m3         0   1.81      5000   1.81      9999   1.81       1.0   1.83
m4         0   0.00      5000   1.54      9999   3.11       1.0   3.17
Although my experience says that the equality case is the most common, I wonder whether some direct support for the identity case isn't worthwhile: it is rare, but when you need it you typically want speed.
In Smalltalk, such things would be done in specialized containers. E.g. the IdentityDictionary is a dictionary where keys are considered equal only if identical. Likewise, you could have a specialized list type. OTOH, if you need speed, just write an extension module; writing an identical_in function is straightforward.
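The specialized list type could look roughly like the following pure-Python sketch. The class name `IdentityList` is hypothetical, the code assumes present-day Python, and it is meant as an illustration of the idea, not as the fast extension-module variant.

```python
class IdentityList(list):
    """A list whose `in` test matches by identity instead of equality.

    Hypothetical name, loosely modelled on the Smalltalk idea of
    identity-based containers.
    """

    def __contains__(self, obj):
        # Plain loop over the elements, comparing with `is` only.
        for item in self:
            if item is obj:
                return True
        return False


key = [1, 2]
lookalike = [1, 2]                      # equal to key, but another object
container = IdentityList([key, "spam"])

print(key in container)                 # True
print(lookalike in container)           # False
print(lookalike in list(container))     # True: a plain list uses equality
```

Only the `in` test changes in this sketch; inherited methods such as `index`, `count` and `remove` still compare by equality.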
I'd hesitate to add identical_in to the API, since it would then need to be supported for every container, the same way sq_contains works now.
import time

x = range(10000)
rep = [None] * 100

# First, middle, last, and an absent value (the 0 / 5000 / 9999 / 1.0 columns in the timings)
values = x[0], x[5000], x[-1], 1.0

# m1: plain for loop with early exit
def m1(val, rep=rep, x=x):
    for r in rep:
        found = 0
        for s in x:
            if s is val:
                found = 1
                break

# m2: list comprehension (always scans the whole list)
def m2(val, rep=rep, x=x):
    for r in rep:
        found = [s for s in x if s is val]

# m3: filter with a helper function
def m3(val, rep=rep, x=x):
    for r in rep:
        def identical(elem):
            return elem is val
        found = filter(identical, x)

# m4: wrapper object whose __eq__ tests identity, used with the "in" operator
class Contains:
    def __init__(self, val):
        self.val = val
    def __eq__(self, other):
        return self.val is other

def m4(val, rep=rep, x=x):
    for r in rep:
        found = Contains(val) in x

for options in [m1, m2, m3, m4]:
    print options.__name__,
    for val in values:
        start = time.time()
        options(val)
        end = time.time()
        print "%9s %6.2f" % (val,end-start),
    print