[Python-Dev] object equality vs identity, in and dicts idioms and speed
Martin v. Loewis
martin@v.loewis.de
Thu, 3 Jan 2002 23:56:38 +0100
> Now what is the fastest idiom equivalent to:
>
> obj in list
>
> when I'm interested in identity (is) and not equality?
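It appears that a plain for loop is fastest; see the script at the
end of this message. On my system, it gives the following (one row
per method; each pair is the value searched for and the seconds taken
for 100 searches of a 10000-element list):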
m1         0   0.00      5000   0.29      9999   0.60       1.0   0.61
m2         0   0.60      5000   0.61      9999   0.62       1.0   0.62
m3         0   1.81      5000   1.81      9999   1.81       1.0   1.83
m4         0   0.00      5000   1.54      9999   3.11       1.0   3.17
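As a rough illustration of the loop that m1 times, here is a minimal sketch written in present-day Python 3 syntax rather than the Python 2 of the script in this message. The name identical_in is borrowed from this thread; it is a hypothetical helper, not an existing built-in.

```python
def identical_in(obj, seq):
    """Return True if some element of seq is the very same object as obj."""
    for item in seq:
        if item is obj:      # identity test, not ==
            return True
    return False


a = [1, 2]
b = [1, 2]                   # equal to a, but a distinct object
items = [a, "x", None]

print(b in items)                # True: `in` falls back on ==
print(identical_in(b, items))    # False: b is not the same object as a
print(identical_in(a, items))    # True
```

The early return plays the same role as the break in m1: the scan stops as soon as the identical element is found.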
> Although my experience says that the equality case is the most
> common, I wonder whether some direct support for the identity case
> isn't worthwhile, because it is rare but typically then you would
> like some speed.
In Smalltalk, such things would be done in specialized
containers. E.g. the IdentityDictionary is a dictionary where keys are
considered equal only if identical. Likewise, you could have a
specialized list type. OTOH, if you need speed, just write an
extension module; writing an identical_in function is straightforward.
I'd hesitate to add identical_in to the API, since it would then need
to be supported by every container, the same way sq_contains is
now.
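To make the specialized-container idea concrete, here is a minimal pure-Python sketch of an identity-keyed mapping, loosely modelled on the Smalltalk IdentityDictionary mentioned above. The class name IdentityDict and its layout are assumptions made for illustration (Python 3 syntax); this is not an existing standard-library type.

```python
class IdentityDict:
    """Mapping whose keys match only when they are the same object.

    Hypothetical illustration: entries are keyed on id(key). The key is
    stored next to the value so it stays alive and its id is not reused.
    """

    def __init__(self):
        self._data = {}                  # id(key) -> (key, value)

    def __setitem__(self, key, value):
        self._data[id(key)] = (key, value)

    def __getitem__(self, key):
        try:
            return self._data[id(key)][1]
        except KeyError:
            raise KeyError(key) from None

    def __contains__(self, key):
        return id(key) in self._data

    def __len__(self):
        return len(self._data)


k1 = [1, 2]
k2 = [1, 2]                # equal to k1, but a different object
d = IdentityDict()
d[k1] = "first"

print(k1 in d)             # True
print(k2 in d)             # False
```

Because lookups go through id(), the keys need not be hashable, and membership costs one dictionary lookup instead of a linear scan.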
Regards,
Martin
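import time

x = range(10000)
rep = [None] * 100                     # each method runs 100 times per value
values = x[0], x[5000], x[-1], 1.0     # first, middle, last, never identical

def m1(val, rep=rep, x=x):
    # plain for loop, stopping at the first identical element
    for r in rep:
        found = 0
        for s in x:
            if s is val:
                found = 1
                break

def m2(val, rep=rep, x=x):
    # list comprehension; always scans the whole list
    for r in rep:
        found = [s for s in x if s is val]

def m3(val, rep=rep, x=x):
    # filter() with a nested function; always scans the whole list
    for r in rep:
        def identical(elem):
            return elem is val
        found = filter(identical, x)

class Contains:
    # wrapper whose == is an identity test, so that "in" can be reused
    def __init__(self, val):
        self.val = val
    def __eq__(self, other):
        return self.val is other

def m4(val, rep=rep, x=x):
    for r in rep:
        found = Contains(val) in x

# time every method against every value; one output row per method
for options in [m1, m2, m3, m4]:
    print options.__name__,
    for val in values:
        start = time.time()
        options(val)
        end = time.time()
        print "%9s %6.2f" % (val,end-start),
    print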
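
The script above relies on Python 2 behaviour (print statements, and range and filter returning lists). For anyone wanting to repeat the comparison on a current interpreter, the following is a rough Python 3 sketch using the standard timeit module. It is not a line-for-line port: the names loop_scan, any_scan, wrapper_scan and bench are assumptions, any_scan is a substitute for the list-comprehension and filter variants, and the list holds plain object() instances instead of integers. No timings are claimed for it.

```python
import timeit


def loop_scan(val, seq):
    # counterpart of m1: explicit loop with early exit
    for item in seq:
        if item is val:
            return True
    return False


def any_scan(val, seq):
    # generator-based variant (not in the original script)
    return any(item is val for item in seq)


class Contains:
    """Wrapper whose == means "is the same object as the wrapped value"."""

    def __init__(self, val):
        self.val = val

    def __eq__(self, other):
        return self.val is other


def wrapper_scan(val, seq):
    # counterpart of m4: reuse `in` through the wrapper's __eq__
    return Contains(val) in seq


def bench(n=10_000, repeat=100):
    """Print seconds per method for a hit at the start, middle, end, and a miss."""
    data = [object() for _ in range(n)]
    targets = {"first": data[0], "middle": data[n // 2],
               "last": data[-1], "absent": object()}
    for func in (loop_scan, any_scan, wrapper_scan):
        for label, val in targets.items():
            secs = timeit.timeit(lambda: func(val, data), number=repeat)
            print(f"{func.__name__:13s} {label:7s} {secs:8.4f}")
```

Calling bench() prints one line per method and target position; smaller n and repeat values give a quick smoke test.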