[Python-Dev] object equality vs identity, in and dicts idioms and speed

Samuele Pedroni Samuele Pedroni" <pedroni@inf.ethz.ch
Fri, 4 Jan 2002 01:50:19 +0100


[Martin v. Loewis]
> > Now what is the fastest idiom equivalent to:
> > 
> > obj in list
> > 
> > when I'm interested in identity (is) and not equality?
> 
> It appears that doing a plain for loop is fastest, see the attached
> script below. On my system,it gives
> 
> m1         0   0.00      5000   0.29      9999   0.60       1.0   0.61
> m2         0   0.60      5000   0.61      9999   0.62       1.0   0.62
> m3         0   1.81      5000   1.81      9999   1.81       1.0   1.83
> m4         0   0.00      5000   1.54      9999   3.11       1.0   3.17

Thanks, and, sorry I could have done the measuraments myself
but I supposed that maybe someone should already know.
The result makes also sense, is the version that does less consing
and calling user python functions. But only profiling knows
the truth <wink>.

> 
> > Although my experience say that the equality case is the most
> > common, I wonder whether some directy support for the identity case
> > isn't worth, because it is rare but typically then you would like
> > some speed.
> 
> In Smalltalk, such things would be done in specialized
> containers. E.g. the IdentityDictionary is a dictionary where keys are
> considered equal only if identical. Likewise, you could have a
> specialized list type. OTOH, if you need speed, just write an
> extension module - doing a identical_in function is straight-forward.

Is not really my code, but yes writing an extension (especially in
jython) would be not too difficult but see below.
 
> I'd hesitate to add identical_in to the API, since it would mean that
> it needs to be supported for any container, the same sq_contains works
> now.

I see the problem. Implicitly I was asking whether adding  builtin-in
identity_list, identity_dict and corresponding weak versions for the dicts
could make sense or is just code bloat.

The context  is anygui (www.anygui.org), I'm following it closely
and I try to help with jython/swing issues.

Anygui has code like this in the event handling logic:

    source_stack.append(id(source))
    try:
      ...  
                      if not loop and not r.loop \
                           and id(obj) in source_stack: continue
     ...   
    finally:
        source_stack.pop()

Now this is a nice idiom and workarounds the identity-list problem,
but mmh ... id is broken under jython (that means
different objects can get the same id :( ) , also in 2.1 final.
 It is a long-standing bug
and yes we are about to solve it but there is a trade-off and jython
id will become precise but many times slower wrt to CPython version
(we need to implement a weak-key-table :( ).

An identity_list would make for a portable idiom with comparable
overhead and will give to the identity case somehow the same speed
of the equality case...

And further anygui shows also a possible need for a WeakKeyIdentityDict...

regards.