expression in an if statement

Wed Aug 18 18:12:31 EDT 2010

On Wednesday 18 August 2010, it occurred to John Nagle to exclaim:
> On 8/18/2010 11:24 AM, ernest wrote:
> > Hi,
> > 
> > In this code:
> > 
> > if set(a).union(b) == set(a): pass
> > 
> > Does Python compute set(a) twice?
> 
>     CPython does.  Shed Skin might optimize.  Don't know
> about IronPython.

I doubt any actual Python implementation optimizes this -- how could it? The 
object "set" is clearly being called twice, and it happens to be called with 
the object "a" as a sole argument twice. What if "set" has side effects? A 
compiler could only exclude this possibility if it knew exactly what "set" 
will be at run time, which in general it can't, because the name can be 
rebound at any time.
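
To make the side-effect point concrete, here is a small sketch. The names 
make_counting_set, counting_set and namespace are made up for illustration; 
the expression is evaluated with eval() in a dict where "set" is bound to a 
stand-in that records every call.

```python
def make_counting_set():
    """Build a stand-in for `set` that records each call (hypothetical helper)."""
    calls = []

    def counting_set(iterable):
        items = list(iterable)
        calls.append(items)   # the side effect an optimizer would have to preserve
        return set(items)     # here `set` is still the builtin

    return counting_set, calls


counting_set, calls = make_counting_set()

# Only inside this namespace does the name "set" mean counting_set.
namespace = {"set": counting_set, "a": [1, 2, 3], "b": [2, 3]}
result = eval("set(a).union(b) == set(a)", namespace)

print(result, len(calls))   # True 2
```

If the second call were folded into the first, the recorded call count would 
drop to 1, which is a visible change in behaviour.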

I expect that "set" and "a" have to be looked up twice, actually: 
"set(a).union(b)" might rebind either one of them. This would be considered a 
very rude and inappropriate thing to do, but Python usually guarantees to 
allow bad taste and behaviour.
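
A sketch of that kind of rudeness, assuming a made-up class RudeIterable whose 
__iter__ rebinds the name "a" while union() is iterating over it. Again the 
expression runs in an explicit namespace dict via eval().

```python
class RudeIterable:
    """Hypothetical iterable that rebinds the name "a" while being iterated."""

    def __init__(self, namespace, items):
        self.namespace = namespace
        self.items = items

    def __iter__(self):
        # Rebind "a" behind the caller's back, between the two lookups of "a".
        self.namespace["a"] = []
        return iter(self.items)


expression = "set(a).union(b) == set(a)"

polite = {"a": [1, 2, 3], "b": [2, 3]}
polite_result = eval(expression, polite)

rude = {"a": [1, 2, 3]}
rude["b"] = RudeIterable(rude, [2, 3])
rude_result = eval(expression, rude)

print(polite_result, rude_result)   # True False
```

An implementation that reused the first set(a) would report True in the rude 
case as well, so it would no longer be evaluating the expression as written.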

I might be wrong on some points here, but this is what I expect the expression 
(set(a).union(b) == set(a)) has to do, in any conforming implementation of 
Python. Please correct me if I'm wrong.

  1. find out which object "set" refers to
  2. find out which object "a" refers to
  3. call (set) with the single positional argument (a), no keyword arguments
  4. get the attribute "union" of the return value of [3]
  5. find out which object "b" refers to
  6. call (.union) with the single positional argument (b).
  7. find out which object "set" refers to
  8. find out which object "a" refers to
  9. call (set) with the single positional argument (a), no keyword arguments
 10. look up __eq__ in the __class__ of the return value of [6]
 11. call [10] with two positional arguments: the return values [6] & [9]
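
One way to check the name lookups in the list above, at least for CPython, is 
to look at the bytecode. This is only a sketch: opcode names differ between 
versions, and I am assuming that top-level names in an "eval"-mode code object 
are loaded with LOAD_NAME, which holds for the CPython 3.x releases I know of.

```python
import dis

code = compile("set(a).union(b) == set(a)", "<expr>", "eval")

# Collect the names fetched by LOAD_NAME, in execution order.
# The attribute "union" is fetched by a different opcode and is not listed.
loads = [
    ins.argval
    for ins in dis.get_instructions(code)
    if ins.opname == "LOAD_NAME"
]

print(loads)
```

Both "set" and "a" should appear twice in the printed list, with "b" in 
between. dis.dis(code) prints the full listing for anyone who wants to see the 
calls and the comparison as well.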

I'm not 100% sure if there are any guarantees as to when (5) is taken care of 
-- what would happen if set(a) or even set(a).__getattr__ changed the global 
"b"?

My list there is obviously referring to Python 3.x, so there is no __cmp__ to 
worry about.
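
On the question about (5): as far as I can tell, the language reference 
promises left-to-right evaluation, so "b" should only be looked up after 
set(a) has been called and .union has been fetched. A quick sketch to test 
this, with made-up names run_demo and rebinding_set, where the stand-in for 
"set" rebinds "b" during the call:

```python
def run_demo():
    """Evaluate set(a).union(b) where calling "set" rebinds "b" (hypothetical setup)."""
    namespace = {"a": [1, 2], "b": [1]}

    def rebinding_set(iterable):
        namespace["b"] = [99]   # rebind "b" while set(a) is being evaluated
        return set(iterable)    # here `set` is still the builtin

    namespace["set"] = rebinding_set
    return eval("set(a).union(b)", namespace)


union_result = run_demo()
print(sorted(union_result))   # [1, 2, 99]
```

The 99 in the result shows that the new binding of "b" is the one that gets 
used, i.e. the lookup happens after the call in step 3.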