[Python-ideas] Object grabbing

Sun May 1 21:29:16 EDT 2016

On Mon, May 02, 2016 at 11:56:00AM +1200, Greg Ewing wrote:

> Suggestions like this have been made before, and the conclusion
> has always been that very little would be gained over using
> a short intermediate name, e.g.
> 
> m = myobject
> m.a()
> m.b()
> m.c()
> m.d = 1
> 
> That's just as readable and almost as easy to type. It also
> has the advantage that you're not restricted to just one
> "easy access" object at a time.

Multiple short, simple statements like the above are not the best 
demonstration of the "one letter variable name" technique, nor of the 
advantage of the "Pascal with" (using) keyword. We should be thinking of 
larger expressions, with multiple references to the same name:

print(myobject.method(myobject.eggs + (myobject.spam or ham)) 
      - myobject.tomato - cheese)

which becomes either:

m = myobject
print(m.method(m.eggs + (m.spam or ham)) - m.tomato - cheese)
del m

or:

using myobject:
    print(.method(.eggs + (.spam or ham)) - .tomato - cheese)

I don't think there's a lot between them, but if the second can be 
significantly more efficient, that might push it over the line.

Why do I `del m` at the end? Because this might be in the global scope, 
not inside a function, and you might not want an extraneous variable 
floating around polluting the namespace.
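
To illustrate the namespace point, here is a minimal sketch. It runs the
temp-variable idiom in a throwaway namespace dict (standing in for a
module's globals) with and without the trailing `del m`. The helper `run`
and the sample attribute values are made up for illustration only.

```python
from types import SimpleNamespace

# The temp-variable idiom, without and with the cleanup step.
source_without_del = "m = myobject\ntotal = m.eggs + m.spam\n"
source_with_del = source_without_del + "del m\n"


def run(source):
    # A fresh dict plays the role of a module's global namespace.
    ns = {"myobject": SimpleNamespace(eggs=1, spam=2)}
    exec(source, ns)
    return ns


leftover = "m" in run(source_without_del)   # temp name is still bound
cleaned = "m" in run(source_with_del)       # temp name has been removed
print(leftover, cleaned)
```

Inside a function the temporary would be a local and vanish on return, so
the `del` only matters at module (or class) level.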

> The only new thing in your proposal is the change to the
> bytecode, and that could be achieved by treating it as
> an optimisation. A sufficiently smart code generator
> could notice that you were repeatedly operating on the same
> object and produce the bytecode you suggest.

No, I don't think it can be. Or rather, the semantics are not the same, 
and the compiler shouldn't choose to optimize the code. Consider:

myobject.a()
myobject.b()

You cannot assume that it is the same myobject in both statements! 
Method a might have rebound the name to something else. Perhaps Victor's 
FAT Python will be smart enough to tell whether or not a global name 
myobject has been changed, but how about:

myobject[1].lookup['key'].property.a()
myobject[1].lookup['key'].property.b()

I don't think any Python compiler is going to be able to look deep 
inside an arbitrarily complex reference and tell whether or not it is 
safe to optimize. But the programmer may be able to explicitly choose 
the optimization.
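
A minimal sketch of why the two forms are not interchangeable. The
classes `First` and `Second` are invented for illustration; the only
point is that `a()` rebinds the global name `myobject` before `b()` is
called.

```python
class First:
    def a(self):
        # Rebind the global name part-way through the sequence of calls.
        global myobject
        myobject = Second()
        return "First.a"

    def b(self):
        return "First.b"


class Second:
    def b(self):
        return "Second.b"


# Plain form: the name is looked up afresh for each statement.
myobject = First()
plain = [myobject.a(), myobject.b()]

# Hoisted form: the object is grabbed once, which is what an automatic
# optimisation (or an explicit temp variable) would do.
myobject = First()
m = myobject
hoisted = [m.a(), m.b()]

print(plain)    # b() runs on the rebound object
print(hoisted)  # b() runs on the original object
```

The two lists differ in their second element, so a compiler that hoisted
the lookup on its own would change the program's behaviour. When the
programmer writes the hoisting explicitly, that change is a deliberate
choice.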

Of course, in this case, they can just as easily factor out the constant 
part and assign it to a temporary short name:

m = myobject[1].lookup['key']
m.property.a()
m.property.b()

# versus

using myobject[1].lookup['key']:
    .property.a()
    .property.b()

so it becomes a matter of two things: taste and performance. In my 
opinion, the two idioms are roughly the same in readability, give or 
take a few concerns about temp variables versus grit on Tim's monitor. 
But the first has to do two name lookups of "m", which the second could
optimize away by fetching the object once. Can the second beat that by a
significant amount? What if there were ten name lookups? Thirty?
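
The lookups in the temp-variable form can at least be counted with the
`dis` module. This is only a sketch: the helper `count_name_loads` is
made up here, the source is compiled but never executed, and the proposed
`using` form cannot be measured at all since no compiler implements it.

```python
import dis


def count_name_loads(source, name):
    # Compile module-level code and count the instructions that load `name`.
    code = compile(source, "<sketch>", "exec")
    return sum(
        1
        for ins in dis.get_instructions(code)
        if ins.opname in ("LOAD_NAME", "LOAD_GLOBAL") and ins.argval == name
    )


temp_form = (
    "m = myobject[1].lookup['key']\n"
    "m.property.a()\n"
    "m.property.b()\n"
)
m_loads = count_name_loads(temp_form, "m")
print(m_loads)  # one load of "m" per statement that uses it
```

Each extra statement that uses "m" adds one more load, so the count grows
linearly with the number of uses. Whether removing those loads would be
measurable in practice is exactly the open question.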

-- 
Steve