Re: [Python-Dev] Possible optimization for LOAD_FAST ?

Jan. 3, 2011

On Sun, Jan 2, 2011 at 9:36 PM, Terry Reedy <tjreedy@udel.edu> wrote:
...
On 1/2/2011 10:18 PM, Guido van Rossum wrote:
...
My proposed way out of this conundrum has been to change the language
semantics slightly so that global names which (a) coincide with a
builtin, and (b) have no explicit assignment to them in the current
module, would be fair game for such optimizations, with the
understanding that the presence of e.g. "len = len" anywhere in the
module (even in dead code!) would be sufficient to disable the
optimization.
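For illustration only, the "no explicit assignment anywhere in the module" condition could be approximated from Python with the standard ast module. This is a rough sketch, not the compiler's actual pass: the names BUILTIN_NAMES and rebound_builtins are made up here, the check looks at every scope rather than only module-level globals, and it ignores bindings made by def, class, import, or function parameters.

```python
import ast
import builtins

# Hypothetical helper names; nothing here is part of CPython's compiler.
BUILTIN_NAMES = frozenset(dir(builtins))


def rebound_builtins(source):
    """Return builtin names that appear as an assignment or deletion target.

    The whole tree is walked, so a target inside dead code still counts.
    """
    tree = ast.parse(source)
    return {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name)
        and isinstance(node.ctx, (ast.Store, ast.Del))
        and node.id in BUILTIN_NAMES
    }


plain = "def size(x):\n    return len(x)\n"
shadowed = plain + "\nif False:\n    len = len\n"

plain_result = rebound_builtins(plain)        # nothing rebound
shadowed_result = rebound_builtins(shadowed)  # 'len' is rebound in dead code
```

Under the proposal, a module like `plain` would be eligible for the optimization on len, while `shadowed` would not, even though the assignment never runs.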
I believe this amounts to saying
1) Python code executes in three scopes (rather than two): global builtin,
modular (misleadingly called global), and local. This much is a possible
viewpoint today.
In fact it is the specification today.
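A small sketch of the three-level lookup as it behaves today. The function names and the explicit namespace dict are only for the demonstration; the dict stands in for a module's globals.

```python
# Source of a tiny stand-in "module"; exec() runs it with `namespace`
# as its globals, so we can add and remove module-level names at will.
source = """
def module_level(x):
    return len(x)          # not local: looked up in globals, then builtins

def local_level(x):
    len = lambda obj: "local"
    return len(x)          # local binding wins
"""

namespace = {}
exec(source, namespace)
module_level = namespace["module_level"]
local_level = namespace["local_level"]

data = [1, 2, 3]
from_builtin = module_level(data)         # no module-level 'len' yet

namespace["len"] = lambda obj: "module"   # shadow the builtin in the module
from_module = module_level(data)
from_local = local_level(data)

del namespace["len"]                      # the builtin is visible again
after_del = module_level(data)
```

The last step is the dynamic behaviour that any compile-time treatment of builtins has to account for: the same call site resolves to different objects over the life of the module.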
...
2) A name that is not an assignment target anywhere -- and that matches a
builtin name -- is treated as a builtin. This is the new part, and it
amounts to a rule for entire modules that is much like the current rule for
separating local and global names within a function. The difference from the
global/local rule would be that unassigned non-builtin names would be left
to runtime resolution in globals.
It would seem that this new rule would simplify the lookup of module
('global') names since if xxx is not in globals, there is no need to look in
builtins. This is assuming that following 'len=len' with 'del len' cannot
'unmodularize' the name.
Actually I would leave the lookup mechanism for names that don't get
special treatment the same -- the only difference would be for
builtins in contexts where the compiler can generate better code
(typically involving a new opcode) based on all the conditions being
met.
...
For the rule to work 'retroactively' within a module as it does within
functions would require a similar preliminary pass.
We actually already do such a pass.
...
So it could not work interactively.
That's fine. We could also disable it automatically when eval() or
exec() is the source of the code.
...
Should batch mode main modules work the same as when
imported?
Yes.
...
Interactive mode could work as it does at present or with slight
modification, which would be that builtin names within functions, if not yet
overridden, also get resolved when the function is compiled.
Interactive mode would just work as it does today.

I would also make a rule saying that 'open' is not treated this way.
It is the only one where I can think of legitimate reasons for
changing the semantics dynamically in a way that is not detectable by
the compiler, assuming it only sees the source code for one module at
a time.
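To make the 'open' case concrete, here is a sketch of the kind of dynamic replacement I have in mind. The names read_config and fake_open are invented for the example; imagine the replacement happening in a different module (a test harness, say) from the one that calls open().

```python
import builtins
import io


def read_config(path):
    # Nothing in this function (or its module) assigns to 'open',
    # so a per-module compiler pass would see an untouched builtin.
    with open(path) as f:
        return f.read()


def fake_open(path, *args, **kwargs):
    # Hypothetical stand-in that never touches the file system.
    return io.StringIO("fake contents of " + path)


original_open = builtins.open
builtins.open = fake_open          # could be done from any other module
try:
    result = read_config("settings.ini")
finally:
    builtins.open = original_open  # always restore the real builtin
```

If calls to open() were bound at compile time, read_config would bypass fake_open and go to the real file system.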

Some things that could be optimized this way: len(x), isinstance(x,
(int, float)), range(10), issubclass(x, str), bool(x), int(x),
hash(x), etc... in general, the less the function does the better a
target for this optimization it is.
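For reference, this is roughly what the compiler emits today for such a call, as seen through the dis module. Opcode names are a CPython implementation detail and differ between versions, so treat the exact listing as an assumption; the function name size is made up.

```python
import dis


def size(x):
    return len(x)


# Collect (opcode name, argument value) pairs for the function body.
ops = [(ins.opname, ins.argval) for ins in dis.get_instructions(size)]

# 'len' is fetched by name at run time (globals first, then builtins).
loads_len_by_name = any(
    name == "LOAD_GLOBAL" and arg == "len" for name, arg in ops
)

if __name__ == "__main__":
    dis.dis(size)
```

The run-time name lookup plus a generic call is the overhead a dedicated opcode would be meant to avoid for cheap builtins like len.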

One more thing: to avoid heisenbugs, I propose that, for any
particular builtin, if this optimization is used anywhere in a module,
it should be used everywhere in that module (except in scopes where
the name has a different meaning). This means that we can tell users
about it and they can observe it without too much of a worry that a
slight change to their program might disable it. (I've seen this with
optimizations in gcc, and it makes performance work tricky.)

Still, it's all academic until someone implements some of the
optimizations. (The rest of the work is all in the docs and in the
users' minds.)

-- 
--Guido van Rossum (python.org/~guido)