[Python-Dev] Possible optimization for LOAD_FAST ?

Mon Jan 3 16:58:50 CET 2011

On Sun, Jan 2, 2011 at 9:36 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 1/2/2011 10:18 PM, Guido van Rossum wrote:
>
>> My proposed way out of this conundrum has been to change the language
>> semantics slightly so that global names which (a) coincide with a
>> builtin, and (b) have no explicit assignment to them in the current
>> module, would be fair game for such optimizations, with the
>> understanding that the presence of e.g. "len = len" anywhere in the
>> module (even in dead code!) would be sufficient to disable the
>> optimization.
>
> I believe this amounts to saying
>
> 1) Python code executes in three scopes (rather than two): global builtin,
> modular (misleadingly called global), and local. This much is a possible
> viewpoint today.

In fact it is the specification today.
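
A minimal sketch of that lookup order, for illustration only. The module
name demo_mod and the function length_of are made up for this example; the
point is that a non-local name is looked up in the module's globals first
and in builtins second, on every call.

```python
import types

# Source of a tiny module whose function uses the builtin 'len'.
SOURCE = """
def length_of(seq):
    return len(seq)
"""

def run_demo():
    # 'demo_mod' is a hypothetical module name used only for this sketch.
    mod = types.ModuleType("demo_mod")
    exec(SOURCE, mod.__dict__)

    results = []
    # No module-level 'len' exists, so the lookup falls through to builtins.
    results.append(mod.length_of("abc"))
    # A module-level binding created later shadows the builtin.
    mod.len = lambda seq: -1
    results.append(mod.length_of("abc"))
    # Deleting the module-level binding exposes the builtin again.
    del mod.len
    results.append(mod.length_of("abc"))
    return results

print(run_demo())
```

Because the binding can appear and disappear at runtime, a compiler that
sees only the function body cannot assume today that len means the builtin.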

> 2) A name that is not an assignment target anywhere -- and that matches a
> builtin name -- is treated as a builtin. This is the new part, and it
> amounts to a rule for entire modules that is much like the current rule for
> separating local and global names within a function. The difference from the
> global/local rule would be that unassigned non-builtin names would be left
> to runtime resolution in globals.
>
> It would seem that this new rule would simplify the lookup of module
> ('global') names since if xxx is not in globals, there is no need to look in
> builtins. This is assuming that following 'len=len' with 'del len' cannot
> 'unmodularize' the name.

Actually I would leave the lookup mechanism for names that don't get
special treatment the same -- the only difference would be for
builtins in contexts where the compiler can generate better code
(typically involving a new opcode) based on all the conditions being
met.

> For the rule to work 'retroactively' within a module as it does within
> functions would require a similar preliminary pass.

We actually already do such a pass.
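
To make the idea of a whole-module pass concrete, here is a rough sketch
using the ast module. It is not how CPython's compiler does it, and the
names bound_names and optimizable_builtins are invented for this example.
It is deliberately conservative: any binding of a name anywhere in the
module, even in dead code or as a function parameter, disqualifies that
name. The exclusion of 'open' follows the rule proposed later in this
message.

```python
import ast
import builtins

def bound_names(tree):
    """Collect every name that is bound or deleted anywhere in the module."""
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, (ast.Store, ast.Del)):
            bound.add(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.alias):
            # 'import a.b' binds 'a'; 'from m import *' is recorded as '*'.
            bound.add((node.asname or node.name).split(".")[0])
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.ExceptHandler) and node.name:
            bound.add(node.name)
    return bound

def optimizable_builtins(source, excluded=frozenset({"open"})):
    """Return builtin names the module reads but never binds (a sketch)."""
    tree = ast.parse(source)
    bound = bound_names(tree)
    if "*" in bound:
        # A star import could bind anything, so give up entirely.
        return set()
    used = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
    }
    return {
        name
        for name in used
        if hasattr(builtins, name) and name not in bound and name not in excluded
    }

example = "def f(x):\n    return len(x) + int(x)\nif False:\n    int = int\n"
print(optimizable_builtins(example))
```

In the example, int is disqualified by the "int = int" line even though
that line can never run, while len remains a candidate.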

> So it could not work interactively.

That's fine. We could also disable it automatically when eval() or
exec() is the source of the code.

> Should batch mode main modules work the same as when
> imported?

Yes.

> Interactive mode could work as it does at present or with slight
> modification, which would be that builtin names within functions, if not yet
> overridden, also get resolved when the function is compiled.

Interactive mode would just work as it does today.

I would also make a rule saying that 'open' is not treated this way.
It is the only one where I can think of legitimate reasons for
changing the semantics dynamically in a way that is not detectable by
the compiler, assuming it only sees the source code for one module at
a time.
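
One plausible scenario, sketched below, is a test harness that swaps out
builtins.open from outside the module under test. This is an assumption
made for illustration; the message does not say which use cases are meant.
The module name app, the function read_config and the file name
settings.cfg are all hypothetical.

```python
import builtins
import io
import types

# Source of a module that calls 'open' without ever assigning to it.
SOURCE = '''
def read_config():
    with open("settings.cfg") as fh:
        return fh.read()
'''

def run_with_fake_open():
    # 'app' is a hypothetical module name used only for this sketch.
    mod = types.ModuleType("app")
    exec(SOURCE, mod.__dict__)

    real_open = builtins.open

    def fake_open(path, *args, **kwargs):
        # Stand-in that serves canned content instead of touching the disk.
        return io.StringIO("debug = true\n")

    # The replacement happens outside the module's source code, so a
    # compiler looking only at SOURCE has no way to see it coming.
    builtins.open = fake_open
    try:
        return mod.read_config()
    finally:
        builtins.open = real_open

print(run_with_fake_open())
```

If open were resolved at compile time, read_config would ignore the
replacement and try to read a real file.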

Some things that could be optimized this way: len(x), isinstance(x,
(int, float)), range(10), issubclass(x, str), bool(x), int(x),
hash(x), etc... in general, the less the function does the better a
target for this optimization it is.
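
For a sense of what happens today, the dis module shows that a call such
as len(x) fetches len through a global lookup each time the function runs.
The helper global_loads and the function count_items are made up for this
sketch, and opcode names are a CPython implementation detail that can
differ between versions.

```python
import dis

def count_items(x):
    return len(x)

def global_loads(func):
    """List the names a function fetches with LOAD_GLOBAL-style opcodes."""
    return [
        ins.argval
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("LOAD_GLOBAL")
    ]

print(global_loads(count_items))
```

For a builtin that does as little work as len, that name lookup is a
noticeable share of the total cost of the call.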

One more thing: to avoid heisenbugs, I propose that, for any
particular builtin, if this optimization is used anywhere in a module,
it should be used everywhere in that module (except in scopes where
the name has a different meaning). This means that we can tell users
about it and they can observe it without too much of a worry that a
slight change to their program might disable it. (I've seen this with
optimizations in gcc, and it makes performance work tricky.)

Still, it's all academic until someone implements some of the
optimizations. (The rest of the work is all in the docs and in the
users' minds.)

-- 
--Guido van Rossum (python.org/~guido)