[Python-ideas] Add specialized bytecode with guards to functions

Wed Oct 21 21:45:50 CEST 2015

On Oct 21, 2015, at 10:35, Random832 <random832 at fastmail.com> wrote:
> 
> Victor Stinner <victor.stinner at gmail.com>
> writes:
>> You are changing the semantic of Python. In Python, you should be able
>> to override builtin functions.
> 
> What I am proposing is a generalized way to add objects that do not have
> a literal to a function's constants pool. The global name within the
> first version of the function is simply used as a placeholder for
> loading the new constant.
> 
> It could just as easily be something like
> 
> def foo():
>   return _1(abc)
> 
> foo = add_consts(foo, {'_1':len})
> 
>> Binding a global name to a local name is a known trick to optimize a
>> function, but you have to write it explicitly, and again you loose the
>> ability to override len.
> 
> This would also have to be written explicitly, and the whole point is to
> lose the ability to override things. Explicitly.
> 
>> Example:
>> 
>> def f(len=len):
>>   return len("abc")
>> 
>> Here len becomes a fast local variable.
> 
> It'd be even faster as a constant, and that way people couldn't call it
> as f(len=whatever) [so an optimizer doesn't have to deal with the
> possibility of people doing that]
> 
>>> Decorators could be included to freeze only built-ins, to freeze all
>>> globals and built-ins, or to have a list of names to exclude. So you
>>> could have @freeze_builtins or @freeze_all_globals for each function
>>> that needs heavy optimization.
>> 
>> I cited "previous attempts" which failed.
>> 
>> What you describe is close to my "readonly" attempt which tried to
>> make module and type namespaces readonly:
>> https://hg.python.org/sandbox/readonly/file/tip/READONLY.txt
> 
> This would apply to the function, not the namespace.
> 
>> Making namespaces read-only simply doesn't work. It's very common to
>> modify namespaces. I added callbacks to disable optimizations when a
>> namespace is modified. But this design doesn't scale: you add a
>> complexity of O(n) when a namespace is modified, even if optimized
>> functions are never called.
> 
> Nothing happens when a namespace is modified, because the function is no
> longer using the namespace in any form. This adds to the cost of
> defining a function, but in most cases (I'm not 100% sure of the
> semantics for nested functions, let alone the implementation) this could
> be done at compile time, so it would be a one-time cost.

Nested functions are just like top-level functions: at compile time, the function body is compiled to a code object (bytecode, a list of free variables, etc.) that gets stored as a hidden constant; at runtime, all that happens is building a function object out of it (and, if def rather than lambda, binding it to a name). The only way nested functions are more costly at runtime is that they're more likely to have free variables that have to get bound to cells, but even that isn't a big deal.

Anyway, if you just assume that free variables can never be optimized out (which seems fair for a first implementation), you should be able to treat nested functions the same as global functions.

If you want to later optimize out free variables bound to constants, you need to add some way to guard the local namespace—which isn't really a dict, but an array in each frame (although you only have to guard the frames that have cells bound to some closure). That seems doable too, but probably worth skipping in the first version.