[Python-ideas] Optimizing builtins

Sat Jan 1 02:52:54 CET 2011

Guido van Rossum wrote:
> [Changed subject *and* list]
> 
>> 2010/12/31 Maciej Fijalkowski <fijall at gmail.com>
>>> How do you know that range is a builtin you're thinking
>>> about and not some other object?
> 
> On Fri, Dec 31, 2010 at 7:02 AM, Cesare Di Mauro
> <cesare.di.mauro at gmail.com> wrote:
>> By a special opcode which could do this work. ]:-)
> 
> That can't be the answer, because then the question would become "how
> does the compiler know it can use the special opcode". This particular
> issue (generating special opcodes for certain builtins) has actually
> been discussed many times before. Alas, given Python's extremely
> dynamic promises it is very hard to do it in a way that is
> *guaranteed* not to change the semantics.

Just tossing ideas out here... pardon me if they've been discussed 
before, but I read the three PEPs you mentioned later (266, 267 and 280) 
and they didn't cover any of this.

I wonder whether we need to make that guarantee? Perhaps we should 
distinguish between "safe" optimizations, like constant folding which 
can't change behaviour, and "unsafe" optimizations which can go wrong 
under (presumably) rare circumstances. The compiler can continue to 
apply whatever safe optimizations it likes, but unsafe optimizations 
must be explicitly asked for by the user. If subtle or not-so-subtle 
bugs occur, well, Python does allow people to shoot themselves in the foot.
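
To make "can go wrong" concrete, here is a minimal sketch of the kind of 
rebinding that an unsafe builtin optimization would silently ignore. The 
names total_length and namespace are invented for the example, and the 
module is simulated with exec() so the demo doesn't clobber the real len.

```python
# Source of a hypothetical module; `total_length` is an invented name.
module_source = '''
def total_length(items):
    return sum(len(item) for item in items)
'''

# Simulate importing the module: its globals live in `namespace`.
namespace = {}
exec(module_source, namespace)
total_length = namespace["total_length"]

# With no global named `len`, the lookup falls through to the builtin.
before = total_length(["ab", "cde"])

# Rebind `len` in the module's globals *after* the function was compiled.
# Current semantics require the function to see the new binding.
namespace["len"] = lambda obj: 0
after = total_length(["ab", "cde"])

print(before, after)
```

A compiler that had replaced len() with a dedicated opcode would give the 
same answer both times, which is exactly the change in semantics being 
discussed.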

There's precedent for this. Both -O and -OO optimization switches 
potentially change behaviour. -O *should* be safe if code only uses 
asserts for assertions, but many people (especially beginners) use 
assert for input checking. If their code breaks under -O they have 
nobody to blame but themselves. Might we not say that -OO will optimize 
access to builtins, and if things break, the solution is not to use -OO?
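
A small sketch of the assert case, for anyone who hasn't been bitten by 
it. It uses the optimize argument of compile() (available in Python 3.2 
and later) to mimic running with and without -O inside one process; 
set_age and load are names invented for the example.

```python
# A function that (mis)uses assert for input checking.
source = '''
def set_age(age):
    assert age >= 0, "age must be non-negative"
    return age
'''

def load(optimize):
    """Compile `source` at the given optimization level and return set_age."""
    namespace = {}
    exec(compile(source, "<demo>", "exec", optimize=optimize), namespace)
    return namespace["set_age"]

checked = load(0)    # like running without -O: asserts are kept
stripped = load(1)   # like running with -O: asserts are removed

try:
    checked(-1)
    kept_assert = False
except AssertionError:
    kept_assert = True

# Under -O the bad input passes straight through.
result_under_O = stripped(-1)

print(kept_assert, result_under_O)
```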

[...]
> Now, *in practice* such manipulations are rare (with the possible
> exception of people replacing open() with something providing hooks
> for e.g. a virtual filesystem) and there is probably some benefit to
> be had. (I expect that the biggest benefit might well be from
> replacing len() with an opcode.) I have in the past proposed to change
> the official semantics of the language subtly to allow such
> optimizations (i.e. recognizing builtins and replacing them with
> dedicated opcodes). There should also be a simple way to disable them,
> e.g. by setting "len = len" at the top of a module, one would be
> signalling that len() is not to be replaced by an opcode. But it
> remains messy and nobody has really gotten very far with implementing
> this. It is certainly not "low-hanging fruit" to do it properly.

Here's another thought... suppose (say) "builtin" became a reserved 
word. builtin.range (for example) would always refer to the built-in 
range, and could be optimized by the compiler. It wouldn't do much for 
the general case of wanting to optimize non-built-in globals, but this 
could be optimized safely:

def f():
    for i in builtin.range(10): builtin.print(i)

while this would keep the current semantics:

def f():
    for i in range(10): print(i)
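
For comparison, the nearest thing available today is going through the 
builtins module (Python 3 spelling). This is only a rough sketch of the 
idea, not the proposal itself: builtins.range is still an ordinary 
attribute lookup at run time, so it sidesteps a module-level shadow but 
gives the compiler no guarantee, since anyone can still assign to 
builtins.range. The functions plain and explicit are invented names, and 
the module is simulated with exec().

```python
module_source = '''
import builtins

def range(n):
    # Module-level name that shadows the builtin.
    return ["shadowed"]

def plain():
    # Finds the shadowing global before the builtin.
    return list(range(3))

def explicit():
    # Bypasses the module-level shadow via the builtins module.
    return list(builtins.range(3))
'''

namespace = {}
exec(module_source, namespace)

plain_result = namespace["plain"]()
explicit_result = namespace["explicit"]()

print(plain_result, explicit_result)
```

A reserved word would differ in that the compiler could resolve 
builtin.range at compile time, rather than trusting that nobody has 
touched the module.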

-- 
Steven