[Python-Dev] Fast access to __builtins__

Raymond Hettinger python@rcn.com
Sat, 29 Mar 2003 02:11:03 -0500


> > The fruit is a bit high.  Doing a full module analysis means
> > deferring the optimization for a second pass after all the code
> > has already been generated.  It's doable, but much harder.
> 
> You're stuck in a one-pass compiler mindset.  We build a parse tree
> for the entire module before we start generating bytecode.  We already
> have tools to do namespace analysis for the entire tree (Jeremy added
> these to implement nested scopes).
 . . .
> > The task is much simpler if it can be known in advance that
> > the substitution is allowed (i.e. a module level switch like:
> > __fastbuiltins__ = True).
> 
> -1000.

With a module-level switch, the -O flag, and the -OO flag all ruled
out, that leaves either namespace analysis of the entire tree or an
approach that doesn't change the bytecode.
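
For reference, the namespace-analysis route can be roughed out in
pure Python with the symtable module: any name that is referenced
somewhere in the module but never bound in it has to resolve through
__builtins__ (assuming nothing pokes extra names into the module's
globals from outside).  Something along these lines (the name
builtin_candidates is purely illustrative, not part of any patch):

    import symtable

    def builtin_candidates(source, filename="<module>"):
        # Names referenced somewhere in the module but never bound in
        # it; any lookup of these must fall through to __builtins__.
        bound = set()
        referenced = set()

        def walk(table):
            at_module_level = table.get_type() == "module"
            for sym in table.get_symbols():
                if at_module_level or sym.is_global():
                    if sym.is_assigned() or sym.is_imported():
                        bound.add(sym.get_name())
                    if sym.is_referenced():
                        referenced.add(sym.get_name())
            for child in table.get_children():
                walk(child)

        walk(symtable.symtable(source, filename, "exec"))
        return referenced - bound

    # e.g. print(builtin_candidates(open("somemodule.py").read()))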

Taking the second approach, I've uploaded a small patch for caching
lookups into the __builtins__ namespace:

       www.python.org/sf/711722

It's not as fast as using LOAD_CONST, but is safe in all but one
extreme case:  calling the function, having an intervening poke
into the __builtins__ module, and then calling the function again.
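
Spelled out at the Python level, that extreme case looks like this
(the patch does its caching down in the eval loop; builtins is
spelled here by its modern name, the module was called __builtin__ at
the time, and the comments describe what the cached lookup would
return, not what an unpatched interpreter does):

    import builtins                # __builtin__ in 2.x

    def f(seq):
        return len(seq)            # len comes from __builtins__, not globals

    print(f("abc"))                # 3; the first call primes the cache

    real_len = builtins.len
    builtins.len = lambda seq: -1  # the intervening poke into __builtins__
    print(f("abc"))                # unpatched: -1; with the cache: still 3
    builtins.len = real_len        # put the real builtin back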

I put the cache lookup in the safest possible place.  It can be made
twice as fast by putting it before the func_globals lookup.  That
works in all cases except one:  calling the function, making an
intervening shadowing global assignment, and then calling the
function again.  This doesn't come up anywhere in the test suite, my
own apps, or apps I've downloaded.  Note that regular shadowing
(before the first function call) continues to work fine.
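
The pattern that would defeat the faster placement is the one below
(the second result shown is what the before-func_globals cache would
return, not what an unpatched interpreter gives):

    def g(seq):
        return len(seq)

    print(g("abc"))          # 3; len comes from __builtins__

    len = lambda seq: 99     # intervening shadowing global assignment
    print(g("abc"))          # unpatched: 99; cache before func_globals: 3
    del len                  # remove the shadow

    # Moving the "len = ..." assignment above the first call is the
    # ordinary shadowing case and works the same either way.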

The bad news is that I've run many timings and found only modest
speed-ups in real code.  It turns out that the access time for
builtins is less significant than the time to call and execute those
builtins.  But every little bit helps.
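
The ratio is easy to see with the timeit module (absolute numbers
will vary by machine; only the relative sizes matter):

    import timeit

    # Resolving the name alone (a globals miss followed by a builtins
    # hit) versus resolving it and then actually calling it.
    lookup = timeit.timeit("len", number=1000000)
    lookup_and_call = timeit.timeit("len('abc')", number=1000000)

    print("lookup only:     %.3f sec" % lookup)
    print("lookup and call: %.3f sec" % lookup_and_call)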


Raymond Hettinger