[Types-sig] Low-hanging fruit: recognizing builtins

M.-A. Lemburg mal@lemburg.com
Wed, 15 Dec 1999 19:14:20 +0100


Guido van Rossum wrote:
> 
> It's always bothered me from a performance point of view that using a
> built-in costs at least two dict lookups, one failing (in the modules'
> globals), one succeeding (in the builtins).  This is done so that you
> can define globals or locals that override the occasional builtin;
> which is good since new Python versions can define new builtins, and
> if you weren't allowed to override builtins this would break old code.
> 
> Here's a way that per-module analysis plus a conservative assumption
> plus an addition to the PVM (Python Virtual Machine) bytecode can
> remove *both* dict lookups for most uses of builtins.
> 
> Per-module analysis can easily detect that there are no global
> variables named "len", say.  In this case, any expression calling
> len() on some object can be transformed into a new bytecode that calls
> PyObjectt_Length() on the object at the top of the stack.  Thus, a
> sequence like
> 
>           LOAD_GLOBAL         0 (len)
>           LOAD_FAST           0 (a)
>           CALL_FUNCTION       1
> 
> can be replaced by
> 
>           LOAD_FAST           0 (a)
>           UNARY_LEN
> 
> which saves one PVM roundtrip and two dictionary lookups, plus the
> argument checking code inside the len() function.
> 
> There are plenty of bytecodes available.
> 
> In addition, we can now optimize common idioms involving builtins, the
> most important one probably
> 
>      for i in range(n): ...
> 
> We lose a tiny bit of dynamic semantics: if some bozo replaces
> __builtin__.len with something that always returns 0, this won't
> affect our module.  Do we care?  I don't.  We don't have to do this
> for *every* builtin; for example __import__() has as explicit
> semantics that you can replace it with something else; for open() I
> would guess that there must be plenty of programs that play tricks
> with it.  But range()?  Or len()?  Or type()?  I don't think anybody
> would care if these were less dynamic.  Note that you can still
> override these easily as globals, it just has to be visible to the
> global analyzer.
> 
> The per-module analysis required is major compared to what's currently
> happening in compile.c (which only checks one function at a time
> looking for assignments to locals) but minor compared to any serious
> type inferencing.  Clearly this does nothing for (ERR), but I bet it
> could speed up the typical Python program with a substantial factor...

I like this :-)

How about also adding caching of globals
which are not modified within the module in locals ? This
would save another cylce or two. The caching would have
to take place during function creation time. I'm currently doing
this by hand which results in ugly code... :-( but faster execution
:-)

Note that interning the builtins as byte codes could be
a security risk when executing in a restricted environment,
though. Some builtin operations might not be allowed and but would
still be available via bytecode.

-- 
Marc-Andre Lemburg
______________________________________________________________________
Y2000:                                                    16 days left
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/