[Types-sig] Low-hanging fruit: recognizing builtins
M.-A. Lemburg
mal@lemburg.com
Wed, 15 Dec 1999 19:14:20 +0100
Guido van Rossum wrote:
>
> It's always bothered me from a performance point of view that using a
> built-in costs at least two dict lookups, one failing (in the modules'
> globals), one succeeding (in the builtins). This is done so that you
> can define globals or locals that override the occasional builtin;
> which is good since new Python versions can define new builtins, and
> if you weren't allowed to override builtins this would break old code.
>
> Here's a way that per-module analysis plus a conservative assumption
> plus an addition to the PVM (Python Virtual Machine) bytecode can
> remove *both* dict lookups for most uses of builtins.
>
> Per-module analysis can easily detect that there are no global
> variables named "len", say. In this case, any expression calling
> len() on some object can be transformed into a new bytecode that calls
> PyObjectt_Length() on the object at the top of the stack. Thus, a
> sequence like
>
> LOAD_GLOBAL 0 (len)
> LOAD_FAST 0 (a)
> CALL_FUNCTION 1
>
> can be replaced by
>
> LOAD_FAST 0 (a)
> UNARY_LEN
>
> which saves one PVM roundtrip and two dictionary lookups, plus the
> argument checking code inside the len() function.
>
> There are plenty of bytecodes available.
>
> In addition, we can now optimize common idioms involving builtins, the
> most important one probably
>
> for i in range(n): ...
>
> We lose a tiny bit of dynamic semantics: if some bozo replaces
> __builtin__.len with something that always returns 0, this won't
> affect our module. Do we care? I don't. We don't have to do this
> for *every* builtin; for example __import__() has as explicit
> semantics that you can replace it with something else; for open() I
> would guess that there must be plenty of programs that play tricks
> with it. But range()? Or len()? Or type()? I don't think anybody
> would care if these were less dynamic. Note that you can still
> override these easily as globals, it just has to be visible to the
> global analyzer.
>
> The per-module analysis required is major compared to what's currently
> happening in compile.c (which only checks one function at a time
> looking for assignments to locals) but minor compared to any serious
> type inferencing. Clearly this does nothing for (ERR), but I bet it
> could speed up the typical Python program with a substantial factor...
I like this :-)
How about also adding caching of globals
which are not modified within the module in locals ? This
would save another cylce or two. The caching would have
to take place during function creation time. I'm currently doing
this by hand which results in ugly code... :-( but faster execution
:-)
Note that interning the builtins as byte codes could be
a security risk when executing in a restricted environment,
though. Some builtin operations might not be allowed and but would
still be available via bytecode.
--
Marc-Andre Lemburg
______________________________________________________________________
Y2000: 16 days left
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/