[Python-ideas] [Python-Dev] Making builtins more efficient

Steven Elliott selliott4 at austin.rr.com
Fri Apr 20 07:30:44 CEST 2007


Thanks for forwarding this.  It took me a while to catch on to the
thread being moved here from python-dev.

On Thu, 2007-02-22 at 14:25 +1300, Greg Ewing wrote:
> Steven Elliott wrote:
> 
> > What I have in mind may be close to what you are suggesting above.
> 
> My idea is somewhat more uniform and general than that.
> 
> For the module dict, you use a special mapping type that
> allows selected items to be accessed by an index as well
> as a name. The set of such names is determined when the
> module's code is compiled -- it's simply the names used
> in that module to refer to globals or builtins.

That sounds like an interesting idea.  How does it differ from PEP 280?:
    http://www.python.org/dev/peps/pep-0280
(assuming PEP 280 isn't what you are describing).  If there is some
place I can read more about this idea, that would be great.

> The first time a given builtin is referenced in the module,
> it will be unbound in the module dict, so it is looked up
> in the usual way and then written into the module dict,
> so it can subsequently be retrieved by index.
> 
> The advantages of this scheme over yours are that it speeds
> access to module-level names as well as builtins, and it
> doesn't require the compiler to have knowledge of a
> predefined set of names.

How does your scheme speed access to module-level names?  Are they
referred to by index?  With your idea would module level names only be
referred to by index internally in the module as global variables
(LOAD_GLOBAL)?  It seems like referring to attributes inside the module
from outside the module (LOAD_ATTR) would require something like a
visioning scheme where the compiler knows in advance knows what version
the module is so that it can get the index right.

Again, if your idea is PEP 280 then my questions in the above paragraph
are answered.

> It does entail a slight semantic change, as changes made
> to a builtin won't be seen by a module that has already
> used that builtin for the first time. But from what Guido
> has said before, it seems he is willing to accept a change
> like that if it will help.

I think slight semantic changes like that are worth it if it buys
greater performance, but I understand the importance of reverse
compatibility as well.

After exploring some of the different ideas for making
globals/bulitins/attributes more efficient it seems to me that there is
an overall tradeoff - How much complexity is justified to avoid hash
table lookups or extra levels of indirection?

For example, PEP 280 avoids a hash table lookup but adds a level of
indirection (the cells point to the value), which is a net performance
gain.  My idea (my last big email) avoids a level of indirection, but
only for a predefined set of names.  And it requires new opcodes.  And
the compiler has to be aware of the predefined set of names.

I have a question about PEP 280, but maybe I'll ask it here since it
seems relevant.

One of the elegant things about the way Python compiles code is that,
for the most part, each function can be compiled independently without
concern for context.  For example, the co_names in the code object for a
function has only the names for that function.  So the co_names for a
given function does not depend on anything outside of that function.

What if PEP 280's proposed co_globals, which is currently only has
globals referenced by that function (similar to co_names), was instead a
pointer to a shared array of globals that was common for the entire
module (one to one with the module's dict)?

I think doing so could avoid a level of indirection, but at the cost of
forcing the compiler to keep track of the indexes of all globals so that
it could get the index right (the index being the "<i>" in the proposed
LOAD_GLOBAL_CELL).  The  celldict might then have indexes into
co_globals instead of cells.  So the cost would be making the compiling
of functions less independent.

-- 
-- 
-----------------------------------------------------------------------
|          Steven Elliott          |      selliott4 at austin.rr.com     |
-----------------------------------------------------------------------





More information about the Python-ideas mailing list