[Python-Dev] Accessing globals without dict lookup
Guido van Rossum
guido@python.org
Fri, 08 Feb 2002 11:50:31 -0500
Inspired by talks by Jeremy and Skip on DevDay, here's a different
idea for speeding up access to globals. It retain semantics but (like
Jeremy's proposal) changes the type of a module's __dict__.
- Let a cell be a really simple PyObject, containing a PyObject
pointer and a cell pointer. Both pointers may be NULL. (It may
have to be called PyGlobalCell since I believe there's already a
PyCell object.) (Maybe it doesn't even have to be an object -- it
could just be a tiny struct.)
- Let a celldict be a mapping that is implemented using a dict of
cells. When you use its getitem method, the PyObject * in the cell
is dereferenced, and if a NULL is found, getitem raises KeyError
even if the cell exists. Using setitem to add a new value creates a
new cell and stores the value there; using setitem to change the
value for an existing key stores the value in the existing cell for
that key. There's a separate API to access the cells.
- We change the module implementation to use a celldict for its
__dict__. The module's getattr and setattr operations now map to
getitem and setitem on the celldict. I think the type of
<module>.__dict__ and globals() is the only backwards
incompatibility.
- When a module is initialized, it gets its __builtins__ from the
__builtin__ module, which is itself a celldict. For each cell in
__builtins__, the new module's __dict__ adds a cell with a NULL
PyObject pointer, whose cell pointer points to the corresponding
cell of __builtins__.
- The compiler generates LOAD_GLOBAL_CELL <i> (and STORE_GLOBAL_CELL
<i> etc.) opcodes for references to globals, where <i> is a small
index with meaning only within one code object like the const index
in LOAD_CONST. The code object has a new tuple, co_globals, giving
the names of the globals referenced by the code indexed by <i>. I
think no new analysis is required to be able to do this.
- When a function object is created from a code object and a celldict,
the function object creates an array of cell pointers by asking the
celldict for cells corresponding to the names in the code object's
co_globals. If the celldict doesn't already have a cell for a
particular name, it creates and an empty one. This array of cell
pointers is stored on the function object as func_cells. When a
function object is created from a regular dict instead of a
celldict, func_cells is a NULL pointer.
- When the VM executes a LOAD_GLOBAL_CELL <i> instruction, it gets
cell number <i> from func_cells. It then looks in the cell's
PyObject pointer, and if not NULL, that's the global value. If it
is NULL, it follows the cell's cell pointer to the next cell, if it
is not NULL, and looks in the PyObject pointer in that cell. If
that's also NULL, or if there is no second cell, NameError is
raised. (It could follow the chain of cell pointers until a NULL
cell pointer is found; but I have no use for this.) Similar for
STORE_GLOBAL_CELL <i>, except it doesn't follow the cell pointer
chain -- it always stores in the first cell.
- There are fallbacks in the VM for the case where the function's
globals aren't a celldict, and hence func_cells is NULL. In that
case, the code object's co_globals is indexed with <i> to find the
name of the corresponding global and this name is used to index the
function's globals dict.
I believe that's it. I think this faithfully implements the current
semantics (where a global can shadow a builtin), but without the need
for any dict lookups when accessing globals, except in cases where an
explicit dict is passed to exec or eval().
Compare this to Jeremy's scheme using dlicts:
http://www.zope.org/Members/jeremy/CurrentAndFutureProjects/FastGlobals
- My approach doesn't require global agreement on the numbering of the
globals; each code object has its own numbering. This avoids the
need for more global analysis, and allows adding code to a module
using exec that introduces new globals without having to fall back
on a less efficient scheme.
- Jeremy's approach might be a teensy bit faster because it may have
to do less work in the LOAD_GLOBAL; but I'm not convinced.
Here's a implementation sketch for cell and celldict. Note in
particular that keys() only returns the keys for which the cell's
objptr is not NULL.
NULL = object() # used as a token
class cell(object):
def __init__(self):
self.objptr = NULL
self.cellptr = NULL
class celldict(object):
def __init__(self):
self.__dict = {} # dict of cells
def getcell(self, key):
c = self.__dict.get(key)
if c is None:
c = cell()
self.__dict[key] = c
return c
def __getitem__(self, key):
c = self.__dict.get(key)
if c is None:
raise KeyError, key
value = c.objptr
if value is NULL:
raise KeyError, key
else:
return value
def __setitem__(self, key, value):
c = self.__dict.get(key)
if c is None:
c = cell()
self.__dict[key] = c
c.objptr = value
def __delitem__(self, key):
c = self.__dict.get(key)
if c is None or c.objptr is NULL:
raise KeyError, key
c.objptr = NULL
def keys(self):
return [c.objptr for c in self.__dict.keys() if c.objptr is not NULL]
def clear(self):
for c in self.__dict.values():
c.objptr = NULL
# Etc.
--Guido van Rossum (home page: http://www.python.org/~guido/)