[Python-Dev] Speeding up instance attribute access
Guido van Rossum
guido@python.org
Fri, 08 Feb 2002 13:09:27 -0500
Inspired by the second half of Jeremy's talk at DevDay, here's my
alternative approach for speeding up instance attribute access. Like
my idea for globals, it uses double indirection rather than
recompilation.
- We only care about attributes of 'self' (which is identified as the
first argument of a method, not by name). We can exclude from our
analysis any function that assigns to self -- this is extremely rare
and would throw off our analysis. We should also exclude static
methods and class methods, since their first argument doesn't have
the same role.
- Static analysis of the source code of a class (without access to the
base class) can determine attributes of the class, and to some extent
instance variables. Without also analyzing the base classes, this
analysis cannot reliably distinguish between instance variables and
methods inherited from a base class; it can distinguish between
instance variables and methods defined in the current class.
- We can guess the status of un-assigned-to inherited attributes by
seeing whether they are called or not. This is not 100% accurate,
so we need things to work (if slower) even when we guess wrong.
- For instance variable references and stores of the form self.<name>,
the bytecode compiler emits opcodes LOAD_SELF_IVAR <i> and
STORE_SELF_IVAR <i>, where <i> is a small int identifying the
instance variable (ivar). A particular ivar is identified by the
same <i> throughout all methods defined in the same class statement,
but there is no attempt to coordinate this across different classes
related by inheritance.
- It would be nice if we also had a single-opcode way to express a
method call on self, e.g. CALL_SELF_METHOD <i>, <n>, <k> where <i>
identifies the method like above, and <n> and <k> are the number of
positional and keyword arguments. Or maybe we should just have
LOAD_SELF_METHOD <i> which may be able to skip looking in the
instance dict.
- Some data structure describing the mapping from <i> to attribute
name, and whether it's an ivar or a method, is produced by the
compiler and stored in the class __dict__. The function objects
representing methods also contain a pointer to this data structure.
(Or the code objects? But it needs to be shared. Details, details.)
- When a class object is created (at run-time), another data structure
is created that accumulates the <i>-to-name mappings from that class
and all its base classes.
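
To make the points above concrete, here is a rough sketch of the static
analysis and of the run-time accumulation step, written against the
modern ast module rather than the bytecode compiler. Everything named
here is an assumption for illustration only: analyze_class, accumulate,
the "__ivar_table__" key, the (i, kind) tuples, and the choice to number
attributes in sorted order are not part of the proposal. It also detects
static and class methods via decorator syntax, which is a simplification.

```python
import ast
import textwrap


def analyze_class(source):
    """Map each self.<name> seen in the first class of `source` to (i, kind)."""
    tree = ast.parse(textwrap.dedent(source))
    cls = next(n for n in ast.walk(tree) if isinstance(n, ast.ClassDef))
    defined = {n.name for n in cls.body if isinstance(n, ast.FunctionDef)}
    seen, stored, called = set(), set(), set()
    for func in cls.body:
        if not isinstance(func, ast.FunctionDef):
            continue
        params = func.args.posonlyargs + func.args.args
        if not params:
            continue
        decorators = {d.id for d in func.decorator_list if isinstance(d, ast.Name)}
        if decorators & {"staticmethod", "classmethod"}:
            continue  # first argument does not play the role of self
        self_name = params[0].arg  # identified by position, not by name
        nodes = list(ast.walk(func))
        if any(isinstance(n, ast.Name) and n.id == self_name
               and isinstance(n.ctx, ast.Store) for n in nodes):
            continue  # the function assigns to self: leave it out
        call_targets = {id(n.func) for n in nodes if isinstance(n, ast.Call)}
        for n in nodes:
            if (isinstance(n, ast.Attribute) and isinstance(n.value, ast.Name)
                    and n.value.id == self_name):
                seen.add(n.attr)
                if isinstance(n.ctx, ast.Store):
                    stored.add(n.attr)
                elif id(n) in call_targets:
                    called.add(n.attr)
    table = {}
    for i, name in enumerate(sorted(seen)):  # hypothetical numbering scheme
        if name in stored:
            kind = "ivar"      # assigned through self somewhere in the class
        elif name in defined or name in called:
            kind = "method"    # defined here, or guessed because it is called
        else:
            kind = "ivar"      # guess: never called, so probably data
        table[name] = (i, kind)
    return table


def accumulate(cls, key="__ivar_table__"):
    """Collect the per-class tables of cls and its bases, most derived first."""
    return [(klass.__name__, klass.__dict__[key])
            for klass in cls.__mro__ if key in klass.__dict__]


EXAMPLE = '''
class Account(Base):
    def __init__(self, owner):
        self.owner = owner
        self.balance = 0

    def deposit(self, amount):
        self.balance += amount
        self.log(amount)

    def report(this):
        return this.owner, this.limit

    @staticmethod
    def helper(x):
        return x.anything
'''

print(analyze_class(EXAMPLE))
```

In this example 'log' is guessed to be a method because it is called,
while 'limit' is guessed to be an ivar because it is only read; either
guess can be wrong, which is why the slow path has to keep working.
Note that report() names its first argument 'this' and is still
analyzed, and that helper() is skipped entirely.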