[Python-Dev] Instance variable access and descriptors

Aahz aahz at pythoncraft.com
Mon Jun 11 00:37:12 CEST 2007


On Sun, Jun 10, 2007, Eyal Lotem wrote:
> 
> Python, probably through the valid assumption that most attribute
> lookups go to the class, tries to look for the attribute in the class
> first, and in the instance, second.
> 
> What Python currently does is quite peculiar!
> Here's a short description of PyObject_GenericGetAttr:
> 
> A. Python looks for a descriptor in the _entire_ mro hierarchy
> (len(mro) class/type check and dict lookups).
> B. If Python found a descriptor and it has both get and set functions
> - it uses it to get the value and returns, skipping the next stage.
> C. If Python either did not find a descriptor, or found one that has
> no setter, it will try a lookup in the instance dict.
> D. If Python failed to find it in the instance, it will use the
> descriptor's getter, and if it has no getter it will use the
> descriptor itself.
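
Steps A-D can be observed from plain Python. What follows is only a
minimal sketch, assuming Python 3 semantics; the names DataDescr,
NonDataDescr, and C are hypothetical and exist only for illustration.

```python
class DataDescr:
    """Data descriptor: defines both __get__ and __set__ (step B)."""
    def __get__(self, inst, owner=None):
        return "from data descriptor"
    def __set__(self, inst, value):
        raise AttributeError("read-only")

class NonDataDescr:
    """Non-data descriptor: defines only __get__ (steps C and D)."""
    def __get__(self, inst, owner=None):
        return "from non-data descriptor"

class C:
    a = DataDescr()
    b = NonDataDescr()
    c = "plain class attribute"

shadowed = C()
# Write straight into the instance dict, bypassing any __set__.
shadowed.__dict__.update(a="instance a", b="instance b", c="instance c")

bare = C()  # nothing in the instance dict

step_b = shadowed.a   # data descriptor wins over the instance dict
step_c = shadowed.b   # instance dict wins over a getter-only descriptor
step_d = bare.b       # no instance entry, so the getter is used
plain = bare.c        # no getter at all, so the class value itself
```

Here step_b comes from the descriptor even though the instance dict
holds an entry named "a", which is the behavior described in step B.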

Guido, Ping, and I tried working on this at the sprint for PyCon 2003.
We were unable to find any solution that did not affect critical-path
timing.  As other people have noted, the current semantics cannot be
changed.  I'll also echo other people and suggest that this discussion be
moved to python-ideas if you want to continue pushing for a change in
semantics.

I just did a Google for my notes from PyCon 2003 and it appears that I
never sent them out (probably because they aren't particularly
comprehensible).  Here they are for the record (from 3/25/2003):

'''
CACHE_ATTR is the name used to describe a speedup (for new-style classes
only) in attribute lookup by caching the location of attributes in the
MRO.  Some of the non-obvious bits of code:

* If a new-style class has any classic classes in its bases, we
can't do attribute caching (we need weakrefs to the derived
classes).

* If searching the MRO for an attribute discovers a data descriptor (has
tp_descr_set), that overrides any attribute that might be in the instance;
however, the existence of tp_descr_get still permits the instance to
override its bases (but tp_descr_get is called if there is no instance
attribute).

* We need to invalidate the cache for the updated attribute in all derived
classes in the following cases:

    * an attribute is added to or deleted from the class or its base classes

    * an attribute has its status changed to or from being a data
    descriptor

This file uses Python pseudocode to describe the changes necessary to
implement CACHE_ATTR at the C level.  Except for class Meta, these are
all exact descriptions of the work being done, and the changes go into
object.c (Meta goes into typeobject.c).  The pseudocode looks somewhat
C-like to ease the transformation.
'''
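
The two invalidation cases listed in the notes can be seen from
ordinary Python, which shows why a cached lookup result for a derived
class would go stale.  This is a minimal sketch assuming Python 3; Base,
Derived, and GetOnly are hypothetical names.

```python
class Base:
    pass

class Derived(Base):
    pass

d = Derived()
d.__dict__["x"] = "instance x"

# Case 1: an attribute is added to a base class.
missing_before = not hasattr(d, "y")
Base.y = "added to Base"
found_after = d.y          # a cached "not found" for Derived would be stale

# Case 2: an attribute changes to being a data descriptor.
class GetOnly:
    """Non-data descriptor: only __get__ is defined."""
    def __get__(self, inst, owner=None):
        return "from GetOnly"

Base.x = GetOnly()         # non-data: the instance dict still wins
as_non_data = d.x

Base.x = property(lambda self: "from property")  # property defines __set__
as_data = d.x              # data descriptor now overrides the instance
```

In both cases the class that changed is Base, but the visible result
changes for instances of Derived, so any per-class cache has to be
cleared down the subclass tree.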

NULL = object()

def getattr(inst, name):
    isdata, where = lookup(inst.__class__, name)
    if isdata:
        descr = where[name]
        if hasattr(descr, "__get__"):
            return descr.__get__(inst)
        else:
            return descr
    value = inst.__dict__.get(name, NULL)
    if value is not NULL:
        return value
    if where is NULL:
        raise AttributeError(name)
    descr = where[name]
    if hasattr(descr, "__get__"):
        value = descr.__get__(inst)
    else:
        value = descr
    return value

def setattr(inst, name, value):
    isdata, where = lookup(inst.__class__, name)
    if isdata:
        descr = where[name]
        descr.__set__(inst, value)
        return
    inst.__dict__[name] = value

def lookup(cls, name):
    if cls.__cache__ is not NULL:
        pair = cls.__cache__.get(name, NULL)
    else:
        pair = NULL
    # NULL means "no cached entry"; a cached (False, NULL) pair is a valid hit.
    if pair is not NULL:
        return pair
    else:
        for c in cls.__mro__:
            where = c.__dict__
            if name in where:
                descr = where[name]
                isdata = hasattr(descr, "__set__")
                pair = isdata, where
                break
        else:
            pair = False, NULL
    if cls.__cache__ != NULL:
        cls.__cache__[name] = pair
    return pair


'''
These changes go into typeobject.c; they are not a complete
description of what happens during creation/updates, only the
changes necessary to implement CACHE_ATTR.
'''

from types import ClassType

class Meta(type):
    def _invalidate(cls, name):
        if cls.__cache__ is not NULL and name in cls.__cache__:
            del cls.__cache__[name]
        for c in cls.__subclasses__():
            if name not in c.__dict__:
                Meta._invalidate(c, name)
    def _build_cache(cls, bases):
        for base in bases:
            if type(base) is ClassType:
                cls.__cache__ = NULL
                break
        else:
            cls.__cache__ = {}
    def __new__(cls, bases):
        Meta._build_cache(cls, bases)
    def __setbases__(cls, bases):
        Meta._build_cache(cls, bases)
    def __setattr__(cls, name, value):
        if cls.__cache__ != NULL:
            old = cls.__dict__.get(name, NULL)
            wasdata = old != NULL and hasattr(old, "__set__")
            isdata = value != NULL and hasattr(value, "__set__")
            if wasdata != isdata or (old is NULL) != (value is NULL):
                Meta._invalidate(cls, name)
        type.__setattr__(cls, name, value)
    def __delattr__(cls, name):
        Meta.__setattr__(cls, name, NULL)
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"as long as we like the same operating system, things are cool." --piranha

