Mailman 3 PEP 447 (type.__getdescriptor__) - Python-Dev

July 22, 2015

      Hi,

Another summer with another EuroPython, which means it's time again to try to revive PEP 447…

I’ve just pushed a minor update to the PEP and would like to get some feedback on this, arguably fairly esoteric, PEP.

The PEP proposes to replace direct access to the class __dict__ in object.__getattribute__ and super.__getattribute__ by calls to a new special method, to give the metaclass more control over attribute lookup, especially for access using a super() object.  This is needed for classes that cannot store (all) descriptors in the class dict for some reason; see the PEP text for a real-world example of that.

Regards,

  Ronald

The PEP text (with an outdated section with benchmarks removed):

PEP: 447
Title: Add __getdescriptor__ method to metaclass
Version: $Revision$
Last-Modified: $Date$
Author: Ronald Oussoren <ronaldoussoren@mac.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 12-Jun-2013
Post-History: 2-Jul-2013, 15-Jul-2013, 29-Jul-2013, 22-Jul-2015

Abstract
========

Currently ``object.__getattribute__`` and ``super.__getattribute__`` peek
in the ``__dict__`` of classes on the MRO for a class when looking for
an attribute. This PEP adds an optional ``__getdescriptor__`` method to
a metaclass that replaces this behavior and gives more control over attribute
lookup, especially when using a `super`_ object.

That is, the MRO walking loop in ``_PyType_Lookup`` and
``super.__getattribute__`` gets changed from::

     def lookup(mro_list, name):
         for cls in mro_list:
             if name in cls.__dict__:
                 return cls.__dict__[name]

         return NotFound

to::

     def lookup(mro_list, name):
         for cls in mro_list:
             try:
                 return cls.__getdescriptor__(name)
             except AttributeError:
                 pass

         return NotFound

The default implementation of ``__getdescriptor__`` looks in the class
dictionary::

   class type:
       def __getdescriptor__(cls, name):
           try:
               return cls.__dict__[name]
           except KeyError:
               raise AttributeError(name) from None

Rationale
=========

It is currently not possible to influence how the `super class`_ looks
up attributes (that is, ``super.__getattribute__`` unconditionally
peeks in the class ``__dict__``), and that can be problematic for
dynamic classes that can grow new methods on demand.

The ``__getdescriptor__`` method makes it possible to dynamically add
attributes even when looking them up using the `super class`_.

The new method affects ``object.__getattribute__`` (and
`PyObject_GenericGetAttr`_) as well for consistency and to have a single
place to implement dynamic attribute resolution for classes.

Background
----------

The current behavior of ``super.__getattribute__`` causes problems for
classes that are dynamic proxies for other (non-Python) classes or types,
an example of which is `PyObjC`_. PyObjC creates a Python class for every
class in the Objective-C runtime, and looks up methods in the Objective-C
runtime when they are used. This works fine for normal access, but doesn't
work for access with `super`_ objects. Because of this PyObjC currently
includes a custom `super`_ that must be used with its classes, as well as
completely reimplementing `PyObject_GenericGetAttr`_ for normal attribute
access.
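
The problem can be reproduced in current Python without any Objective-C involved. The sketch below uses an instance-level ``__getattr__`` as a stand-in for a proxy that resolves methods on demand (the class and method names are made up for illustration): normal access finds the dynamic method, while access through ``super()`` does not.

```python
class Dynamic:
    def __getattr__(self, name):
        # Stand-in for a lookup in a foreign runtime.
        if name == "greet":
            return lambda: "hello"
        raise AttributeError(name)


class Child(Dynamic):
    def call_direct(self):
        return self.greet()

    def call_super(self):
        # super() only peeks in the class __dict__ of each class on the
        # MRO, so the dynamic method is never found.
        return super().greet()


child = Child()
direct_result = child.call_direct()

try:
    child.call_super()
except AttributeError:
    super_failed = True
else:
    super_failed = False
```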

The API in this PEP makes it possible to remove the custom `super`_ and
simplifies the implementation because the custom lookup behavior can be
added in a central location.

.. note::

   `PyObjC`_ cannot precalculate the contents of the class ``__dict__``
   because Objective-C classes can grow new methods at runtime. Furthermore,
   Objective-C classes tend to contain a lot of methods while most Python
   code will only use a small subset of them, which makes precalculating
   unnecessarily expensive.

The superclass attribute lookup hook
====================================

Both ``super.__getattribute__`` and ``object.__getattribute__`` (or
`PyObject_GenericGetAttr`_ and in particular ``_PyType_Lookup`` in C code)
walk an object's MRO and currently peek in the class' ``__dict__`` to look up
attributes.

With this proposal both lookup methods no longer peek in the class ``__dict__``
but call the special method ``__getdescriptor__``, which is a slot defined
on the metaclass. The default implementation of that method looks
up the name in the class ``__dict__``, which means that attribute lookup is
unchanged unless a metatype actually defines the new special method.

Aside: Attribute resolution algorithm in Python
-----------------------------------------------

The attribute resolution process as implemented by ``object.__getattribute__``
(or ``PyObject_GenericGetAttr`` in CPython's implementation) is fairly
straightforward, but not entirely so without reading C code.

The current CPython implementation of ``object.__getattribute__`` is basically
equivalent to the following (pseudo-)Python code (excluding some housekeeping
and speed tricks)::

    def _PyType_Lookup(tp, name):
        mro = tp.__mro__
        assert isinstance(mro, tuple)

        for base in mro:
            assert isinstance(base, type)

            # PEP 447 will change these lines:
            try:
                return base.__dict__[name]
            except KeyError:
                pass

        return None

    class object:
        def __getattribute__(self, name):
            assert isinstance(name, str)

            tp = type(self)
            descr = _PyType_Lookup(tp, name)

            f = None
            if descr is not None:
                f = descr.__get__
                if f is not None and descr.__set__ is not None:
                    # Data descriptor
                    return f(descr, self, type(self))

            dict = self.__dict__
            if dict is not None:
                try:
                    return self.__dict__[name]
                except KeyError:
                    pass

            if f is not None:
                # Non-data descriptor
                return f(descr, self, type(self))

            if descr is not None:
                # Regular class attribute
                return descr

            raise AttributeError(name)

    class super:
        def __getattribute__(self, name):
            assert isinstance(name, str)

            if name != '__class__':
                starttype = self.__self_type__
                mro = starttype.__mro__

                try:
                    idx = mro.index(self.__thisclass__)

                except ValueError:
                    pass

                else:
                    for base in mro[idx+1:]:
                        # PEP 447 will change these lines:
                        try:
                            descr = base.__dict__[name]
                        except KeyError:
                            continue

                        f = descr.__get__
                        if f is not None:
                            return f(descr,
                                None if (self.__self__ is self.__self_type__) else self.__self__,
                                starttype)

                        else:
                            return descr

            return object.__getattribute__(self, name)

This PEP replaces the dict lookup at the lines marked "# PEP 447" with
a method call that performs the actual lookup, making it possible to affect
that lookup both for normal attribute access and access through the
`super proxy`_.

Note that specific classes can already completely override the default
behaviour by implementing their own ``__getattribute__`` slot (with or without
calling the super class implementation).
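
The precedence rules encoded in the pseudo-code above (data descriptor, then instance ``__dict__``, then non-data descriptor or plain class attribute) can be observed directly in current Python. The class names below are illustrative only.

```python
class NonData:
    """Non-data descriptor: defines __get__ only."""

    def __get__(self, obj, owner=None):
        return "non-data descriptor"


class Data:
    """Data descriptor: defines both __get__ and __set__."""

    def __get__(self, obj, owner=None):
        return "data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")


class C:
    a = Data()
    b = NonData()
    c = "class attribute"


obj = C()
# Write to the instance __dict__ directly to bypass Data.__set__.
obj.__dict__.update(a="instance a", b="instance b")

result_a = obj.a  # the data descriptor wins over the instance __dict__
result_b = obj.b  # the instance __dict__ wins over a non-data descriptor
result_c = obj.c  # a plain class attribute is returned as-is
```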

In Python code
--------------

A meta type can define a method ``__getdescriptor__`` that is called during
attribute resolution by both ``super.__getattribute__``
and ``object.__getattribute__``::

    class MetaType(type):
        def __getdescriptor__(cls, name):
            try:
                return cls.__dict__[name]
            except KeyError:
                raise AttributeError(name) from None

The ``__getdescriptor__`` method has as its arguments a class (which is an
instance of the meta type) and the name of the attribute that is looked up.
It should return the value of the attribute without invoking descriptors,
and should raise `AttributeError`_ when the name cannot be found.

The `type`_ class provides a default implementation for ``__getdescriptor__``,
that looks up the name in the class dictionary.

Example usage
.............

The code below implements a silly metaclass that redirects attribute lookup to
uppercase versions of names::

    class UpperCaseAccess(type):
        def __getdescriptor__(cls, name):
            try:
                return cls.__dict__[name.upper()]
            except KeyError:
                raise AttributeError(name) from None

    class SillyObject(metaclass=UpperCaseAccess):
        def m(self):
            return 42

        def M(self):
            return "fortytwo"

    obj = SillyObject()
    assert obj.m() == "fortytwo"

As mentioned earlier in this PEP, a more realistic use case of this
functionality is a ``__getdescriptor__`` method that dynamically populates the
class ``__dict__`` based on attribute access, primarily when it is not
possible to reliably keep the class dict in sync with its source, for example
because the source used to populate ``__dict__`` is dynamic as well and does
not have triggers that can be used to detect changes to that source.

An example of that are the class bridges in PyObjC: the class bridge is a
Python object (class) that represents an Objective-C class and conceptually
has a Python method for every Objective-C method in the Objective-C class.
As with Python it is possible to add new methods to an Objective-C class, or
replace existing ones, and there are no callbacks that can be used to detect
this.
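
A rough sketch of such a lazily populating hook is shown below. It is not PyObjC's implementation: ``FOREIGN_METHODS`` is a hypothetical stand-in for the Objective-C runtime, ``BridgeMeta`` and ``Widget`` are made-up names, and ``simulated_lookup`` imitates the MRO walk that the interpreter would perform if this PEP were accepted.

```python
# Hypothetical stand-in for a foreign runtime whose method table can change
# at any time without notifying Python.
FOREIGN_METHODS = {"Widget": {"describe": lambda self: "a widget"}}


class BridgeMeta(type):
    def __getdescriptor__(cls, name):
        # Ask the foreign runtime first so new or replaced methods are seen.
        foreign = FOREIGN_METHODS.get(cls.__name__, {})
        if name in foreign:
            setattr(cls, name, foreign[name])  # cache in the class __dict__
        try:
            return cls.__dict__[name]
        except KeyError:
            raise AttributeError(name) from None


class Widget(metaclass=BridgeMeta):
    pass


def simulated_lookup(cls, name):
    """Imitate the proposed MRO walk; current CPython never calls the hook."""
    for base in cls.__mro__:
        hook = getattr(type(base), "__getdescriptor__", None)
        if hook is None:
            if name in vars(base):
                return vars(base)[name]
            continue
        try:
            return hook(base, name)
        except AttributeError:
            continue
    raise AttributeError(name)


widget = Widget()
before = hasattr(widget, "describe")  # the hook has not run yet
descr = simulated_lookup(Widget, "describe")
result = descr.__get__(widget, Widget)()  # bind manually, then call
after = "describe" in Widget.__dict__
```

Once the hook has copied the method into the class ``__dict__``, ordinary attribute access finds it as well.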

In C code
---------

A new slot ``tp_getdescriptor`` is added to the ``PyTypeObject`` struct. This
slot corresponds to the ``__getdescriptor__`` method on `type`_.

The slot has the following prototype::

    PyObject* (*getdescriptorfunc)(PyTypeObject* cls, PyObject* name);

This method should look up *name* in the namespace of *cls*, without looking at
superclasses, and should not invoke descriptors. The method returns ``NULL``
without setting an exception when the *name* cannot be found, and returns a
new reference otherwise (not a borrowed reference).

Use of this hook by the interpreter
-----------------------------------

The new method is required for metatypes and as such is defined on `type`_.
Both ``super.__getattribute__`` and
``object.__getattribute__``/`PyObject_GenericGetAttr`_
(through ``_PyType_Lookup``) use this ``__getdescriptor__`` method when
walking the MRO.

Other changes to the implementation
-----------------------------------

The change for `PyObject_GenericGetAttr`_ will be done by changing the private
function ``_PyType_Lookup``. This currently returns a borrowed reference, but
must return a new reference when the ``__getdescriptor__`` method is present.
Because of this, ``_PyType_Lookup`` will be renamed to ``_PyType_LookupName``;
this will cause compile-time errors for all out-of-tree users of this
private API.

The attribute lookup cache in ``Objects/typeobject.c`` is disabled for classes
that have a metaclass that overrides ``__getdescriptor__``, because using the
cache might not be valid for such classes.

Impact of this PEP on introspection
-----------------------------------

Use of the method introduced in this PEP can affect introspection of classes
with a metaclass that uses a custom ``__getdescriptor__`` method. This section
lists those changes.

The items listed below are only affected by custom ``__getdescriptor__``
methods. The default implementation on ``type`` won't cause problems
because it still only uses the class ``__dict__`` and won't change the
visible behaviour of ``object.__getattribute__``.

* ``dir`` might not show all attributes

  As with a custom ``__getattribute__`` method `dir()`_ might not see all
  (instance) attributes when using the ``__getdescriptor__()`` method to
  dynamically resolve attributes.

  The solution for that is quite simple: classes using ``__getdescriptor__``
  should also implement `__dir__()`_ if they want full support for the builtin
  `dir()`_ function.

* ``inspect.getattr_static`` might not show all attributes

  The function ``inspect.getattr_static`` intentionally does not invoke
  ``__getattribute__`` and descriptors to avoid invoking user code during
  introspection with this function. The ``__getdescriptor__`` method will also
  be ignored and is another way in which the result of ``inspect.getattr_static``
  can be different from that of ``builtins.getattr``.

* ``inspect.getmembers`` and ``inspect.classify_class_attrs``

  Both of these functions directly access the class ``__dict__`` of classes
  along the MRO, and hence can be affected by a custom ``__getdescriptor__``
  method.

  **TODO**: I haven't fully worked out what the impact of this is, and if there
  are mitigations for those using either updates to these functions, or
  additional methods that users should implement to be fully compatible with
  these functions.

  One possible mitigation is to have a custom ``__getattribute__`` for these
  classes that fills ``__dict__`` before returning and defers to the
  default implementation for other attributes.

* Direct introspection of the class ``__dict__``

  Any code that directly accesses the class ``__dict__`` for introspection
  can be affected by a custom ``__getdescriptor__`` method.
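
The first two items can already be seen today with any dynamic attribute mechanism. The sketch below uses an instance-level ``__getattr__`` in place of ``__getdescriptor__`` (which current CPython does not call); the class names and the ``_dynamic`` table are illustrative assumptions. It shows a dynamic attribute that is missing from ``dir()`` until ``__dir__`` is implemented, and that ``inspect.getattr_static`` does not see at all.

```python
import inspect


class Dynamic:
    # Hypothetical table of attributes that are resolved on demand.
    _dynamic = {"answer": 42}

    def __getattr__(self, name):
        try:
            return self._dynamic[name]
        except KeyError:
            raise AttributeError(name) from None


class DynamicWithDir(Dynamic):
    def __dir__(self):
        # Advertise the dynamic names alongside the regular ones.
        return sorted(set(super().__dir__()) | set(self._dynamic))


plain = Dynamic()
listed = DynamicWithDir()

hidden_from_dir = "answer" not in dir(plain)
shown_by_dir = "answer" in dir(listed)
# getattr_static never runs user code, so it returns the default instead.
static_result = inspect.getattr_static(plain, "answer", None)
```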

Performance impact
------------------

**WARNING**: The benchmark results in this section are old, and will be updated
when I've ported the patch to the current trunk. I don't expect significant
changes to the results in this section.

[snipped]

Alternative proposals
---------------------

``__getattribute_super__``
..........................

An earlier version of this PEP used the following static method on classes::

    def __getattribute_super__(cls, name, object, owner): pass

This method performed name lookup as well as invoking descriptors and was
necessarily limited to working only with ``super.__getattribute__``.

Reuse ``tp_getattro``
.....................

It would be nice to avoid adding a new slot, thus keeping the API simpler and
easier to understand.  A comment on `Issue 18181`_ asked about reusing the
``tp_getattro`` slot, that is, ``super`` could call the ``tp_getattro`` slot of
all classes along the MRO.

That won't work because ``tp_getattro`` will look in the instance
``__dict__`` before it tries to resolve attributes using classes in the MRO.
This would mean that using ``tp_getattro`` instead of peeking in the class
dictionaries changes the semantics of the `super class`_.
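
The difference is visible from Python, where ``__getattribute__`` is the Python-level face of ``tp_getattro``. In the sketch below (class names are illustrative), an attribute stored in the instance ``__dict__`` is ignored by ``super()`` but returned when a base class's ``__getattribute__`` is called directly.

```python
class Base:
    def method(self):
        return "Base.method"


class Child(Base):
    def method(self):
        return "Child.method"

    def via_super(self):
        # Peeks only in the class dictionaries after Child on the MRO.
        return super().method()

    def via_getattribute(self):
        # Full attribute resolution: starts at type(self) and also
        # consults the instance __dict__.
        return Base.__getattribute__(self, "method")()


obj = Child()
obj.__dict__["method"] = lambda: "instance attribute"

super_result = obj.via_super()
getattribute_result = obj.via_getattribute()
```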

Alternate placement of the new method
.....................................

This PEP proposes to add ``__getdescriptor__`` as a method on the metaclass.
An alternative would be to add it as a class method on the class itself
(similar to how ``__new__`` is a `staticmethod`_ of the class and not a method
of the metaclass).

The two are functionally equivalent, and there's something to be said for
not requiring the use of a metaclass.

References
==========

* `Issue 18181`_ contains an out of date prototype implementation

Copyright
=========

This document has been placed in the public domain.

.. _`Issue 18181`: http://bugs.python.org/issue18181

.. _`super class`: http://docs.python.org/3/library/functions.html#super

.. _`super proxy`: http://docs.python.org/3/library/functions.html#super

.. _`super`: http://docs.python.org/3/library/functions.html#super

.. _`dir()`: http://docs.python.org/3/library/functions.html#dir

.. _`staticmethod`: http://docs.python.org/3/library/functions.html#staticmethod

.. _`__dir__()`: https://docs.python.org/3/reference/datamodel.html#object.__dir__

.. _`NotImplemented`: http://docs.python.org/3/library/constants.html#NotImplemented

.. _`PyObject_GenericGetAttr`: http://docs.python.org/3/c-api/object.html#PyObject_GenericGetAttr

.. _`type`: http://docs.python.org/3/library/functions.html#type

.. _`AttributeError`: http://docs.python.org/3/library/exceptions.html#AttributeError

.. _`PyObjC`: http://pyobjc.sourceforge.net/

.. _`classmethod`: http://docs.python.org/3/library/functions.html#classmethod
