[Python-Dev] PEP 447 (type.__getdescriptor__)
Steve Dower
Steve.Dower at microsoft.com
Thu Jul 23 14:43:03 CEST 2015
I wonder whether XML RPC might be a good example? After all, it's already in the stdlib and presumably suffers from the same issue.
Cheers,
Steve
Top-posted from my Windows Phone
________________________________
From: Ronald Oussoren<mailto:ronaldoussoren at mac.com>
Sent: 7/23/2015 3:07
To: Dima Tisnek<mailto:dimaqq at gmail.com>
Cc: Python Dev<mailto:python-dev at python.org>
Subject: Re: [Python-Dev] PEP 447 (type.__getdescriptor__)
> On 23 Jul 2015, at 11:29, Dima Tisnek <dimaqq at gmail.com> wrote:
>
> Hey I've taken time to read the PEP, my 2c... actually 1c:
>
> Most important is to explain how this changes behaviour of Python programs.
>
> A comprehensive set of Python examples where behaviour is changed (for
> better or worse) please.
The behaviour of existing Python code is not changed at all. Code that directly looks in the class __dict__ might be impacted, but only when running into classes with a custom __getdescriptor__ method. I’ve listed the code in the stdlib that could be affected, but have to do a new pass of the stdlib to check if anything relevant changed since I wrote the section. In general you run into the same issues when adding a custom __getattribute__ or __getattr__ method, both of which will confuse introspection tools that assume regular attribute lookup semantics.
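To illustrate the last point with something that runs on current Python, here is a minimal sketch using a custom __getattr__ (not the proposed __getdescriptor__, which no released interpreter calls as far as I know). The class name "Proxy" and the attribute "dynamic" are made up for the example; it only shows how static introspection and regular lookup can disagree:

```python
import inspect


class Proxy:
    """Hypothetical class that resolves one attribute dynamically."""

    def __getattr__(self, name):
        # Only called when the regular lookup has already failed.
        if name == "dynamic":
            return "resolved at runtime"
        raise AttributeError(name)


p = Proxy()

# Regular attribute access falls back to __getattr__ and succeeds.
dynamic_value = getattr(p, "dynamic")

# Static introspection never runs __getattr__, so the name is invisible.
sentinel = object()
static_value = inspect.getattr_static(p, "dynamic", sentinel)
in_class_dict = "dynamic" in vars(Proxy)
in_dir = "dynamic" in dir(p)
```

A custom __getdescriptor__ would presumably put tools such as inspect.getattr_static and dir() in the same position: they look at the class __dict__ and would miss names that only the hook can resolve.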
>
> While I understand the concern of "superclasses of objects that gain
> or lose attributes at runtime" on the theoretical level, please
> translate that into actual Python examples.
The primary use case I have for this are classes that are proxies for external systems. There may be other uses as well, but I don’t have examples of that (other than the contrived example in the PEP).
The reason I wrote the PEP in the first place is PyObjC: this project defines a proxy layer between Python and Objective-C, with the goal of making it possible to write programs for Mac OS X in Python while being able to make full use of Apple’s high-level APIs. The proxy layer is almost completely dynamic: proxies for Objective-C classes and their methods are created at runtime by inspecting the Objective-C runtime, optionally with extra annotations (provided by the project) for stuff that cannot be extracted from the runtime.
That is, at runtime PyObjC creates a Python class “NSObject” that corresponds to the Objective-C class “NSObject” as defined by Apple. Every method of the Objective-C class is made available as a method on the Python proxy class as well.
It is not possible to 100% reliably set up the Python proxy class for “NSObject” with all methods because Objective-C classes can grow new methods at runtime, and the introspection API that Apple provides does not have a way to detect this other than by polling. Older versions of PyObjC did poll, but even that was not reliable enough and left a race condition:
    def someAction_(self, sender):
        self.someMethod()
        self.button.setTitle_("click me")
        super().someOtherMethod()
The call to “someMethod” used to poll the Objective-C runtime for changes. The call through super() of someOtherMethod() does not do so because of the current semantics of super (which PEP 447 tries to change). That’s a problem because “self.button.setTitle_” might load a bundle that adds “someOtherMethod” to one of our super classes. That sadly enough is not a theoretical concern, I’ve seen something like this in the past.
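The difference between the two lookup paths can be reproduced without PyObjC. The sketch below is an assumption-laden stand-in: it uses __getattr__ on a made-up class "DynamicBase" to play the role of a method that only becomes available late, which is not how PyObjC actually resolves methods. It does show that access through self can reach the dynamic method while access through super() cannot, because super only inspects the class __dict__ of the classes on the MRO:

```python
class DynamicBase:
    """Hypothetical stand-in for a proxy class whose methods appear late."""

    def __getattr__(self, name):
        # Fallback used by normal attribute access when no class dict
        # on the MRO contains the name.
        if name == "someOtherMethod":
            return lambda: "found dynamically"
        raise AttributeError(name)


class Child(DynamicBase):
    def via_self(self):
        # Normal lookup: ends up in DynamicBase.__getattr__.
        return self.someOtherMethod()

    def via_super(self):
        # super() walks the class dicts directly and never reaches the
        # dynamic fallback, so the lookup fails.
        try:
            return super().someOtherMethod()
        except AttributeError:
            return "super() could not find it"


child = Child()
self_result = child.via_self()
super_result = child.via_super()
```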
Because of this PyObjC contains its own version of builtins.super which must be used with it (and is fully compatible with builtins.super for other classes).
Recent versions of PyObjC no longer poll, primarily because polling is costly and because Objective-C classes tend to have fat APIs most of which is never used by any one program.
What bothers me about PyObjC’s current approach is, on the one hand, that a custom super is inherently incompatible with any other library that might have a similar need, and on the other hand that I have to reimplement all the logic in both object.__getattribute__ and super.__getattribute__ just to customise one small aspect of attribute lookup.
Ronald
>
> d.
>
> On 22 July 2015 at 09:25, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
>> Hi,
>>
>> Another summer with another EuroPython, which means it's time again to try to
>> revive PEP 447…
>>
>> I’ve just pushed a minor update to the PEP and would like to get some
>> feedback on this, arguably fairly esoteric, PEP.
>>
>> The PEP proposes to replace direct access to the class __dict__ in
>> object.__getattribute__ and super.__getattribute__ by calls to a new special
>> method to give the metaclass more control over attribute lookup, especially
>> for access using a super() object. This is needed for classes that cannot
>> store (all) descriptors in the class dict for some reason, see the PEP text
>> for a real-world example of that.
>>
>> Regards,
>>
>> Ronald
>>
>>
>> The PEP text (with an outdated section with benchmarks removed):
>>
>> PEP: 447
>> Title: Add __getdescriptor__ method to metaclass
>> Version: $Revision$
>> Last-Modified: $Date$
>> Author: Ronald Oussoren <ronaldoussoren at mac.com>
>> Status: Draft
>> Type: Standards Track
>> Content-Type: text/x-rst
>> Created: 12-Jun-2013
>> Post-History: 2-Jul-2013, 15-Jul-2013, 29-Jul-2013, 22-Jul-2015
>>
>>
>> Abstract
>> ========
>>
>> Currently ``object.__getattribute__`` and ``super.__getattribute__`` peek
>> in the ``__dict__`` of classes on the MRO for a class when looking for
>> an attribute. This PEP adds an optional ``__getdescriptor__`` method to
>> a metaclass that replaces this behavior and gives more control over
>> attribute
>> lookup, especially when using a `super`_ object.
>>
>> That is, the MRO walking loop in ``_PyType_Lookup`` and
>> ``super.__getattribute__`` gets changed from::
>>
>>     def lookup(mro_list, name):
>>         for cls in mro_list:
>>             if name in cls.__dict__:
>>                 return cls.__dict__[name]
>>
>>         return NotFound
>>
>> to::
>>
>>     def lookup(mro_list, name):
>>         for cls in mro_list:
>>             try:
>>                 return cls.__getdescriptor__(name)
>>             except AttributeError:
>>                 pass
>>
>>         return NotFound
>>
>> The default implementation of ``__getdescriptor__`` looks in the class
>> dictionary::
>>
>>     class type:
>>         def __getdescriptor__(cls, name):
>>             try:
>>                 return cls.__dict__[name]
>>             except KeyError:
>>                 raise AttributeError(name) from None
>>
>> Rationale
>> =========
>>
>> It is currently not possible to influence how the `super class`_ looks
>> up attributes (that is, ``super.__getattribute__`` unconditionally
>> peeks in the class ``__dict__``), and that can be problematic for
>> dynamic classes that can grow new methods on demand.
>>
>> The ``__getdescriptor__`` method makes it possible to dynamically add
>> attributes even when looking them up using the `super class`_.
>>
>> The new method affects ``object.__getattribute__`` (and
>> `PyObject_GenericGetAttr`_) as well for consistency and to have a single
>> place to implement dynamic attribute resolution for classes.
>>
>> Background
>> ----------
>>
>> The current behavior of ``super.__getattribute__`` causes problems for
>> classes that are dynamic proxies for other (non-Python) classes or types,
>> an example of which is `PyObjC`_. PyObjC creates a Python class for every
>> class in the Objective-C runtime, and looks up methods in the Objective-C
>> runtime when they are used. This works fine for normal access, but doesn't
>> work for access with `super`_ objects. Because of this PyObjC currently
>> includes a custom `super`_ that must be used with its classes, as well as
>> completely reimplementing `PyObject_GenericGetAttr`_ for normal attribute
>> access.
>>
>> The API in this PEP makes it possible to remove the custom `super`_ and
>> simplifies the implementation because the custom lookup behavior can be
>> added in a central location.
>>
>> .. note::
>>
>>    `PyObjC`_ cannot precalculate the contents of the class ``__dict__``
>>    because Objective-C classes can grow new methods at runtime. Furthermore
>>    Objective-C classes tend to contain a lot of methods while most Python
>>    code will only use a small subset of them, this makes precalculating
>>    unnecessarily expensive.
>>
>>
>> The superclass attribute lookup hook
>> ====================================
>>
>> Both ``super.__getattribute__`` and ``object.__getattribute__`` (or
>> `PyObject_GenericGetAttr`_ and in particular ``_PyType_Lookup`` in C code)
>> walk an object's MRO and currently peek in the class' ``__dict__`` to look
>> up
>> attributes.
>>
>> With this proposal both lookup methods no longer peek in the class
>> ``__dict__``
>> but call the special method ``__getdescriptor__``, which is a slot defined
>> on the metaclass. The default implementation of that method looks
>> up the name in the class ``__dict__``, which means that attribute lookup is
>> unchanged unless a metatype actually defines the new special method.
>>
>> Aside: Attribute resolution algorithm in Python
>> -----------------------------------------------
>>
>> The attribute resolution process as implemented by
>> ``object.__getattribute__``
>> (or ``PyObject_GenericGetAttr`` in CPython's implementation) is fairly
>> straightforward, but not entirely so without reading C code.
>>
>> The current CPython implementation of ``object.__getattribute__`` is basically
>> equivalent to the following (pseudo-) Python code (excluding some house
>> keeping and speed tricks)::
>>
>>
>>     def _PyType_Lookup(tp, name):
>>         mro = tp.mro()
>>         assert isinstance(mro, tuple)
>>
>>         for base in mro:
>>             assert isinstance(base, type)
>>
>>             # PEP 447 will change these lines:
>>             try:
>>                 return base.__dict__[name]
>>             except KeyError:
>>                 pass
>>
>>         return None
>>
>>
>>     class object:
>>         def __getattribute__(self, name):
>>             assert isinstance(name, str)
>>
>>             tp = type(self)
>>             descr = _PyType_Lookup(tp, name)
>>
>>             f = None
>>             if descr is not None:
>>                 f = descr.__get__
>>                 if f is not None and descr.__set__ is not None:
>>                     # Data descriptor
>>                     return f(descr, self, type(self))
>>
>>             dict = self.__dict__
>>             if dict is not None:
>>                 try:
>>                     return self.__dict__[name]
>>                 except KeyError:
>>                     pass
>>
>>             if f is not None:
>>                 # Non-data descriptor
>>                 return f(descr, self, type(self))
>>
>>             if descr is not None:
>>                 # Regular class attribute
>>                 return descr
>>
>>             raise AttributeError(name)
>>
>>
>>     class super:
>>         def __getattribute__(self, name):
>>             assert isinstance(name, str)
>>
>>             if name != '__class__':
>>                 starttype = self.__self_type__
>>                 mro = starttype.mro()
>>
>>                 try:
>>                     idx = mro.index(self.__thisclass__)
>>
>>                 except ValueError:
>>                     pass
>>
>>                 else:
>>                     for base in mro[idx+1:]:
>>                         # PEP 447 will change these lines:
>>                         try:
>>                             descr = base.__dict__[name]
>>                         except KeyError:
>>                             continue
>>
>>                         f = descr.__get__
>>                         if f is not None:
>>                             return f(descr,
>>                                 None if (self.__self__ is self.__self_type__) else self.__self__,
>>                                 starttype)
>>
>>                         else:
>>                             return descr
>>
>>             return object.__getattribute__(self, name)
>>
>>
>> This PEP should change the dict lookup at the lines starting at "# PEP 447"
>> with
>> a method call to perform the actual lookup, making it possible to affect
>> that
>> lookup both for normal attribute access and access through the `super
>> proxy`_.
>>
>> Note that specific classes can already completely override the default
>> behaviour by implementing their own ``__getattribute__`` slot (with or
>> without
>> calling the super class implementation).
>>
>>
>> In Python code
>> --------------
>>
>> A meta type can define a method ``__getdescriptor__`` that is called during
>> attribute resolution by both ``super.__getattribute__``
>> and ``object.__getattribute__``::
>>
>>     class MetaType(type):
>>         def __getdescriptor__(cls, name):
>>             try:
>>                 return cls.__dict__[name]
>>             except KeyError:
>>                 raise AttributeError(name) from None
>>
>> The ``__getdescriptor__`` method has as its arguments a class (which is an
>> instance of the meta type) and the name of the attribute that is looked up.
>> It should return the value of the attribute without invoking descriptors,
>> and should raise `AttributeError`_ when the name cannot be found.
>>
>> The `type`_ class provides a default implementation for
>> ``__getdescriptor__``,
>> that looks up the name in the class dictionary.
>>
>> Example usage
>> .............
>>
>> The code below implements a silly metaclass that redirects attribute lookup
>> to
>> uppercase versions of names::
>>
>>     class UpperCaseAccess (type):
>>         def __getdescriptor__(cls, name):
>>             try:
>>                 return cls.__dict__[name.upper()]
>>             except KeyError:
>>                 raise AttributeError(name) from None
>>
>>     class SillyObject (metaclass=UpperCaseAccess):
>>         def m(self):
>>             return 42
>>
>>         def M(self):
>>             return "fortytwo"
>>
>>     obj = SillyObject()
>>     assert obj.m() == "fortytwo"
>>
>> As mentioned earlier in this PEP a more realistic use case of this
>> functionality is a ``__getdescriptor__`` method that dynamically populates
>> the
>> class ``__dict__`` based on attribute access, primarily when it is not
>> possible to reliably keep the class dict in sync with its source, for
>> example
>> because the source used to populate ``__dict__`` is dynamic as well and does
>> not have triggers that can be used to detect changes to that source.
>>
>> An example of that are the class bridges in PyObjC: the class bridge is a
>> Python object (class) that represents an Objective-C class and conceptually
>> has a Python method for every Objective-C method in the Objective-C class.
>> As with Python it is possible to add new methods to an Objective-C class, or
>> replace existing ones, and there are no callbacks that can be used to detect
>> this.
>>
>> In C code
>> ---------
>>
>> A new slot ``tp_getdescriptor`` is added to the ``PyTypeObject`` struct,
>> this
>> slot corresponds to the ``__getdescriptor__`` method on `type`_.
>>
>> The slot has the following prototype::
>>
>>     PyObject* (*getdescriptorfunc)(PyTypeObject* cls, PyObject* name);
>>
>> This method should lookup *name* in the namespace of *cls*, without looking
>> at
>> superclasses, and should not invoke descriptors. The method returns ``NULL``
>> without setting an exception when the *name* cannot be found, and returns a
>> new reference otherwise (not a borrowed reference).
>>
>> Use of this hook by the interpreter
>> -----------------------------------
>>
>> The new method is required for metatypes and as such is defined on `type`_.
>> Both ``super.__getattribute__`` and
>> ``object.__getattribute__``/`PyObject_GenericGetAttr`_
>> (through ``_PyType_Lookup``) use this ``__getdescriptor__`` method when
>> walking the MRO.
>>
>> Other changes to the implementation
>> -----------------------------------
>>
>> The change for `PyObject_GenericGetAttr`_ will be done by changing the
>> private
>> function ``_PyType_Lookup``. This currently returns a borrowed reference,
>> but
>> must return a new reference when the ``__getdescriptor__`` method is
>> present.
>> Because of this ``_PyType_Lookup`` will be renamed to
>> ``_PyType_LookupName``,
>> this will cause compile-time errors for all out-of-tree users of this
>> private API.
>>
>> The attribute lookup cache in ``Objects/typeobject.c`` is disabled for
>> classes
>> that have a metaclass that overrides ``__getdescriptor__``, because using
>> the
>> cache might not be valid for such classes.
>>
>> Impact of this PEP on introspection
>> -----------------------------------
>>
>> Use of the method introduced in this PEP can affect introspection of classes
>> with a metaclass that uses a custom ``__getdescriptor__`` method. This
>> section
>> lists those changes.
>>
>> The items listed below are only affected by custom ``__getdescriptor__``
>> methods, the default implementation for ``object`` won't cause problems
>> because that still only uses the class ``__dict__`` and won't cause visible
>> changes to the visible behaviour of the ``object.__getattribute__``.
>>
>> * ``dir`` might not show all attributes
>>
>> As with a custom ``__getattribute__`` method `dir()`_ might not see all
>> (instance) attributes when using the ``__getdescriptor__()`` method to
>> dynamically resolve attributes.
>>
>> The solution for that is quite simple: classes using ``__getdescriptor__``
>> should also implement `__dir__()`_ if they want full support for the
>> builtin
>> `dir()`_ function.
>>
>> * ``inspect.getattr_static`` might not show all attributes
>>
>> The function ``inspect.getattr_static`` intentionally does not invoke
>> ``__getattribute__`` and descriptors to avoid invoking user code during
>> introspection with this function. The ``__getdescriptor__`` method will
>> also
>> be ignored and is another way in which the result of
>> ``inspect.getattr_static``
>> can be different from that of ``builtins.getattr``.
>>
>> * ``inspect.getmembers`` and ``inspect.classify_class_attrs``
>>
>> Both of these functions directly access the class __dict__ of classes
>> along
>> the MRO, and hence can be affected by a custom ``__getdescriptor__``
>> method.
>>
>> **TODO**: I haven't fully worked out what the impact of this is, and if
>> there
>> are mitigations for those using either updates to these functions, or
>> additional methods that users should implement to be fully compatible with
>> these functions.
>>
>> One possible mitigation is to have a custom ``__getattribute__`` for these
>> classes that fills ``__dict__`` before returning and defers to the
>> default implementation for other attributes.
>>
>> * Direct introspection of the class ``__dict__``
>>
>> Any code that directly accesses the class ``__dict__`` for introspection
>> can be affected by a custom ``__getdescriptor__`` method.
>>
>>
>> Performance impact
>> ------------------
>>
>> **WARNING**: The benchmark results in this section are old, and will be
>> updated
>> when I've ported the patch to the current trunk. I don't expect significant
>> changes to the results in this section.
>>
>> [snipped]
>>
>>
>> Alternative proposals
>> ---------------------
>>
>> ``__getattribute_super__``
>> ..........................
>>
>> An earlier version of this PEP used the following static method on classes::
>>
>>     def __getattribute_super__(cls, name, object, owner): pass
>>
>> This method performed name lookup as well as invoking descriptors and was
>> necessarily limited to working only with ``super.__getattribute__``.
>>
>>
>> Reuse ``tp_getattro``
>> .....................
>>
>> It would be nice to avoid adding a new slot, thus keeping the API simpler
>> and
>> easier to understand. A comment on `Issue 18181`_ asked about reusing the
>> ``tp_getattro`` slot, that is super could call the ``tp_getattro`` slot of
>> all
>> methods along the MRO.
>>
>> That won't work because ``tp_getattro`` will look in the instance
>> ``__dict__`` before it tries to resolve attributes using classes in the MRO.
>> This would mean that using ``tp_getattro`` instead of peeking the class
>> dictionaries changes the semantics of the `super class`_.
>>
>> Alternate placement of the new method
>> .....................................
>>
>> This PEP proposes to add ``__getdescriptor__`` as a method on the metaclass.
>> An alternative would be to add it as a class method on the class itself
>> (similar to how ``__new__`` is a `staticmethod`_ of the class and not a
>> method
>> of the metaclass).
>>
>> The two are functionally equivalent, and there's something to be said about
>> not requiring the use of a meta class.
>>
>>
>> References
>> ==========
>>
>> * `Issue 18181`_ contains an out of date prototype implementation
>>
>> Copyright
>> =========
>>
>> This document has been placed in the public domain.
>>
>> .. _`Issue 18181`: http://bugs.python.org/issue18181
>>
>> .. _`super class`: http://docs.python.org/3/library/functions.html#super
>>
>> .. _`super proxy`: http://docs.python.org/3/library/functions.html#super
>>
>> .. _`super`: http://docs.python.org/3/library/functions.html#super
>>
>> .. _`dir()`: http://docs.python.org/3/library/functions.html#dir
>>
>> .. _`staticmethod`:
>> http://docs.python.org/3/library/functions.html#staticmethod
>>
>> .. _`__dir__()`:
>> https://docs.python.org/3/reference/datamodel.html#object.__dir__
>>
>> .. _`NotImplemented`:
>> http://docs.python.org/3/library/constants.html#NotImplemented
>>
>> .. _`PyObject_GenericGetAttr`:
>> http://docs.python.org/3/c-api/object.html#PyObject_GenericGetAttr
>>
>> .. _`type`: http://docs.python.org/3/library/functions.html#type
>>
>> .. _`AttributeError`:
>> http://docs.python.org/3/library/exceptions.html#AttributeError
>>
>> .. _`PyObjC`: http://pyobjc.sourceforge.net/
>>
>> .. _`classmethod`:
>> http://docs.python.org/3/library/functions.html#classmethod
>>
>> _______________________________________________
>> Python-Dev mailing list
>> Python-Dev at python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/dimaqq%40gmail.com
>>