PEP487: Simpler customization of class creation
Hi list,

so this is the next round for PEP 487. During the last round, most of the comments were in the direction that a two step approach for integrating into Python, first in pure Python, later in C, was not a great idea and everything should be in C directly. So I implemented it in C, put it onto the issue tracker here: http://bugs.python.org/issue27366, and also modified the PEP accordingly.

For those who had not been in the discussion, PEP 487 proposes to add two hooks, __init_subclass__ which is a classmethod called whenever a class is subclassed, and __set_owner__, a hook in descriptors which gets called once the class the descriptor is part of is created.

While implementing PEP 487 I realized that there is an oddity in the type base class: type.__init__ forbids to use keyword arguments, even for the usual three arguments it has (name, base and dict), while type.__new__ allows for keyword arguments. As I plan to forward any keyword arguments to the new __init_subclass__, I stumbled over that. As I write in the PEP, I think it would be a good idea to forbid using keyword arguments for type.__new__ as well. But if people think this would be too big of a change, it would be possible to do it differently.

Hoping for good comments

Greetings

Martin

The PEP follows:

PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann@gmail.com>,
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 5-Feb-2016, 24-Jun-2016, 2-Jul-2016
Replaces: 422

Abstract
========

Currently, customising class creation requires the use of a custom metaclass. This custom metaclass then persists for the entire lifecycle of the class, creating the potential for spurious metaclass conflicts.

This PEP proposes to instead support a wide range of customisation scenarios through a new ``__init_subclass__`` hook in the class body, and a hook to initialize attributes.

The new mechanism should be easier to understand and use than implementing a custom metaclass, and thus should provide a gentler introduction to the full power of Python's metaclass machinery.

Background
==========

Metaclasses are a powerful tool to customize class creation. They have, however, the problem that there is no automatic way to combine metaclasses. If one wants to use two metaclasses for a class, a new metaclass combining those two needs to be created, typically manually.

This need often occurs as a surprise to a user: inheriting from two base classes coming from two different libraries suddenly raises the necessity to manually create a combined metaclass, where typically one is not interested in those details about the libraries at all. This becomes even worse if one library starts to make use of a metaclass which it has not done before. While the library itself continues to work perfectly, suddenly all code combining those classes with classes from another library fails.

Proposal
========

While there are many possible ways to use a metaclass, the vast majority of use cases fall into just three categories: some initialization code running after class creation, the initialization of descriptors, and keeping the order in which class attributes were defined.

The first two categories can easily be achieved by having simple hooks into the class creation:

1. An ``__init_subclass__`` hook that initializes all subclasses of a given class.
2. Upon class creation, a ``__set_owner__`` hook is called on all the attributes (descriptors) defined in the class.

The third category is the topic of another PEP, PEP 520.

As an example, the first use case looks as follows::

    >>> class SpamBase:
    ...     # this is implicitly a @classmethod
    ...     def __init_subclass__(cls, **kwargs):
    ...         cls.class_args = kwargs
    ...         super().__init_subclass__(cls, **kwargs)

    >>> class Spam(SpamBase, a=1, b="b"):
    ...     pass

    >>> Spam.class_args
    {'a': 1, 'b': 'b'}

The base class ``object`` contains an empty ``__init_subclass__`` method which serves as an endpoint for cooperative multiple inheritance. Note that this method has no keyword arguments, meaning that all methods which are more specialized have to process all keyword arguments.

This general proposal is not a new idea (it was first suggested for inclusion in the language definition `more than 10 years ago`_, and a similar mechanism has long been supported by `Zope's ExtensionClass`_), but the situation has changed sufficiently in recent years that the idea is worth reconsidering for inclusion.

The second part of the proposal adds an ``__set_owner__`` initializer for class attributes, especially if they are descriptors. Descriptors are defined in the body of a class, but they do not know anything about that class; they do not even know the name they are accessed with. They do get to know their owner once ``__get__`` is called, but still they do not know their name. This is unfortunate: for example, they cannot put their associated value into their object's ``__dict__`` under their name, since they do not know that name. This problem has been solved many times, and is one of the most important reasons to have a metaclass in a library. While it would be easy to implement such a mechanism using the first part of the proposal, it makes sense to have one solution for this problem for everyone.

To give an example of its usage, imagine a descriptor representing weakly referenced values::

    import weakref

    class WeakAttribute:
        def __get__(self, instance, owner):
            # call the stored weak reference to obtain the referent
            return instance.__dict__[self.name]()

        def __set__(self, instance, value):
            instance.__dict__[self.name] = weakref.ref(value)

        # this is the new initializer:
        def __set_owner__(self, owner, name):
            self.name = name

While this example looks very trivial, it should be noted that until now such an attribute cannot be defined without the use of a metaclass.
And given that such a metaclass can make life very hard, this kind of attribute does not exist yet.

Initializing descriptors could simply be done in the ``__init_subclass__`` hook. But this would mean that descriptors can only be used in classes that have the proper hook; the generic version like in the example would not work generally. One could also call ``__set_owner__`` from within the base implementation of ``object.__init_subclass__``. But given that it is a common mistake to forget to call ``super()``, it would happen too often that suddenly descriptors are not initialized.

Key Benefits
============

Easier inheritance of definition time behaviour
-----------------------------------------------

Understanding Python's metaclasses requires a deep understanding of the type system and the class construction process. This is legitimately seen as challenging, due to the need to keep multiple moving parts (the code, the metaclass hint, the actual metaclass, the class object, instances of the class object) clearly distinct in your mind. Even when you know the rules, it's still easy to make a mistake if you're not being extremely careful.

Understanding the proposed implicit class initialization hook only requires ordinary method inheritance, which isn't quite as daunting a task. The new hook provides a more gradual path towards understanding all of the phases involved in the class definition process.

Reduced chance of metaclass conflicts
-------------------------------------

One of the big issues that makes library authors reluctant to use metaclasses (even when they would be appropriate) is the risk of metaclass conflicts. These occur whenever two unrelated metaclasses are used by the desired parents of a class definition. This risk also makes it very difficult to *add* a metaclass to a class that has previously been published without one.
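
To make the conflict described above concrete, here is a minimal sketch. The names ``MetaA``, ``MetaB``, ``FromLibraryA`` and ``FromLibraryB`` are invented for illustration and stand in for two independent libraries; nothing here is taken from the PEP itself.

```python
# Two unrelated metaclasses, standing in for two independent libraries.
class MetaA(type):
    pass


class MetaB(type):
    pass


class FromLibraryA(metaclass=MetaA):
    pass


class FromLibraryB(metaclass=MetaB):
    pass


# Combining the two bases fails, because neither metaclass derives
# from the other and Python cannot pick a common one.
try:
    class Combined(FromLibraryA, FromLibraryB):
        pass
except TypeError as exc:
    conflict = exc
else:
    conflict = None


# The usual manual workaround: a metaclass that inherits from both.
class MetaAB(MetaA, MetaB):
    pass


class CombinedFixed(FromLibraryA, FromLibraryB, metaclass=MetaAB):
    pass
```

The workaround forces the user to know about both libraries' metaclasses, which is the burden the proposal aims to remove for the common cases.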

By contrast, adding an ``__init_subclass__`` method to an existing type poses a similar level of risk to adding an ``__init__`` method: technically, there is a risk of breaking poorly implemented subclasses, but when that occurs, it is recognised as a bug in the subclass rather than the library author breaching backwards compatibility guarantees.

New Ways of Using Classes
=========================

This proposal has many use cases like the following. In the examples, we still inherit from the ``SubclassInit`` base class. This would become unnecessary once this PEP is included in Python directly.

Subclass registration
---------------------

Especially when writing a plugin system, one likes to register new subclasses of a plugin baseclass. This can be done as follows::

    class PluginBase(object):
        subclasses = []

        def __init_subclass__(cls, **kwargs):
            super().__init_subclass__(**kwargs)
            cls.subclasses.append(cls)

In this example, ``PluginBase.subclasses`` will contain a plain list of all subclasses in the entire inheritance tree. One should note that this also works nicely as a mixin class.

Trait descriptors
-----------------

There are many designs of Python descriptors in the wild which, for example, check boundaries of values. Often those "traits" need some support of a metaclass to work.
This is what this would look like with this PEP::

    class Trait:
        def __get__(self, instance, owner):
            return instance.__dict__[self.key]

        def __set__(self, instance, value):
            instance.__dict__[self.key] = value

        def __set_owner__(self, owner, name):
            self.key = name

Implementation Details
======================

For those who prefer reading Python over English, the following is a Python equivalent of the C API changes proposed in this PEP, where the new ``object`` and ``type`` defined here inherit from the usual ones::

    import types

    class type(type):
        def __new__(cls, *args, **kwargs):
            if len(args) == 1:
                return super().__new__(cls, args[0])
            name, bases, ns = args
            init = ns.get('__init_subclass__')
            if isinstance(init, types.FunctionType):
                ns['__init_subclass__'] = classmethod(init)
            self = super().__new__(cls, name, bases, ns)
            for k, v in self.__dict__.items():
                func = getattr(v, '__set_owner__', None)
                if func is not None:
                    func(self, k)
            super(self, self).__init_subclass__(**kwargs)
            return self

        def __init__(self, name, bases, ns, **kwargs):
            super().__init__(name, bases, ns)

    class object:
        @classmethod
        def __init_subclass__(cls):
            pass

    class object(object, metaclass=type):
        pass

In this code, first the ``__set_owner__`` hooks are called on the descriptors, and then ``__init_subclass__`` is called. This means that subclass initializers already see the fully initialized descriptors. This way, ``__init_subclass__`` users can fix all descriptors again if this is needed.

Another option would have been to call ``__set_owner__`` in the base implementation of ``object.__init_subclass__``. This way it would even be possible to prevent ``__set_owner__`` from being called. Most of the time, however, such a prevention would be accidental, as it often happens that a call to ``super()`` is forgotten.
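
The ordering described above (descriptor hooks first, then the subclass initializer) can be observed with a small sketch. It assumes Python 3.6 or later, where the descriptor hook shipped under the name ``__set_name__`` rather than the ``__set_owner__`` used in this draft; ``Recorder``, ``Base``, ``Child`` and ``events`` are names made up for the illustration.

```python
events = []


class Recorder:
    # Released Python versions call this hook __set_name__;
    # the draft discussed in this thread calls it __set_owner__.
    def __set_name__(self, owner, name):
        events.append(("set_name", owner.__name__, name))


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        events.append(("init_subclass", cls.__name__))


class Child(Base):
    field = Recorder()
```

After ``Child`` is created, ``events`` lists the descriptor hook before the subclass hook, so an ``__init_subclass__`` implementation can rely on descriptors already knowing their names.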

Another small change should be noted here: in the current implementation of CPython, ``type.__init__`` explicitly forbids the use of keyword arguments, while ``type.__new__`` allows for its attributes to be shipped as keyword arguments. This is weirdly incoherent, and thus the above code forbids that. While it would be possible to retain the current behavior, it would be better if this was fixed, as it is probably not used at all: the only use case would be that a metaclass calls its ``super().__new__`` with *name*, *bases* and *dict* (yes, *dict*, not *namespace* or *ns* as mostly used with modern metaclasses) as keyword arguments. This should not be done.

As a second change, the new ``type.__init__`` just ignores keyword arguments. Currently, it insists that no keyword arguments are given. This leads to a (wanted) error if one gives keyword arguments to a class declaration if the metaclass does not process them. Metaclass authors that do want to accept keyword arguments must filter them out by overriding ``__init__``.

In the new code, it is not ``__init__`` that complains about keyword arguments, but ``__init_subclass__``, whose default implementation takes no arguments. In a classical inheritance scheme using the method resolution order, each ``__init_subclass__`` may take out its keyword arguments until none are left, which is checked by the default implementation of ``__init_subclass__``.

Rejected Design Options
=======================

Calling the hook on the class itself
------------------------------------

Adding an ``__autodecorate__`` hook that would be called on the class itself was the proposed idea of PEP 422. Most examples work the same way or even better if the hook is called on the subclass. In general, it is much easier to explicitly call the hook on the class in which it is defined (to opt-in to such a behavior) than to opt-out, meaning that one does not want the hook to be called on the class it is defined in.

This becomes most evident if the class in question is designed as a mixin: it is very unlikely that the code of the mixin is to be executed for the mixin class itself, as it is not supposed to be a complete class on its own.

The original proposal also made major changes in the class initialization process, rendering it impossible to back-port the proposal to older Python versions. More importantly, having a pure Python implementation allows us to take two preliminary steps before we actually change the interpreter, giving us the chance to iron out all possible wrinkles in the API.

Other variants of calling the hook
----------------------------------

Other names for the hook were presented, namely ``__decorate__`` or ``__autodecorate__``. This proposal opts for ``__init_subclass__`` as it is very close to the ``__init__`` method, just for the subclass, while it is not very close to decorators, as it does not return the class.

Requiring an explicit decorator on ``__init_subclass__``
--------------------------------------------------------

One could require the explicit use of ``@classmethod`` on the ``__init_subclass__`` decorator. It was made implicit since there's no sensible interpretation for leaving it out, and that case would need to be detected anyway in order to give a useful error message.

This decision was reinforced after noticing that the user experience of defining ``__prepare__`` and forgetting the ``@classmethod`` method decorator is singularly incomprehensible (particularly since PEP 3115 documents it as an ordinary method, and the current documentation doesn't explicitly say anything one way or the other).

A more ``__new__``-like hook
----------------------------

In PEP 422 the hook worked more like the ``__new__`` method than the ``__init__`` method, meaning that it returned a class instead of modifying one. This allows a bit more flexibility, but at the cost of much harder implementation and undesired side effects.

Adding a class attribute with the attribute order
-------------------------------------------------

This got its own PEP 520.

History
=======

This used to be a competing proposal to PEP 422 by Nick Coghlan and Daniel Urban. PEP 422 intended to achieve the same goals as this PEP, but with a different way of implementation. In the meantime, PEP 422 has been withdrawn favouring this approach.

References
==========

.. _more than 10 years ago:
   http://mail.python.org/pipermail/python-dev/2001-November/018651.html

.. _Zope's ExtensionClass:
   http://docs.zope.org/zope_secrets/extensionclass.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

On 2 July 2016 at 10:50, Martin Teichmann <lkb.teichmann@gmail.com> wrote:
Hi list,
so this is the next round for PEP 487. During the last round, most of the comments were in the direction that a two step approach for integrating into Python, first in pure Python, later in C, was not a great idea and everything should be in C directly. So I implemented it in C, put it onto the issue tracker here: http://bugs.python.org/issue27366, and also modified the PEP accordingly.
For those who had not been in the discussion, PEP 487 proposes to add two hooks, __init_subclass__ which is a classmethod called whenever a class is subclassed, and __set_owner__, a hook in descriptors which gets called once the class the descriptor is part of is created.
I'm +1 for this part of the proposal.

One potential documentation issue is that __init_subclass__ adds yet a third special magic method behaviour:

- __new__ is implicitly a static method
- __prepare__ isn't implicitly anything (but in hindsight should have implicitly been a class method)
- __init_subclass__ is implicitly a class method

I think making __init_subclass__ implicitly a class method is still the right thing to do if this proposal gets accepted, we'll just want to see if we can do something to tidy up that aspect of the documentation at the same time.
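
The three behaviours listed above can be inspected directly. This is a small sketch assuming Python 3.6 or later, where ``__init_subclass__`` exists; the class names ``Meta`` and ``Example`` are invented for the illustration.

```python
class Meta(type):
    # __prepare__ is not wrapped implicitly, so the decorator is required.
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return {}


class Example(metaclass=Meta):
    # No decorator here: type.__new__ wraps this in staticmethod.
    def __new__(cls, *args, **kwargs):
        return super().__new__(cls)

    # No decorator here either: type.__new__ wraps this in classmethod.
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)


# Look at the raw class dictionary to see what was actually stored.
new_kind = type(Example.__dict__["__new__"])
hook_kind = type(Example.__dict__["__init_subclass__"])
```

Looking in ``Example.__dict__`` rather than using attribute access avoids the descriptor protocol, so the stored wrapper types are visible.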
While implementing PEP 487 I realized that there is an oddity in the type base class: type.__init__ forbids to use keyword arguments, even for the usual three arguments it has (name, base and dict), while type.__new__ allows for keyword arguments. As I plan to forward any keyword arguments to the new __init_subclass__, I stumbled over that. As I write in the PEP, I think it would be a good idea to forbid using keyword arguments for type.__new__ as well. But if people think this would be too big of a change, it would be possible to do it differently.
I *think* I'm in favour of cleaning this up, but I also think the explanation of the problem with the status quo could stand to be clearer, as could the proposed change in behaviour. Some example code at the interactive prompt may help with that.

Positional arguments already either work properly, or give a helpful error message:

    >>> type("Example", (), {})
    <class '__main__.Example'>
    >>> type.__new__("Example", (), {})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type.__new__(X): X is not a type object (str)
    >>> type.__new__(type, "Example", (), {})
    <class '__main__.Example'>
    >>> type.__init__("Example", (), {})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: descriptor '__init__' requires a 'type' object but received a 'str'
    >>> type.__init__(type, "Example", (), {})

By contrast, attempting to use keyword arguments is a fair collection of implementation defined "Uh, what just happened?":

    >>> type(name="Example", bases=(), dict={})
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type.__init__() takes no keyword arguments
    >>> type.__new__(name="Example", bases=(), dict={}) # Huh?
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type.__new__(): not enough arguments
    >>> type.__new__(type, name="Example", bases=(), dict={})
    <class '__main__.Example'>
    >>> type.__init__(name="Example", bases=(), dict={}) # Huh?
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: descriptor '__init__' of 'type' object needs an argument
    >>> type.__init__(type, name="Example", bases=(), dict={}) # Huh?
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: type.__init__() takes no keyword arguments

I think the PEP could be accepted without cleaning this up, though - it would just mean __init_subclass__ would see the "name", "bases" and "dict" keys when someone attempted to use keyword arguments with the dynamic type creation APIs.

Cheers,
Nick.

--
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
Hi Nick, thanks for the nice review!
I think making __init_subclass__ implicitly a class method is still the right thing to do if this proposal gets accepted, we'll just want to see if we can do something to tidy up that aspect of the documentation at the same time.
I could write some documentation, I just don't know where to put it. I personally have no strong feelings whether __init_subclass__ should be implicitly a @classmethod or not - but as the general consensus here seemed to hint making it implicit is better, this is how I wrote it.
While implementing PEP 487 I realized that there is an oddity in the type base class: type.__init__ forbids to use keyword arguments, even for the usual three arguments it has (name, base and dict), while type.__new__ allows for keyword arguments. As I plan to forward any keyword arguments to the new __init_subclass__, I stumbled over that. As I write in the PEP, I think it would be a good idea to forbid using keyword arguments for type.__new__ as well. But if people think this would be too big of a change, it would be possible to do it differently.
[some discussion cut out]
I think the PEP could be accepted without cleaning this up, though - it would just mean __init_subclass__ would see the "name", "bases" and "dict" keys when someone attempted to use keyword arguments with the dynamic type creation APIs.
Yes, this would be possible, albeit a bit ugly. I'm not so sure whether backwards compatibility is so important in this case. It is very easy to change the code to the fully cleaned up version.

Looking through old stuff I found http://bugs.python.org/issue23722, which describes the following problem: at the time __init_subclass__ is called, super() doesn't work yet for the new class. It does work for __init_subclass__ itself, because it is called on the base class, but not for other classmethods that it calls on the new class. This is a pity, especially because the two argument form of super() cannot be used either, as the new class has no name yet.

The problem is solvable though. The initializations necessary for super() to work properly simply should be moved before the call to __init_subclass__. I implemented that by putting a new attribute into the class's namespace to keep the cell which will later be used by super(). This new attribute would be removed by type.__new__ again, but transiently it would be visible. This technique has already been used for __qualname__.

The issue contains a patch that fixes that behavior, and back in the day you proposed I add the problem to the PEP. Should I?

Greetings

Martin
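
The scenario from issue 23722 can be sketched as follows. This assumes Python 3.6 or later, where zero-argument ``super()`` is already usable in methods of the new class by the time ``__init_subclass__`` runs; ``Base``, ``Child``, ``setup`` and ``tag`` are hypothetical names chosen for the illustration.

```python
class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Calls a classmethod that the new class may have overridden.
        cls.setup()

    @classmethod
    def setup(cls):
        cls.tag = "base"


class Child(Base):
    @classmethod
    def setup(cls):
        # Zero-argument super() needs the __class__ cell of Child to be
        # filled in before Base.__init_subclass__ invokes this method.
        super().setup()
        cls.tag += "+child"
```

If the cell were filled in only after the hook had run, the ``super()`` call inside ``Child.setup`` would fail during class creation, which is the problem the message above describes.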
On Sat, Jul 2, 2016 at 10:50 AM, Martin Teichmann <lkb.teichmann@gmail.com> wrote:
Hi list,
so this is the next round for PEP 487. During the last round, most of the comments were in the direction that a two step approach for integrating into Python, first in pure Python, later in C, was not a great idea and everything should be in C directly. So I implemented it in C, put it onto the issue tracker here: http://bugs.python.org/issue27366, and also modified the PEP accordingly.
Thanks! Reviewing inline below.
For those who had not been in the discussion, PEP 487 proposes to add two hooks, __init_subclass__ which is a classmethod called whenever a class is subclassed, and __set_owner__, a hook in descriptors which gets called once the class the descriptor is part of is created.
While implementing PEP 487 I realized that there is an oddity in the type base class: type.__init__ forbids to use keyword arguments, even for the usual three arguments it has (name, base and dict), while type.__new__ allows for keyword arguments. As I plan to forward any keyword arguments to the new __init_subclass__, I stumbled over that. As I write in the PEP, I think it would be a good idea to forbid using keyword arguments for type.__new__ as well. But if people think this would be too big of a change, it would be possible to do it differently.
This is an area of exceeding subtlety (and also not very well documented/specified, probably). I'd worry that changing anything here might break some code. When a metaclass overrides neither __init__ nor __new__, keyword args will not work because type.__init__ forbids them. However when a metaclass overrides them and calls them using super(), it's quite possible that someone ended up calling super().__init__() with three positional args but super().__new__() with keyword args, since the call sites are distinct (in the overrides for __init__ and __new__ respectively).

What's your argument for changing this, apart from a desire for more regularity?
Hoping for good comments
Greetings
Martin
The PEP follows:
PEP: 487
Title: Simpler customisation of class creation
Version: $Revision$
Last-Modified: $Date$
Author: Martin Teichmann <lkb.teichmann@gmail.com>,
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Feb-2015
Python-Version: 3.6
Post-History: 27-Feb-2015, 5-Feb-2016, 24-Jun-2016, 2-Jul-2016
Replaces: 422

Abstract
========

Currently, customising class creation requires the use of a custom metaclass. This custom metaclass then persists for the entire lifecycle of the class, creating the potential for spurious metaclass conflicts.
This PEP proposes to instead support a wide range of customisation scenarios through a new ``__init_subclass__`` hook in the class body, and a hook to initialize attributes.
The new mechanism should be easier to understand and use than implementing a custom metaclass, and thus should provide a gentler introduction to the full power of Python's metaclass machinery.

Background
==========

Metaclasses are a powerful tool to customize class creation. They have, however, the problem that there is no automatic way to combine metaclasses. If one wants to use two metaclasses for a class, a new metaclass combining those two needs to be created, typically manually.
This need often occurs as a surprise to a user: inheriting from two base classes coming from two different libraries suddenly raises the necessity to manually create a combined metaclass, where typically one is not interested in those details about the libraries at all. This becomes even worse if one library starts to make use of a metaclass which it has not done before. While the library itself continues to work perfectly, suddenly every code combining those classes with classes from another library fails.
Proposal
========

While there are many possible ways to use a metaclass, the vast majority of use cases fall into just three categories: some initialization code running after class creation, the initialization of descriptors, and keeping the order in which class attributes were defined.

The first two categories can easily be achieved by having simple hooks into the class creation:

1. An ``__init_subclass__`` hook that initializes all subclasses of a given class.
2. Upon class creation, a ``__set_owner__`` hook is called on all the attributes (descriptors) defined in the class.

The third category is the topic of another PEP, PEP 520.
As an example, the first use case looks as follows::

    >>> class SpamBase:
    ...     # this is implicitly a @classmethod
    ...     def __init_subclass__(cls, **kwargs):
    ...         cls.class_args = kwargs
    ...         super().__init_subclass__(cls, **kwargs)

    >>> class Spam(SpamBase, a=1, b="b"):
    ...     pass

    >>> Spam.class_args
    {'a': 1, 'b': 'b'}

The base class ``object`` contains an empty ``__init_subclass__`` method which serves as an endpoint for cooperative multiple inheritance. Note that this method has no keyword arguments, meaning that all methods which are more specialized have to process all keyword arguments.
I'm confused. In the above example it would seem that the keyword args {'a': 1, 'b': 2} are passed right on to super().__init_subclass__(). Do you mean that it ignores all keyword args? Or that it has no positional args? (Both of which would be consistent with the example.)
This general proposal is not a new idea (it was first suggested for inclusion in the language definition `more than 10 years ago`_, and a similar mechanism has long been supported by `Zope's ExtensionClass`_), but the situation has changed sufficiently in recent years that the idea is worth reconsidering for inclusion.
Can you state exactly at which point during class initialization __init_subclass__() is called? (Surely by now, having implemented it, you know exactly where. :-)

[This is as far as I got reviewing when the weekend activities interrupted me. In the light of ongoing discussion I'm posting this now -- I'll continue later.]

--
--Guido van Rossum (python.org/~guido)
This is an area of exceeding subtlety (and also not very well documented/specified, probably). I'd worry that changing anything here might break some code. When a metaclass overrides neither __init__ nor __new__, keyword args will not work because type.__init__ forbids them. However when a metaclass overrides them and calls them using super(), it's quite possible that someone ended up calling super().__init__() with three positional args but super().__new__() with keyword args, since the call sites are distinct (in the overrides for __init__ and __new__ respectively).
What's your argument for changing this, apart from a desire for more regularity?
The implementation gets much simpler if __new__ doesn't take keyword arguments. It's simply that if it does, I have to filter out __new__'s three arguments. That's easily done in Python, unfortunately not so much in C.

So we have two options: either type.__new__ is limited to accepting positional arguments only, possibly breaking some code, but which could be changed easily. This leads to a pretty simple implementation: pass over keyword arguments to __init_subclass__, that's it.

The other option is: filter out name, bases and dict from the keyword arguments. If people think that backwards compatibility is that important, I could do that. But that just leaves quite some code in places where there is already a lot of complicated code.

Nick proposed a compromise: just don't filter for name, bases and dict, and pass them over to __init_subclass__. Then the default implementation of __init_subclass__ must support those three keyword arguments and do nothing with them.

I'm fine with all three solutions, although I have a preference for the first. I think passing keyword arguments to type.__new__ is already really rare and if it does exist, it's super easy to fix.
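
For readers wondering what the second option (filtering) would amount to, here is a pure-Python sketch of the idea. It is not the C implementation from the issue tracker; ``TYPE_ARGS`` and ``split_type_kwargs`` are hypothetical names introduced only for this illustration.

```python
# Names of the three arguments that type.__new__ itself consumes.
TYPE_ARGS = ("name", "bases", "dict")


def split_type_kwargs(kwargs):
    """Separate type's own arguments from those meant for __init_subclass__."""
    own = {k: v for k, v in kwargs.items() if k in TYPE_ARGS}
    forwarded = {k: v for k, v in kwargs.items() if k not in TYPE_ARGS}
    return own, forwarded


own, forwarded = split_type_kwargs(
    {"name": "Example", "bases": (), "dict": {}, "flag": True}
)
```

In Python this is two dictionary comprehensions; the message above argues that the equivalent bookkeeping in C adds code to an already complicated spot, which is the case for the positional-only option.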
I'm confused. In the above example it would seem that the keyword args {'a': 1, 'b': 2} are passed right on to super().__init_subclass__(). Do you mean that it ignores all keyword args? Or that it has no positional args? (Both of which would be consistent with the example.)
The example is just wrong. I'll fix it.
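
One possible shape for a corrected example is sketched below. It is an assumption about the intended fix, not the author's revised text: the hook consumes the keyword arguments it knows about and forwards only the remainder, without passing ``cls`` a second time. It assumes Python 3.6 or later.

```python
class SpamBase:
    # Implicitly a classmethod, so cls is already bound.
    def __init_subclass__(cls, a, b, **kwargs):
        cls.class_args = {"a": a, "b": b}
        # Forward only what was not consumed here; object's default
        # implementation rejects any leftover keyword arguments.
        super().__init_subclass__(**kwargs)


class Spam(SpamBase, a=1, b="b"):
    pass


# A keyword argument that nobody consumes reaches object and is rejected.
try:
    class Bad(SpamBase, a=1, b="b", c=3):
        pass
except TypeError:
    rejected = True
else:
    rejected = False
```

This matches the statement in the PEP that more specialized methods have to process all keyword arguments before the call reaches ``object``.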
Can you state exactly at which point during class initialization __init_subclass__() is called? (Surely by now, having implemented it, you know exactly where. :-)
Further down in the PEP I give the exact
[This is as far as I got reviewing when the weekend activities interrupted me. In the light of ongoing discussion I'm posting this now -- I'll continue later.]
I hope you had a good weekend not thinking too much about metaclasses...

Greetings

Martin