[Python-ideas] Simpler Customization of Class Creation - PEP 487

Nick Coghlan ncoghlan at gmail.com
Sat Feb 6 01:44:12 EST 2016


On 6 February 2016 at 07:20, Martin Teichmann <lkb.teichmann at gmail.com> wrote:
> Hi List,
>
> about a year ago I started a discussion on how to simplify metaclasses,
> which led to PEP 487. I got some good ideas from this list, but couldn't
> follow up on this because I was tied up with other projects.

Thanks for taking this up again!

> In short, metaclasses are often not used as they are considered very
> complicated. Indeed they are, especially if you need to use two of them
> at the same time in a multiple inheritance context.
>
> Most metaclasses, however, serve only some of the following three
> purposes: a) run some code after a class is created b) initialize descriptors
> of a class or c) keep the order in which class attributes have been defined.
>
> PEP 487 now proposes to put a metaclass into the standard library, which
> can be used for all three of those purposes. If libraries now start to use this
> metaclass, we won't need any metaclass mixing anymore.
>
> What has changed since the last time I posted PEP 487? Firstly, I re-wrote
> large parts of the PEP to make it easier to read. For those who liked the
> old text, it still exists in PEP 422.
>
> Secondly, I modified the proposal following suggestions from this list:
> I added the descriptor initialization (purpose b)), as this was considered
> particularly useful, even if it could in principle be done using purpose a) from
> above. The order-keeping of the class attributes is the leftover from a much
> more ambitious previous idea that would have allowed for custom namespaces
> during class creation. But this additional feature would have rendered the
> most common usecase - getting the order of attributes - much more
> complicated, so I opted for usability over flexibility.

I like this updated approach in general - more detailed comments are
inline below.

> I have put the new version of the PEP here:
>
> https://github.com/tecki/metaclasses/blob/pep487/pep-0487.txt

I also just pushed this version to the PEPs repo.

> Proposal
> ========
>
> While there are many possible ways to use a metaclass, the vast majority
> of use cases fall into just three categories: some initialization code
> running after class creation, the initialization of descriptors, and
> keeping the order in which class attributes were defined.
>
> Those three use cases can easily be performed by just one metaclass. If
> this metaclass is put into the standard library, and all libraries that
> wish to customize class creation use this very metaclass, no combination
> of metaclasses is necessary anymore.

While you do cover it later, it's worth mentioning up front that
there's a reasonable case to be made that type should just work this
way by default. However, changing type *again* is difficult if we
decide we made a mistake, so the currently proposed plan is:

1. Introduce a PyPI package (metaclass) for initial iteration on the API
2. Introduce a stdlib metaclass as a provisional API in Python 3.6
3. Consider this as possible default behaviour for type in Python 3.7.
   If type changes, the provisional metaclass name will just become a
   legacy alias for type

Steps 2 & 3 would be similar to the way the set datatype was first
introduced as sets.Set, and only later made a builtin type (with a
slightly different API) based on wider experience with the sets
module.

Step 2 mostly serves as a signalling mechanism that unequivocally
blesses the PyPI module created in 1 as setting the future direction
of the default behaviour of the builtin "type".

> The three use cases are achieved as follows:
>
> 1. The metaclass contains an ``__init_subclass__`` hook that initializes
>    all subclasses of a given class,
> 2. the metaclass calls an ``__init_descriptor__`` hook for all descriptors
>    defined in the class, and

"__init_descriptor__" confused me, as it wasn't clear to me until much
later in the PEP that it's a proposed addition to the *descriptor*
API, rather than something you implement on the metaclass you're
defining. It's also not really restricted to descriptors - as a new
hook, it could be implemented by any attribute, regardless of whether
it supported any other part of the descriptor protocol.

As such, a possible way to go here is to instead call this a new
"attribute ownership protocol", and make the hook name
"__set_owner__". It should also be called out that implementations of
__set_owner__ will need to handle the case of attribute re-use, and
handle things appropriately if the owner has already been set (e.g. in
the simplest case, by throwing a RuntimeError indicating that shared
ownership isn't supported).
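
To make the suggested protocol concrete, here is a minimal sketch of
how a metaclass might drive it. OwnerNotifyingMeta and Field are
hypothetical names used only for illustration, and __set_owner__ is
the hook name suggested above rather than an existing API. Note that
Field implements no other part of the descriptor protocol, and that it
takes the simplest approach to re-use by refusing a second owner:

```python
class OwnerNotifyingMeta(type):
    """Hypothetical stand-in for the proposed metaclass (illustration only)."""

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)
        # Offer ownership to every attribute that opts in to the protocol.
        for attr, value in namespace.items():
            hook = getattr(type(value), "__set_owner__", None)
            if hook is not None:
                hook(value, cls, attr)


class Field:
    """Any attribute may implement the hook; no __get__/__set__ needed."""

    def __init__(self):
        self.owner = None
        self.name = None

    def __set_owner__(self, owner, name):
        if self.owner is not None:
            # Simplest policy: shared ownership is not supported.
            raise RuntimeError(
                "{!r} is already owned by {!r}".format(self, self.owner))
        self.owner = owner
        self.name = name


class First(metaclass=OwnerNotifyingMeta):
    x = Field()


shared = First.x
try:
    class Second(metaclass=OwnerNotifyingMeta):
        y = shared  # re-using an attribute that already has an owner
except RuntimeError as exc:
    print("rejected:", exc)
```

A more forgiving implementation could instead track a list of owners,
but that is a design decision for each attribute type to make.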

> 3. an ``__attribute_order__`` tuple is left in the class in order to inspect
>    the order in which attributes were defined.
>
> For ease of use, a base class ``SubclassInit`` is defined, which uses said
> metaclass and contains an empty stub for the hook described for use case 1.

You should specify the metaclass name here as well.

Given the three-fold difference in behaviour, naming the new metaclass
and class after only one of those behaviours seems misleading. On the
other hand, "OrderedAttributeOwningSubclassInitialisingMeta" would be
silly, so it might be worth instead calling them something like
"ProvisionalMeta" and "ProvisionalClass" - that is, you're opting in
to the provisional future behaviour of the "type" and "object"
builtins, without specifying exactly what the current differences are.

I'd also suggest putting the new types in the existing "types" module,
rather than defining a new module for them (aside from the module on
PyPI).

> As an example, the first use case looks as follows::
>
>    class SpamBase(SubclassInit):
>        # this is implicitly a @classmethod
>        def __init_subclass__(cls, **kwargs):
>            # This is invoked after a subclass is created, but before
>            # explicit decorators are called.
>            # The usual super() mechanisms are used to correctly support
>            # multiple inheritance.
>            # **kwargs are the keyword arguments to the subclasses'
>            # class creation statement
>            super().__init_subclass__(**kwargs)
>
>    class Spam(SpamBase):
>        pass
>    # the new hook is called on Spam

This example isn't particularly clear, since the __init_subclass__
doesn't *do* anything. An example that preserves the class keyword
arguments as an attribute may be more comprehensible:

>>> class ExampleBase(metaclass.SubclassInit):
...     def __init_subclass__(cls, **kwds):
...         cls.class_args = kwds
...         super().__init_subclass__()
...
>>> class Example(ExampleBase, a=1, b=2, c=3):
...     pass
...
>>> Example.class_args
{'b': 2, 'a': 1, 'c': 3}


> The second part of the proposal adds an ``__init_descriptor__``
> initializer for descriptors.  Descriptors are defined in the body of a
> class, but they do not know anything about that class, they do not
> even know the name they are accessed with. They do get to know their
> owner once ``__get__`` is called, but still they do not know their
> name. This is unfortunate, for example they cannot put their
> associated value into their object's ``__dict__`` under their name,
> since they do not know that name.  This problem has been solved many
> times, and is one of the most important reasons to have a metaclass in
> a library. While it would be easy to implement such a mechanism using
> the first part of the proposal, it makes sense to have one solution
> for this problem for everyone.

This is the part I suggest renaming as an "attribute ownership
protocol", with the hook name as "__set_owner__".

> To give an example of its usage, imagine a descriptor representing weak
> referenced values (this is an insanely simplified, yet working example)::
>
>     import weakref
>
>     class WeakAttribute:
>         def __get__(self, instance, owner):
>             return instance.__dict__[self.name]()
>
>         def __set__(self, instance, value):
>             instance.__dict__[self.name] = weakref.ref(value)
>
>         # this is the new initializer:
>         def __init_descriptor__(self, owner, name):
>             self.name = name

Similar to the __init_subclass__ case, a more meaningful usage example
may help here, such as allowing owning classes to define a hook that
gets called when the weak reference goes away, while still having
useful default behaviour. For example (untested):

    import functools
    import weakref

    class WeakAttribute:
        def __init__(self):
            self._owner = None
            self._name = None
            self._callback = None

        def __get__(self, instance, owner):
            if instance is None:
                return self
            return instance.__dict__[self._name]

        def __set__(self, instance, value):
            instance.__dict__[self._name] = weakref.proxy(value, self._callback)

        def __set_owner__(self, owner, attr):
            if self._owner is not None:
                raise RuntimeError("{!r} already owned by {!r}".format(
                    self, self._owner()))
            self._owner = weakref.ref(owner)
            self._name = attr
            callback = getattr(owner, "attribute_collected", None)
            if callback is not None:
                # weakref passes the dead proxy as the final argument
                self._callback = functools.partial(callback, attr)

    class Example(metaclass.SubclassInit):
        proxy = WeakAttribute()

        # a classmethod, since the hook is looked up on the owning class
        @classmethod
        def attribute_collected(cls, attr, ref):
            print("{} was garbage collected".format(attr))

> The third part of the proposal is to leave a tuple called
> ``__attribute_order__`` in the class that contains the order in which
> the attributes were defined. This is a very common usecase, many
> libraries use an ``OrderedDict`` to store this order. This is a very
> simple way to achieve the same goal.

This should spell out the underlying mechanism here - the new
metaclass will *also* use OrderedDict to preserve the order during
class construction, so the extra bit the provisional metaclass adds
over the DIY __prepare__ method is taking the original ordered dict's
keys and saving them in an attribute, while __dict__ itself will
remain an ordinary unordered dict.
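
As a rough sketch of that mechanism (not the PEP's actual
implementation), the order-keeping part could look something like the
following. OrderRecordingMeta and Point are hypothetical names chosen
for the illustration; only __prepare__, OrderedDict and the proposed
__attribute_order__ name come from the discussion above:

```python
from collections import OrderedDict


class OrderRecordingMeta(type):
    """Hypothetical sketch of the order-keeping behaviour only."""

    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        # The class body executes in this mapping, so its insertion
        # order is the definition order.
        return OrderedDict()

    def __new__(mcls, name, bases, namespace, **kwargs):
        # Hand type() a plain dict, as described above.
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # The extra step: keep the ordered keys as a tuple on the class.
        cls.__attribute_order__ = tuple(namespace)
        return cls


class Point(metaclass=OrderRecordingMeta):
    x = 0
    y = 0

    def norm(self):
        return (self.x ** 2 + self.y ** 2) ** 0.5


# Implicit names such as __module__ and __qualname__ are recorded too,
# so filter them out to see only what the class body spelled out.
explicit = [n for n in Point.__attribute_order__ if not n.startswith("__")]
print(explicit)  # ['x', 'y', 'norm']
```

Whether the implicit names should be filtered out of the tuple is
something the PEP would need to decide explicitly.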


> New Ways of Using Classes
> =========================
>
> This proposal has many usecases like the following. In the examples,
> we still inherit from the ``SubclassInit`` base class. This would
> become unnecessary once this PEP is included in Python directly.
>
> Subclass registration
> ---------------------
>
> Especially when writing a plugin system, one likes to register new
> subclasses of a plugin baseclass. This can be done as follows::
>
>    class PluginBase(SubclassInit):
>        subclasses = []
>
>        def __init_subclass__(cls, **kwargs):
>            super().__init_subclass__(**kwargs)
>            cls.subclasses.append(cls)
>
> One should note that this also works nicely as a mixin class.

This should explain that the difference between this and just calling
PluginBase.__subclasses__() is that this example flattens the
inheritance tree into a simple list.
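
A small sketch of that difference follows. It assumes the hook is
already built into type, which is the end state this PEP is aiming for
(and what Python 3.6 and later provide), so the SubclassInit base is
left out. Reader and CsvReader are made-up plugin names:

```python
class PluginBase:
    subclasses = []

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        cls.subclasses.append(cls)


class Reader(PluginBase):
    pass


class CsvReader(Reader):
    pass


# __subclasses__() only reports direct children of the class it is
# called on, so walking the full tree needs recursion.
print([c.__name__ for c in PluginBase.__subclasses__()])  # ['Reader']

# The registration list is flat and includes every descendant.
print([c.__name__ for c in PluginBase.subclasses])  # ['Reader', 'CsvReader']
```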

> Trait descriptors
> -----------------
>
> There are many designs of Python descriptors in the wild which, for
> example, check boundaries of values. Often those "traits" need some support
> of a metaclass to work. This is what that would look like with this
> PEP::
>
>    class Trait:
>        def __get__(self, instance, owner):
>            return instance.__dict__[self.key]
>
>        def __set__(self, instance, value):
>            instance.__dict__[self.key] = value
>
>        def __init_descriptor__(self, owner, name):
>            self.key = name
>
>    class Int(Trait):
>        def __set__(self, instance, value):
>            # some boundary check code here
>            super().__set__(instance, value)

This doesn't show the descriptor subclass making use of the state set
up by the new hook on the parent class, so the subclass ends up making
the example more confusing, rather than improving it.
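
One possible way to make the subclass earn its place is to have it use
the name recorded by the hook, for example in its error message. The
sketch below is only an illustration: DescriptorInitMeta is a
hypothetical stand-in for the proposed metaclass, and the low/high
bounds and the Pixel class are invented for the example:

```python
class DescriptorInitMeta(type):
    """Hypothetical stand-in for the proposed metaclass."""

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)
        for attr, value in namespace.items():
            hook = getattr(type(value), "__init_descriptor__", None)
            if hook is not None:
                hook(value, cls, attr)


class Trait:
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.key]

    def __set__(self, instance, value):
        instance.__dict__[self.key] = value

    def __init_descriptor__(self, owner, name):
        self.key = name


class Int(Trait):
    def __init__(self, low, high):
        self.low = low
        self.high = high

    def __set__(self, instance, value):
        if not self.low <= value <= self.high:
            # self.key was recorded by the hook inherited from Trait.
            raise ValueError("{} must be between {} and {}, got {!r}".format(
                self.key, self.low, self.high, value))
        super().__set__(instance, value)


class Pixel(metaclass=DescriptorInitMeta):
    red = Int(0, 255)


p = Pixel()
p.red = 200
try:
    p.red = 300
except ValueError as exc:
    print(exc)  # red must be between 0 and 255, got 300
```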

> Rejected Design Options
> =======================
>
>
> Calling the hook on the class itself
> ------------------------------------
>
> Adding an ``__autodecorate__`` hook that would be called on the class
> itself was the proposed idea of PEP 422.  Most examples work the same
> way or even better if the hook is called on the subclass. In general,
> it is much easier to explicitly call the hook on the class in which it
> is defined (to opt-in to such a behavior) than to opt-out, meaning
> that one does not want the hook to be called on the class it is
> defined in.
>
> This becomes most evident if the class in question is designed as a
> mixin: it is very unlikely that the code of the mixin is to be
> executed for the mixin class itself, as it is not supposed to be a
> complete class on its own.
>
> The original proposal also made major changes in the class
> initialization process, rendering it impossible to back-port the
> proposal to older Python versions.

This should be elaborated on:

* we *do* want to change the default behaviour of type in the future
* we *also* want to be able to validate the usability of those changes
before we make them

Unlike PEP 422, this PEP lets us take two preliminary steps (library
on PyPI, provisional API in the standard library) *before* making any
changes to type itself.

> History
> =======
>
> This used to be a competing proposal to PEP 422 by Nick Coughlan and

Coghlan :)

> Daniel Urban. It shares both most of the PEP text and proposed code,

I think the code and text have diverged significantly now, but the
major shared aspect was always common *goals*, rather than any of the
technical details.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

