[Python-checkins] Create a primer section for the descriptor howto guide (GH-22906) (GH0-22918)

rhettinger webhook-mailer at python.org
Fri Oct 23 16:49:42 EDT 2020


https://github.com/python/cpython/commit/f8d96b98a48c97d0da8714e5e18675412774a6ef
commit: f8d96b98a48c97d0da8714e5e18675412774a6ef
branch: 3.9
author: Miss Islington (bot) <31488909+miss-islington at users.noreply.github.com>
committer: rhettinger <rhettinger at users.noreply.github.com>
date: 2020-10-23T13:49:32-07:00
summary:

Create a primer section for the descriptor howto guide (GH-22906) (GH0-22918)

files:
M Doc/glossary.rst
M Doc/howto/descriptor.rst
M Doc/tools/susp-ignored.csv

diff --git a/Doc/glossary.rst b/Doc/glossary.rst
index 9fdbdb1a83f28..189d49ee0d627 100644
--- a/Doc/glossary.rst
+++ b/Doc/glossary.rst
@@ -301,7 +301,8 @@ Glossary
       including functions, methods, properties, class methods, static methods,
       and reference to super classes.
 
-      For more information about descriptors' methods, see :ref:`descriptors`.
+      For more information about descriptors' methods, see :ref:`descriptors`
+      or the :ref:`Descriptor How To Guide <descriptorhowto>`.
 
    dictionary
       An associative array, where arbitrary keys are mapped to values.  The
diff --git a/Doc/howto/descriptor.rst b/Doc/howto/descriptor.rst
index b792b6c6ab77f..4a53b9e615692 100644
--- a/Doc/howto/descriptor.rst
+++ b/Doc/howto/descriptor.rst
@@ -1,3 +1,5 @@
+.. _descriptorhowto:
+
 ======================
 Descriptor HowTo Guide
 ======================
@@ -7,6 +9,415 @@ Descriptor HowTo Guide
 
 .. Contents::
 
+
+:term:`Descriptors <descriptor>` let objects customize attribute lookup,
+storage, and deletion.
+
+This HowTo guide has three major sections:
+
+1) The "primer" gives a basic overview, moving gently from simple examples,
+   adding one feature at a time.  It is a great place to start.
+
+2) The second section shows a complete, practical descriptor example.  If you
+   already know the basics, start there.
+
+3) The third section provides a more technical tutorial that goes into the
+   detailed mechanics of how descriptors work.  Most people don't need this
+   level of detail.
+
+
+Primer
+^^^^^^
+
+In this primer, we start with the most basic possible example and then we'll
+add new capabilities one by one.
+
+
+Simple example: A descriptor that returns a constant
+----------------------------------------------------
+
+The :class:`Ten` class is a descriptor that always returns the constant ``10``::
+
+
+    class Ten:
+        def __get__(self, obj, objtype=None):
+            return 10
+
+To use the descriptor, it must be stored as a class variable in another class::
+
+    class A:
+        x = 5                       # Regular class attribute
+        y = Ten()                   # Descriptor
+
+An interactive session shows the difference between normal attribute lookup
+and descriptor lookup::
+
+    >>> a = A()                     # Make an instance of class A
+    >>> a.x                         # Normal attribute lookup
+    5
+    >>> a.y                         # Descriptor lookup
+    10
+
+In the ``a.x`` attribute lookup, the dot operator finds the value ``5`` stored
+in the class dictionary.  In the ``a.y`` descriptor lookup, the dot operator
+calls the descriptor's :meth:`__get__()` method.  That method returns ``10``.
+Note that the value ``10`` is not stored in either the class dictionary or the
+instance dictionary.  Instead, the value ``10`` is computed on demand.
+
+This example shows how a simple descriptor works, but it isn't very useful.
+For retrieving constants, normal attribute lookup would be better.
+
+In the next section, we'll create something more useful, a dynamic lookup.
+
+
+Dynamic lookups
+---------------
+
+Interesting descriptors typically run computations instead of doing lookups::
+
+
+    import os
+
+    class DirectorySize:
+
+        def __get__(self, obj, objtype=None):
+            return len(os.listdir(obj.dirname))
+
+    class Directory:
+
+        size = DirectorySize()              # Descriptor
+
+        def __init__(self, dirname):
+            self.dirname = dirname          # Regular instance attribute
+
+An interactive session shows that the lookup is dynamic — it computes
+different, updated answers each time::
+
+    >>> g = Directory('games')
+    >>> s = Directory('songs')
+    >>> g.size                              # The games directory has three files
+    3
+    >>> os.system('touch games/newfile')    # Add a fourth file to the directory
+    0
+    >>> g.size
+    4
+    >>> s.size                              # The songs directory has twenty files
+    20
+
+Besides showing how descriptors can run computations, this example also
+reveals the purpose of the parameters to :meth:`__get__`.  The *self*
+parameter is *size*, an instance of *DirectorySize*.  The *obj* parameter is
+either *g* or *s*, an instance of *Directory*.  It is the *obj* parameter that
+lets the :meth:`__get__` method learn the target directory.  The *objtype*
+parameter is the class *Directory*.
+
+
+Managed attributes
+------------------
+
+A popular use for descriptors is managing access to instance data.  The
+descriptor is assigned to a public attribute in the class dictionary while the
+actual data is stored as a private attribute in the instance dictionary.  The
+descriptor's :meth:`__get__` and :meth:`__set__` methods are triggered when
+the public attribute is accessed.
+
+In the following example, *age* is the public attribute and *_age* is the
+private attribute.  When the public attribute is accessed, the descriptor logs
+the lookup or update::
+
+    import logging
+
+    logging.basicConfig(level=logging.INFO)
+
+    class LoggedAgeAccess:
+
+        def __get__(self, obj, objtype=None):
+            value = obj._age
+            logging.info('Accessing %r giving %r', 'age', value)
+            return value
+
+        def __set__(self, obj, value):
+            logging.info('Updating %r to %r', 'age', value)
+            obj._age = value
+
+    class Person:
+
+        age = LoggedAgeAccess()             # Descriptor
+
+        def __init__(self, name, age):
+            self.name = name                # Regular instance attribute
+            self.age = age                  # Calls the descriptor
+
+        def birthday(self):
+            self.age += 1                   # Calls both __get__() and __set__()
+
+
+An interactive session shows that all access to the managed attribute *age* is
+logged, but that the regular attribute *name* is not logged::
+
+    >>> mary = Person('Mary M', 30)         # The initial age update is logged
+    INFO:root:Updating 'age' to 30
+    >>> dave = Person('David D', 40)
+    INFO:root:Updating 'age' to 40
+
+    >>> vars(mary)                          # The actual data is in a private attribute
+    {'name': 'Mary M', '_age': 30}
+    >>> vars(dave)
+    {'name': 'David D', '_age': 40}
+
+    >>> mary.age                            # Access the data and log the lookup
+    INFO:root:Accessing 'age' giving 30
+    30
+    >>> mary.birthday()                     # Updates are logged as well
+    INFO:root:Accessing 'age' giving 30
+    INFO:root:Updating 'age' to 31
+
+    >>> dave.name                           # Regular attribute lookup isn't logged
+    'David D'
+    >>> dave.age                            # Only the managed attribute is logged
+    INFO:root:Accessing 'age' giving 40
+    40
+
+One major issue with this example is that the private name *_age* is
+hardwired in the *LoggedAgeAccess* class.  That means that each instance can
+only have one logged attribute and that its name is unchangeable.  In the next
+example, we'll fix that problem.
+
+
+Customized Names
+----------------
+
+When a class uses descriptors, it can inform each descriptor about what
+variable name was used.
+
+In this example, the :class:`Person` class has two descriptor instances,
+*name* and *age*.  When the :class:`Person` class is defined, it makes a
+callback to :meth:`__set_name__` in *LoggedAccess* so that the field names can
+be recorded, giving each descriptor its own *public_name* and *private_name*::
+
+    import logging
+
+    logging.basicConfig(level=logging.INFO)
+
+    class LoggedAccess:
+
+        def __set_name__(self, owner, name):
+            self.public_name = name
+            self.private_name = f'_{name}'
+
+        def __get__(self, obj, objtype=None):
+            value = getattr(obj, self.private_name)
+            logging.info('Accessing %r giving %r', self.public_name, value)
+            return value
+
+        def __set__(self, obj, value):
+            logging.info('Updating %r to %r', self.public_name, value)
+            setattr(obj, self.private_name, value)
+
+    class Person:
+
+        name = LoggedAccess()                # First descriptor
+        age = LoggedAccess()                 # Second descriptor
+
+        def __init__(self, name, age):
+            self.name = name                 # Calls the first descriptor
+            self.age = age                   # Calls the second descriptor
+
+        def birthday(self):
+            self.age += 1
+
+An interactive session shows that the :class:`Person` class has called
+:meth:`__set_name__` so that the field names would be recorded.  Here
+we call :func:`vars` to look up the descriptor without triggering it::
+
+    >>> vars(vars(Person)['name'])
+    {'public_name': 'name', 'private_name': '_name'}
+    >>> vars(vars(Person)['age'])
+    {'public_name': 'age', 'private_name': '_age'}
+
+The new class now logs access to both *name* and *age*::
+
+    >>> pete = Person('Peter P', 10)
+    INFO:root:Updating 'name' to 'Peter P'
+    INFO:root:Updating 'age' to 10
+    >>> kate = Person('Catherine C', 20)
+    INFO:root:Updating 'name' to 'Catherine C'
+    INFO:root:Updating 'age' to 20
+
+The two *Person* instances contain only the private names::
+
+    >>> vars(pete)
+    {'_name': 'Peter P', '_age': 10}
+    >>> vars(kate)
+    {'_name': 'Catherine C', '_age': 20}
+
+
+Closing thoughts
+----------------
+
+A :term:`descriptor` is what we call any object that defines :meth:`__get__`,
+:meth:`__set__`, or :meth:`__delete__`.
+
+Descriptors get invoked by the dot operator during attribute lookup.  If a
+descriptor is accessed indirectly with ``vars(some_class)[descriptor_name]``,
+the descriptor instance is returned without invoking it.
+
+Descriptors only work when used as class variables.  When put in instances,
+they have no effect.
+
+The main motivation for descriptors is to provide a hook allowing objects
+stored in class variables to control what happens during dotted lookup.
+
+Traditionally, the calling class controls what happens during lookup.
+Descriptors invert that relationship and allow the data being looked-up to
+have a say in the matter.
+
+Descriptors are used throughout the language.  They are how functions turn
+into bound methods.  Common tools like :func:`classmethod`,
+:func:`staticmethod`, :func:`property`, and :func:`functools.cached_property`
+are all implemented as descriptors.
+
+
+Complete Practical Example
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In this example, we create a practical and powerful tool for locating
+notoriously hard to find data corruption bugs.
+
+
+Validator class
+---------------
+
+A validator is a descriptor for managed attribute access.  Prior to storing
+any data, it verifies that the new value meets various type and range
+restrictions.  If those restrictions aren't met, it raises an exception to
+prevent data corruption at its source.
+
+This :class:`Validator` class is both an :term:`abstract base class` and a
+managed attribute descriptor::
+
+    from abc import ABC, abstractmethod
+
+    class Validator(ABC):
+
+        def __set_name__(self, owner, name):
+            self.private_name = f'_{name}'
+
+        def __get__(self, obj, objtype=None):
+            return getattr(obj, self.private_name)
+
+        def __set__(self, obj, value):
+            self.validate(value)
+            setattr(obj, self.private_name, value)
+
+        @abstractmethod
+        def validate(self, value):
+            pass
+
+Custom validators need to subclass from :class:`Validator` and supply a
+:meth:`validate` method to test various restrictions as needed.
+
+
+Custom validators
+-----------------
+
+Here are three practical data validation utilities:
+
+1) :class:`OneOf` verifies that a value is one of a restricted set of options.
+
+2) :class:`Number` verifies that a value is either an :class:`int` or
+   :class:`float`.  Optionally, it verifies that a value is between a given
+   minimum or maximum.
+
+3) :class:`String` verifies that a value is a :class:`str`.  Optionally, it
+   validates a given minimum or maximum length.  Optionally, it can test for
+   another predicate as well.
+
+::
+
+    class OneOf(Validator):
+
+        def __init__(self, *options):
+            self.options = set(options)
+
+        def validate(self, value):
+            if value not in self.options:
+                raise ValueError(f'Expected {value!r} to be one of {self.options!r}')
+
+    class Number(Validator):
+
+        def __init__(self, minvalue=None, maxvalue=None):
+            self.minvalue = minvalue
+            self.maxvalue = maxvalue
+
+        def validate(self, value):
+            if not isinstance(value, (int, float)):
+                raise TypeError(f'Expected {value!r} to be an int or float')
+            if self.minvalue is not None and value < self.minvalue:
+                raise ValueError(
+                    f'Expected {value!r} to be at least {self.minvalue!r}'
+                )
+            if self.maxvalue is not None and value > self.maxvalue:
+                raise ValueError(
+                    f'Expected {value!r} to be no more than {self.maxvalue!r}'
+                )
+
+    class String(Validator):
+
+        def __init__(self, minsize=None, maxsize=None, predicate=None):
+            self.minsize = minsize
+            self.maxsize = maxsize
+            self.predicate = predicate
+
+        def validate(self, value):
+            if not isinstance(value, str):
+                raise TypeError(f'Expected {value!r} to be an str')
+            if self.minsize is not None and len(value) < self.minsize:
+                raise ValueError(
+                    f'Expected {value!r} to be no smaller than {self.minsize!r}'
+                )
+            if self.maxsize is not None and len(value) > self.maxsize:
+                raise ValueError(
+                    f'Expected {value!r} to be no bigger than {self.maxsize!r}'
+                )
+            if self.predicate is not None and not self.predicate(value):
+                raise ValueError(
+                    f'Expected {self.predicate} to be true for {value!r}'
+                )
+
+
+Practical use
+-------------
+
+Here's how the data validators can be used in a real class::
+
+    class Component:
+
+        name = String(minsize=3, maxsize=10, predicate=str.isupper)
+        kind = OneOf('plastic', 'metal')
+        quantity = Number(minvalue=0)
+
+        def __init__(self, name, kind, quantity):
+            self.name = name
+            self.kind = kind
+            self.quantity = quantity
+
+The descriptors prevent invalid instances from being created::
+
+    Component('WIDGET', 'metal', 5)     # Allowed.
+    Component('Widget', 'metal', 5)     # Blocked: 'Widget' is not all uppercase
+    Component('WIDGET', 'metle', 5)     # Blocked: 'metle' is misspelled
+    Component('WIDGET', 'metal', -5)    # Blocked: -5 is negative
+    Component('WIDGET', 'metal', 'V')   # Blocked: 'V' isn't a number
+
+
+Technical Tutorial
+^^^^^^^^^^^^^^^^^^
+
+What follows is a more technical tutorial for the mechanics and details of how
+descriptors work.
+
+
 Abstract
 --------
 
@@ -39,10 +450,10 @@ Where this occurs in the precedence chain depends on which descriptor methods
 were defined.
 
 Descriptors are a powerful, general purpose protocol.  They are the mechanism
-behind properties, methods, static methods, class methods, and :func:`super()`.
-They are used throughout Python itself to implement the new style classes
-introduced in version 2.2.  Descriptors simplify the underlying C-code and offer
-a flexible set of new tools for everyday Python programs.
+behind properties, methods, static methods, class methods, and
+:func:`super()`.  They are used throughout Python itself.  Descriptors
+simplify the underlying C code and offer a flexible set of new tools for
+everyday Python programs.
 
 
 Descriptor Protocol
@@ -132,11 +543,29 @@ The implementation details are in :c:func:`super_getattro()` in
 The details above show that the mechanism for descriptors is embedded in the
 :meth:`__getattribute__()` methods for :class:`object`, :class:`type`, and
 :func:`super`.  Classes inherit this machinery when they derive from
-:class:`object` or if they have a meta-class providing similar functionality.
+:class:`object` or if they have a metaclass providing similar functionality.
 Likewise, classes can turn-off descriptor invocation by overriding
 :meth:`__getattribute__()`.
 
 
+Automatic Name Notification
+---------------------------
+
+Sometimes it is desirable for a descriptor to know what class variable name it
+was assigned to.  When a new class is created, the :class:`type` metaclass
+scans the dictionary of the new class.  If any of the entries are descriptors
+and if they define :meth:`__set_name__`, that method is called with two
+arguments.  The *owner* is the class where the descriptor is used, and the
+*name* is the class variable the descriptor was assigned to.
+
+The implementation details are in :c:func:`type_new()` and
+:c:func:`set_names()` in :source:`Objects/typeobject.c`.
+
+Since the update logic is in :meth:`type.__new__`, notifications only take
+place at the time of class creation.  If descriptors are added to the class
+afterwards, :meth:`__set_name__` will need to be called manually.
+
+
 Descriptor Example
 ------------------
 
@@ -154,7 +583,7 @@ descriptor is useful for monitoring just a few chosen attributes::
             self.val = initval
             self.name = name
 
-        def __get__(self, obj, objtype):
+        def __get__(self, obj, objtype=None):
             print('Retrieving', self.name)
             return self.val
 
@@ -162,11 +591,11 @@ descriptor is useful for monitoring just a few chosen attributes::
             print('Updating', self.name)
             self.val = val
 
-    >>> class MyClass:
-    ...     x = RevealAccess(10, 'var "x"')
-    ...     y = 5
-    ...
-    >>> m = MyClass()
+    class B:
+        x = RevealAccess(10, 'var "x"')
+        y = 5
+
+    >>> m = B()
     >>> m.x
     Retrieving var "x"
     10
@@ -251,12 +680,13 @@ affect existing client code accessing the attribute directly.  The solution is
 to wrap access to the value attribute in a property data descriptor::
 
     class Cell:
-        . . .
-        def getvalue(self):
+        ...
+
+        @property
+        def value(self):
             "Recalculate the cell before returning value"
             self.recalc()
             return self._value
-        value = property(getvalue)
 
 
 Functions and Methods
@@ -278,42 +708,48 @@ non-data descriptors which return bound methods when they are invoked from an
 object.  In pure Python, it works like this::
 
     class Function:
-        . . .
+        ...
+
         def __get__(self, obj, objtype=None):
             "Simulate func_descr_get() in Objects/funcobject.c"
             if obj is None:
                 return self
             return types.MethodType(self, obj)
 
-Running the interpreter shows how the function descriptor works in practice::
+Running the following class in the interpreter shows how the function
+descriptor works in practice::
 
-    >>> class D:
-    ...     def f(self, x):
-    ...         return x
-    ...
-    >>> d = D()
+    class D:
+        def f(self, x):
+            return x
+
+Access through the class dictionary does not invoke :meth:`__get__`.  Instead,
+it just returns the underlying function object::
 
-    # Access through the class dictionary does not invoke __get__.
-    # It just returns the underlying function object.
     >>> D.__dict__['f']
     <function D.f at 0x00C45070>
 
-    # Dotted access from a class calls __get__() which just returns
-    # the underlying function unchanged.
+Dotted access from a class calls :meth:`__get__` which just returns the
+underlying function unchanged::
+
     >>> D.f
     <function D.f at 0x00C45070>
 
-    # The function has a __qualname__ attribute to support introspection
+The function has a :term:`qualified name` attribute to support introspection::
+
     >>> D.f.__qualname__
     'D.f'
 
-    # Dotted access from an instance calls __get__() which returns the
-    # function wrapped in a bound method object
+Dotted access from an instance calls :meth:`__get__` which returns a bound
+method object::
+
+    >>> d = D()
     >>> d.f
     <bound method D.f of <__main__.D object at 0x00B18C90>>
 
-    # Internally, the bound method stores the underlying function and
-    # the bound instance.
+Internally, the bound method stores the underlying function and the bound
+instance::
+
     >>> d.f.__func__
     <function D.f at 0x1012e5ae8>
     >>> d.f.__self__
@@ -328,20 +764,20 @@ patterns of binding functions into methods.
 
 To recap, functions have a :meth:`__get__` method so that they can be converted
 to a method when accessed as attributes.  The non-data descriptor transforms an
-``obj.f(*args)`` call into ``f(obj, *args)``.  Calling ``klass.f(*args)``
+``obj.f(*args)`` call into ``f(obj, *args)``.  Calling ``cls.f(*args)``
 becomes ``f(*args)``.
 
 This chart summarizes the binding and its two most useful variants:
 
       +-----------------+----------------------+------------------+
       | Transformation  | Called from an       | Called from a    |
-      |                 | Object               | Class            |
+      |                 | object               | class            |
       +=================+======================+==================+
       | function        | f(obj, \*args)       | f(\*args)        |
       +-----------------+----------------------+------------------+
       | staticmethod    | f(\*args)            | f(\*args)        |
       +-----------------+----------------------+------------------+
-      | classmethod     | f(type(obj), \*args) | f(klass, \*args) |
+      | classmethod     | f(type(obj), \*args) | f(cls, \*args)   |
       +-----------------+----------------------+------------------+
 
 Static methods return the underlying function without changes.  Calling either
@@ -365,11 +801,11 @@ It can be called either from an object or the class:  ``s.erf(1.5) --> .9332`` o
 Since staticmethods return the underlying function with no changes, the example
 calls are unexciting::
 
-    >>> class E:
-    ...     def f(x):
-    ...         print(x)
-    ...     f = staticmethod(f)
-    ...
+    class E:
+        @staticmethod
+        def f(x):
+            print(x)
+
     >>> E.f(3)
     3
     >>> E().f(3)
@@ -391,32 +827,33 @@ Unlike static methods, class methods prepend the class reference to the
 argument list before calling the function.  This format is the same
 for whether the caller is an object or a class::
 
-    >>> class E:
-    ...     def f(klass, x):
-    ...         return klass.__name__, x
-    ...     f = classmethod(f)
-    ...
-    >>> print(E.f(3))
-    ('E', 3)
-    >>> print(E().f(3))
-    ('E', 3)
+    class F:
+        @classmethod
+        def f(cls, x):
+            return cls.__name__, x
+
+    >>> print(F.f(3))
+    ('F', 3)
+    >>> print(F().f(3))
+    ('F', 3)
 
 
 This behavior is useful whenever the function only needs to have a class
-reference and does not care about any underlying data.  One use for classmethods
-is to create alternate class constructors.  In Python 2.3, the classmethod
+reference and does not care about any underlying data.  One use for
+classmethods is to create alternate class constructors.  The classmethod
 :func:`dict.fromkeys` creates a new dictionary from a list of keys.  The pure
 Python equivalent is::
 
     class Dict:
-        . . .
-        def fromkeys(klass, iterable, value=None):
+        ...
+
+        @classmethod
+        def fromkeys(cls, iterable, value=None):
             "Emulate dict_fromkeys() in Objects/dictobject.c"
-            d = klass()
+            d = cls()
             for key in iterable:
                 d[key] = value
             return d
-        fromkeys = classmethod(fromkeys)
 
 Now a new dictionary of unique keys can be constructed like this::
 
@@ -432,10 +869,9 @@ Using the non-data descriptor protocol, a pure Python version of
         def __init__(self, f):
             self.f = f
 
-        def __get__(self, obj, klass=None):
-            if klass is None:
-                klass = type(obj)
+        def __get__(self, obj, cls=None):
+            if cls is None:
+                cls = type(obj)
             def newfunc(*args):
-                return self.f(klass, *args)
+                return self.f(cls, *args)
             return newfunc
-
diff --git a/Doc/tools/susp-ignored.csv b/Doc/tools/susp-ignored.csv
index e03c68fa98a0b..80e77be45f95b 100644
--- a/Doc/tools/susp-ignored.csv
+++ b/Doc/tools/susp-ignored.csv
@@ -23,6 +23,9 @@ howto/curses,,:blue,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white.
 howto/curses,,:magenta,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white.  The"
 howto/curses,,:cyan,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white.  The"
 howto/curses,,:white,"2:green, 3:yellow, 4:blue, 5:magenta, 6:cyan, and 7:white.  The"
+howto/descriptor,,:root,"INFO:root"
+howto/descriptor,,:Updating,"root:Updating"
+howto/descriptor,,:Accessing,"root:Accessing"
 howto/instrumentation,,::,python$target:::function-entry
 howto/instrumentation,,:function,python$target:::function-entry
 howto/instrumentation,,::,python$target:::function-return

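Note on the "Closing thoughts" statement in Doc/howto/descriptor.rst that descriptors only work when used as class variables. The following is a minimal sketch, not part of the commit. It reuses the Ten descriptor from the primer; the class names ClassLevel and InstanceLevel are hypothetical and chosen only for illustration:

```python
class Ten:
    def __get__(self, obj, objtype=None):
        return 10


class ClassLevel:                 # hypothetical name
    y = Ten()                     # Stored in the class: __get__ is invoked


class InstanceLevel:              # hypothetical name
    def __init__(self):
        self.y = Ten()            # Stored in the instance: __get__ is not invoked


from_class = ClassLevel().y       # The descriptor runs and returns 10
from_instance = InstanceLevel().y # The Ten object itself comes back

print(from_class)
print(isinstance(from_instance, Ten))
```

The instance-level lookup returns the Ten object unchanged because the dot operator only calls __get__ on objects it finds in the class (or its bases), not on objects stored in the instance dictionary.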

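Note on the "Practical use" section of Doc/howto/descriptor.rst, which lists blocked Component calls without showing the exceptions. The following is a hedged sketch, not part of the commit. It copies the Validator and OneOf classes from the diff and adds a hypothetical Part class to show that a rejected value raises ValueError and leaves the stored data untouched:

```python
from abc import ABC, abstractmethod


class Validator(ABC):

    def __set_name__(self, owner, name):
        self.private_name = f'_{name}'

    def __get__(self, obj, objtype=None):
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        self.validate(value)                  # Raises before anything is stored
        setattr(obj, self.private_name, value)

    @abstractmethod
    def validate(self, value):
        pass


class OneOf(Validator):

    def __init__(self, *options):
        self.options = set(options)

    def validate(self, value):
        if value not in self.options:
            raise ValueError(f'Expected {value!r} to be one of {self.options!r}')


class Part:                                   # hypothetical class for illustration
    kind = OneOf('plastic', 'metal')

    def __init__(self, kind):
        self.kind = kind


part = Part('metal')                          # Allowed
try:
    part.kind = 'metle'                       # Blocked: misspelled option
except ValueError as exc:
    error = str(exc)
else:
    error = None

print(part.kind)                              # Still the last valid value
print(error is not None)
```

Because validate() runs before setattr(), a failed assignment never reaches the private attribute, so the instance keeps its previous valid value.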

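Note on the "Automatic Name Notification" section of Doc/howto/descriptor.rst, which says that __set_name__ must be called manually when a descriptor is added after class creation. The following is a minimal sketch under that description, not part of the commit. The Field and Record names are hypothetical:

```python
class Field:                                  # hypothetical descriptor

    def __set_name__(self, owner, name):
        self.private_name = f'_{name}'

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private_name)

    def __set__(self, obj, value):
        setattr(obj, self.private_name, value)


class Record:                                 # hypothetical class
    title = Field()                           # type.__new__ calls __set_name__ here


late = Field()
Record.rating = late                          # Added afterwards: no automatic callback
notified_before = hasattr(late, 'private_name')

late.__set_name__(Record, 'rating')           # Manual notification
notified_after = hasattr(late, 'private_name')

r = Record()
r.title = 'Primer'
r.rating = 5

print(notified_before, notified_after)
print(vars(r))
```

Without the manual call, the first access to r.rating would fail with AttributeError inside the descriptor, since private_name would never have been set.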