Hi all, PEP 484 specifies semantics for type hints. These type hints are used by various tools, including static type checkers. However, PEP 484 only specifies the semantics for nominal subtyping (subtyping based on subclassing). Here we propose a specification for semantics of structural subtyping (static duck typing). Previous discussions on this PEP happened at: https://mail.python.org/pipermail/python-ideas/2015-September/thread.html#35... https://github.com/python/typing/issues/11 https://github.com/python/peps/pull/224 -- Ivan =========================================== PEP: 544 Title: Protocols Version: $Revision$ Last-Modified: $Date$ Author: Ivan Levkivskyi <levkivskyi@gmail.com>, Jukka Lehtosalo < jukka.lehtosalo@iki.fi>, Łukasz Langa <lukasz@langa.pl> Discussions-To: Python-Dev <python-dev@python.org> Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 05-Mar-2017 Python-Version: 3.7 Abstract ======== Type hints introduced in PEP 484 can be used to specify type metadata for static type checkers and other third party tools. However, PEP 484 only specifies the semantics of *nominal* subtyping. In this PEP we specify static and runtime semantics of protocol classes that will provide a support for *structural* subtyping (static duck typing). .. _rationale: Rationale and Goals =================== Currently, PEP 484 and the ``typing`` module [typing]_ define abstract base classes for several common Python protocols such as ``Iterable`` and ``Sized``. The problem with them is that a class has to be explicitly marked to support them, which is unpythonic and unlike what one would normally do in idiomatic dynamically typed Python code. For example, this conforms to PEP 484:: from typing import Sized, Iterable, Iterator class Bucket(Sized, Iterable[int]): ... def __len__(self) -> int: ... def __iter__(self) -> Iterator[int]: ... The same problem appears with user-defined ABCs: they must be explicitly subclassed or registered. This is particularly difficult to do with library types as the type objects may be hidden deep in the implementation of the library. Also, extensive use of ABCs might impose additional runtime costs. The intention of this PEP is to solve all these problems by allowing users to write the above code without explicit base classes in the class definition, allowing ``Bucket`` to be implicitly considered a subtype of both ``Sized`` and ``Iterable[int]`` by static type checkers using structural [wiki-structural]_ subtyping:: from typing import Iterator, Iterable class Bucket: ... def __len__(self) -> int: ... def __iter__(self) -> Iterator[int]: ... def collect(items: Iterable[int]) -> int: ... result: int = collect(Bucket()) # Passes type check Note that ABCs in ``typing`` module already provide structural behavior at runtime, ``isinstance(Bucket(), Iterable)`` returns ``True``. The main goal of this proposal is to support such behavior statically. The same functionality will be provided for user-defined protocols, as specified below. The above code with a protocol class matches common Python conventions much better. It is also automatically extensible and works with additional, unrelated classes that happen to implement the required protocol. Nominal vs structural subtyping ------------------------------- Structural subtyping is natural for Python programmers since it matches the runtime semantics of duck typing: an object that has certain properties is treated independently of its actual runtime class. However, as discussed in PEP 483, both nominal and structural subtyping have their strengths and weaknesses. Therefore, in this PEP we *do not propose* to replace the nominal subtyping described by PEP 484 with structural subtyping completely. Instead, protocol classes as specified in this PEP complement normal classes, and users are free to choose where to apply a particular solution. See section on `rejected`_ ideas at the end of this PEP for additional motivation. Non-goals --------- At runtime, protocol classes will be simple ABCs. There is no intent to provide sophisticated runtime instance and class checks against protocol classes. This would be difficult and error-prone and will contradict the logic of PEP 484. As well, following PEP 484 and PEP 526 we state that protocols are **completely optional**: * No runtime semantics will be imposed for variables or parameters annotated with a protocol class. * Any checks will be performed only by third-party type checkers and other tools. * Programmers are free to not use them even if they use type annotations. * There is no intent to make protocols non-optional in the future. Existing Approaches to Structural Subtyping =========================================== Before describing the actual specification, we review and comment on existing approaches related to structural subtyping in Python and other languages: * ``zope.interface`` [zope-interfaces]_ was one of the first widely used approaches to structural subtyping in Python. It is implemented by providing special classes to distinguish interface classes from normal classes, to mark interface attributes, and to explicitly declare implementation. For example:: from zope.interface import Interface, Attribute, implements class IEmployee(Interface): name = Attribute("Name of employee") def do(work): """Do some work""" class Employee(object): implements(IEmployee) name = 'Anonymous' def do(self, work): return work.start() Zope interfaces support various contracts and constraints for interface classes. For example:: from zope.interface import invariant def required_contact(obj): if not (obj.email or obj.phone): raise Exception("At least one contact info is required") class IPerson(Interface): name = Attribute("Name") email = Attribute("Email Address") phone = Attribute("Phone Number") invariant(required_contact) Even more detailed invariants are supported. However, Zope interfaces rely entirely on runtime validation. Such focus on runtime properties goes beyond the scope of the current proposal, and static support for invariants might be difficult to implement. However, the idea of marking an interface class with a special base class is reasonable and easy to implement both statically and at runtime. * Python abstract base classes [abstract-classes]_ are the standard library tool to provide some functionality similar to structural subtyping. The drawback of this approach is the necessity to either subclass the abstract class or register an implementation explicitly:: from abc import ABC class MyTuple(ABC): pass MyTuple.register(tuple) assert issubclass(tuple, MyTuple) assert isinstance((), MyTuple) As mentioned in the `rationale`_, we want to avoid such necessity, especially in static context. However, in a runtime context, ABCs are good candidates for protocol classes and they are already used extensively in the ``typing`` module. * Abstract classes defined in ``collections.abc`` module [collections-abc]_ are slightly more advanced since they implement a custom ``__subclasshook__()`` method that allows runtime structural checks without explicit registration:: from collections.abc import Iterable class MyIterable: def __iter__(self): return [] assert isinstance(MyIterable(), Iterable) Such behavior seems to be a perfect fit for both runtime and static behavior of protocols. As discussed in `rationale`_, we propose to add static support for such behavior. In addition, to allow users to achieve such runtime behavior for *user defined* protocols a special ``@runtime`` decorator will be provided, see detailed `discussion`_ below. * TypeScript [typescript]_ provides support for user defined classes and interfaces. Explicit implementation declaration is not required and structural subtyping is verified statically. For example:: interface LabeledItem { label: string; size?: int; } function printLabel(obj: LabeledValue) { console.log(obj.label); } let myObj = {size: 10, label: "Size 10 Object"}; printLabel(myObj); Note that optional interface members are supported. Also, TypeScript prohibits redundant members in implementations. While the idea of optional members looks interesting, it would complicate this proposal and it is not clear how useful it will be. Therefore it is proposed to postpone this; see `rejected`_ ideas. In general, the idea of static protocol checking without runtime implications looks reasonable, and basically this proposal follows the same line. * Go [golang]_ uses a more radical approach and makes interfaces the primary way to provide type information. Also, assignments are used to explicitly ensure implementation:: type SomeInterface interface { SomeMethod() ([]byte, error) } if _, ok := someval.(SomeInterface); ok { fmt.Printf("value implements some interface") } Both these ideas are questionable in the context of this proposal. See the section on `rejected`_ ideas. .. _specification: Specification ============= Terminology ----------- We propose to use the term *protocols* for types supporting structural subtyping. The reason is that the term *iterator protocol*, for example, is widely understood in the community, and coming up with a new term for this concept in a statically typed context would just create confusion. This has the drawback that the term *protocol* becomes overloaded with two subtly different meanings: the first is the traditional, well-known but slightly fuzzy concept of protocols such as iterator; the second is the more explicitly defined concept of protocols in statically typed code. The distinction is not important most of the time, and in other cases we propose to just add a qualifier such as *protocol classes* when referring to the static type concept. If a class includes a protocol in its MRO, the class is called an *explicit* subclass of the protocol. If a class is a structural subtype of a protocol, it is said to implement the protocol and to be compatible with a protocol. If a class is compatible with a protocol but the protocol is not included in the MRO, the class is an *implicit* subtype of the protocol. The attributes (variables and methods) of a protocol that are mandatory for other class in order to be considered a structural subtype are called protocol members. .. _definition: Defining a protocol ------------------- Protocols are defined by including a special new class ``typing.Protocol`` (an instance of ``abc.ABCMeta``) in the base classes list, preferably at the end of the list. Here is a simple example:: from typing import Protocol class SupportsClose(Protocol): def close(self) -> None: ... Now if one defines a class ``Resource`` with a ``close()`` method that has a compatible signature, it would implicitly be a subtype of ``SupportsClose``, since the structural subtyping is used for protocol types:: class Resource: ... def close(self) -> None: self.file.close() self.lock.release() Apart from few restrictions explicitly mentioned below, protocol types can be used in every context where a normal types can:: def close_all(things: Iterable[SupportsClose]) -> None: for t in things: t.close() f = open('foo.txt') r = Resource() close_all([f, r]) # OK! close_all([1]) # Error: 'int' has no 'close' method Note that both the user-defined class ``Resource`` and the built-in ``IO`` type (the return type of ``open()``) are considered subtypes of ``SupportsClose``, because they provide a ``close()`` method with a compatible type signature. Protocol members ---------------- All methods defined in the protocol class body are protocol members, both normal and decorated with ``@abstractmethod``. If some or all parameters of protocol method are not annotated, then their types are assumed to be ``Any`` (see PEP 484). Bodies of protocol methods are type checked, except for methods decorated with ``@abstractmethod`` with trivial bodies. A trivial body can contain a docstring. Example:: from typing import Protocol from abc import abstractmethod class Example(Protocol): def first(self) -> int: # This is a protocol member return 42 @abstractmethod def second(self) -> int: # Method without a default implementation """Some method.""" Note that although formally the implicit return type of a method with a trivial body is ``None``, type checker will not warn about above example, such convention is similar to how methods are defined in stub files. Static methods, class methods, and properties are equally allowed in protocols. To define a protocol variable, one must use PEP 526 variable annotations in the class body. Additional attributes *only* defined in the body of a method by assignment via ``self`` are not allowed. The rationale for this is that the protocol class implementation is often not shared by subtypes, so the interface should not depend on the default implementation. Examples:: from typing import Protocol, List class Template(Protocol): name: str # This is a protocol member value: int = 0 # This one too (with default) def method(self) -> None: self.temp: List[int] = [] # Error in type checker To distinguish between protocol class variables and protocol instance variables, the special ``ClassVar`` annotation should be used as specified by PEP 526. Explicitly declaring implementation ----------------------------------- To explicitly declare that a certain class implements the given protocols, they can be used as regular base classes. In this case a class could use default implementations of protocol members. ``typing.Sequence`` is a good example of a protocol with useful default methods. Abstract methods with trivial bodies are recognized by type checkers as having no default implementation and can't be used via ``super()`` in explicit subclasses. The default implementations can not be used if the subtype relationship is implicit and only via structural subtyping -- the semantics of inheritance is not changed. Examples:: class PColor(Protocol): @abstractmethod def draw(self) -> str: ... def complex_method(self) -> int: # some complex code here class NiceColor(PColor): def draw(self) -> str: return "deep blue" class BadColor(PColor): def draw(self) -> str: return super().draw() # Error, no default implementation class ImplicitColor: # Note no 'PColor' base here def draw(self) -> str: return "probably gray" def comlex_method(self) -> int: # class needs to implement this nice: NiceColor another: ImplicitColor def represent(c: PColor) -> None: print(c.draw(), c.complex_method()) represent(nice) # OK represent(another) # Also OK Note that there is no conceptual difference between explicit and implicit subtypes, the main benefit of explicit subclassing is to get some protocol methods "for free". In addition, type checkers can statically verify that the class actually implements the protocol correctly:: class RGB(Protocol): rgb: Tuple[int, int, int] @abstractmethod def intensity(self) -> int: return 0 class Point(RGB): def __init__(self, red: int, green: int, blue: str) -> None: self.rgb = red, green, blue # Error, 'blue' must be 'int' # Type checker might warn that 'intensity' is not defined A class can explicitly inherit from multiple protocols and also form normal classes. In this case methods are resolved using normal MRO and a type checker verifies that all subtyping are correct. The semantics of ``@abstractmethod`` is not changed, all of them must be implemented by an explicit subclass before it could be instantiated. Merging and extending protocols ------------------------------- The general philosophy is that protocols are mostly like regular ABCs, but a static type checker will handle them specially. Subclassing a protocol class would not turn the subclass into a protocol unless it also has ``typing.Protocol`` as an explicit base class. Without this base, the class is "downgraded" to a regular ABC that cannot be used with structural subtyping. A subprotocol can be defined by having *both* one or more protocols as immediate base classes and also having ``typing.Protocol`` as an immediate base class:: from typing import Sized, Protocol class SizedAndCloseable(Sized, Protocol): def close(self) -> None: ... Now the protocol ``SizedAndCloseable`` is a protocol with two methods, ``__len__`` and ``close``. If one omits ``Protocol`` in the base class list, this would be a regular (non-protocol) class that must implement ``Sized``. If ``Protocol`` is included in the base class list, all the other base classes must be protocols. A protocol can't extend a regular class. Alternatively, one can implement ``SizedAndCloseable`` like this, assuming the existence of ``SupportsClose`` from the example in `definition`_ section:: from typing import Sized class SupportsClose(...): ... # Like above class SizedAndCloseable(Sized, SupportsClose, Protocol): pass The two definitions of ``SizedAndClosable`` are equivalent. Subclass relationships between protocols are not meaningful when considering subtyping, since structural compatibility is the criterion, not the MRO. Note that rules around explicit subclassing are different from regular ABCs, where abstractness is simply defined by having at least one abstract method being unimplemented. Protocol classes must be marked *explicitly*. Generic and recursive protocols ------------------------------- Generic protocols are important. For example, ``SupportsAbs``, ``Iterable`` and ``Iterator`` are generic protocols. They are defined similar to normal non-protocol generic types:: T = TypeVar('T', covariant=True) class Iterable(Protocol[T]): @abstractmethod def __iter__(self) -> Iterator[T]: ... Note that ``Protocol[T, S, ...]`` is allowed as a shorthand for ``Protocol, Generic[T, S, ...]``. Recursive protocols are also supported. Forward references to the protocol class names can be given as strings as specified by PEP 484. Recursive protocols will be useful for representing self-referential data structures like trees in an abstract fashion:: class Traversable(Protocol): leaves: Iterable['Traversable'] Using Protocols =============== Subtyping relationships with other types ---------------------------------------- Protocols cannot be instantiated, so there are no values with protocol types. For variables and parameters with protocol types, subtyping relationships are subject to the following rules: * A protocol is never a subtype of a concrete type. * A concrete type or a protocol ``X`` is a subtype of another protocol ``P`` if and only if ``X`` implements all protocol members of ``P``. In other words, subtyping with respect to a protocol is always structural. * Edge case: for recursive protocols, a class is considered a subtype of the protocol in situations where such decision depends on itself. Continuing the previous example:: class Tree(Generic[T]): def __init__(self, value: T, leaves: 'List[Tree[T]]') -> None: self.value = value self.leaves = leaves def walk(graph: Traversable) -> None: ... tree: Tree[float] = Tree(0, []) walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable' Generic protocol types follow the same rules of variance as non-protocol types. Protocol types can be used in all contexts where any other types can be used, such as in ``Union``, ``ClassVar``, type variables bounds, etc. Generic protocols follow the rules for generic abstract classes, except for using structural compatibility instead of compatibility defined by inheritance relationships. Unions and intersections of protocols ------------------------------------- ``Union`` of protocol classes behaves the same way as for non-protocol classes. For example:: from typing import Union, Optional, Protocol class Exitable(Protocol): def exit(self) -> int: ... class Quitable(Protocol): def quit(self) -> Optional[int]: ... def finish(task: Union[Exitable, Quitable]) -> int: ... class GoodJob: ... def quit(self) -> int: return 0 finish(GoodJob()) # OK One can use multiple inheritance to define an intersection of protocols. Example:: from typing import Sequence, Hashable class HashableFloats(Sequence[float], Hashable, Protocol): pass def cached_func(args: HashableFloats) -> float: ... cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence If this will prove to be a widely used scenario, then a special intersection type construct may be added in future as specified by PEP 483, see `rejected`_ ideas for more details. ``Type[]`` with protocols ------------------------- Variables and parameters annotated with ``Type[Proto]`` accept only concrete (non-protocol) subtypes of ``Proto``. The main reason for this is to allow instantiation of parameters with such type. For example:: class Proto(Protocol): @abstractmethod def meth(self) -> int: ... class Concrete: def meth(self) -> int: return 42 def fun(cls: Type[Proto]) -> int: return cls().meth() # OK fun(Proto) # Error fun(Concrete) # OK The same rule applies to variables:: var: Type[Proto] var = Proto # Error var = Concrete # OK var().meth() # OK Assigning an ABC or a protocol class to a variable is allowed if it is not explicitly typed, and such assignment creates a type alias. For normal (non-abstract) classes, the behavior of ``Type[]`` is not changed. ``NewType()`` and type aliases ------------------------------ Protocols are essentially anonymous. To emphasize this point, static type checkers might refuse protocol classes inside ``NewType()`` to avoid an illusion that a distinct type is provided:: form typing import NewType , Protocol, Iterator class Id(Protocol): code: int secrets: Iterator[bytes] UserId = NewType('UserId', Id) # Error, can't provide distinct type On the contrary, type aliases are fully supported, including generic type aliases:: from typing import TypeVar, Reversible, Iterable, Sized T = TypeVar('T') class SizedIterable(Iterable[T], Sized, Protocol): pass CompatReversible = Union[Reversible[T], SizedIterable[T]] .. _discussion: ``@runtime`` decorator and narrowing types by ``isinstance()`` -------------------------------------------------------------- The default semantics is that ``isinstance()`` and ``issubclass()`` fail for protocol types. This is in the spirit of duck typing -- protocols basically would be used to model duck typing statically, not explicitly at runtime. However, it should be possible for protocol types to implement custom instance and class checks when this makes sense, similar to how ``Iterable`` and other ABCs in ``collections.abc`` and ``typing`` already do it, but this is limited to non-generic and unsubscripted generic protocols (``Iterable`` is statically equivalent to ``Iterable[Any]`). The ``typing`` module will define a special ``@runtime`` class decorator that provides the same semantics for class and instance checks as for ``collections.abc`` classes, essentially making them "runtime protocols":: from typing import runtime, Protocol @runtime class Closeable(Protocol): def close(self): ... assert isinstance(open('some/file'), Closeable) Static type checkers will understand ``isinstance(x, Proto)`` and ``issubclass(C, Proto)`` for protocols defined with this decorator (as they already do for ``Iterable`` etc.). Static type checkers will narrow types after such checks by the type erased ``Proto`` (i.e. with all variables having type ``Any`` and all methods having type ``Callable[..., Any]``). Note that ``isinstance(x, Proto[int])`` etc. will always fail in agreement with PEP 484. Examples:: from typing import Iterable, Iterator, Sequence def process(items: Iterable[int]) -> None: if isinstance(items, Iterator): # 'items' have type 'Iterator[int]' here elif isinstance(items, Sequence[int]): # Error! Can't use 'isinstance()' with subscripted protocols Note that instance checks are not 100% reliable statically, this is why this behavior is opt-in, see section on `rejected`_ ideas for examples. Using Protocols in Python 2.7 - 3.5 =================================== Variable annotation syntax was added in Python 3.6, so that the syntax for defining protocol variables proposed in `specification`_ section can't be used in earlier versions. To define these in earlier versions of Python one can use properties:: class Foo(Protocol): @property def c(self) -> int: return 42 # Default value can be provided for property... @abstractproperty def d(self) -> int: # ... or it can be abstract return 0 In Python 2.7 the function type comments should be used as per PEP 484. The ``typing`` module changes proposed in this PEP will be also backported to earlier versions via the backport currently available on PyPI. Runtime Implementation of Protocol Classes ========================================== Implementation details ---------------------- The runtime implementation could be done in pure Python without any effects on the core interpreter and standard library except in the ``typing`` module: * Define class ``typing.Protocol`` similar to ``typing.Generic``. * Implement metaclass functionality to detect whether a class is a protocol or not. Add a class attribute ``__protocol__ = True`` if that is the case. Verify that a protocol class only has protocol base classes in the MRO (except for object). * Implement ``@runtime`` that adds all attributes to ``__subclasshook__()``. * All structural subtyping checks will be performed by static type checkers, such as ``mypy`` [mypy]_. No additional support for protocol validation will be provided at runtime. Changes in the typing module ---------------------------- The following classes in ``typing`` module will be protocols: * ``Hashable`` * ``SupportsAbs`` (and other ``Supports*`` classes) * ``Iterable``, ``Iterator`` * ``Sized`` * ``Container`` * ``Collection`` * ``Reversible`` * ``Sequence``, ``MutableSequence`` * ``AbstractSet``, ``MutableSet`` * ``Mapping``, ``MutableMapping`` * ``ItemsView`` (and other ``*View`` classes) * ``AsyncIterable``, ``AsyncIterator`` * ``Awaitable`` * ``Callable`` * ``ContextManager``, ``AsyncContextManager`` Most of these classes are small and conceptually simple. It is easy to see what are the methods these protocols implement, and immediately recognize the corresponding runtime protocol counterpart. Practically, few changes will be needed in ``typing`` since some of these classes already behave the necessary way at runtime. Most of these will need to be updated only in the corresponding ``typeshed`` stubs [typeshed]_. All other concrete generic classes such as ``List``, ``Set``, ``IO``, ``Deque``, etc are sufficiently complex that it makes sense to keep them non-protocols (i.e. require code to be explicit about them). Also, it is too easy to leave some methods unimplemented by accident, and explicitly marking the subclass relationship allows type checkers to pinpoint the missing implementations. Introspection ------------- The existing class introspection machinery (``dir``, ``__annotations__`` etc) can be used with protocols. In addition, all introspection tools implemented in the ``typing`` module will support protocols. Since all attributes need to be defined in the class body based on this proposal, protocol classes will have even better perspective for introspection than regular classes where attributes can be defined implicitly -- protocol attributes can't be initialized in ways that are not visible to introspection (using ``setattr()``, assignment via ``self``, etc.). Still, some things like types of attributes will not be visible at runtime in Python 3.5 and earlier, but this looks like a reasonable limitation. There will be only limited support of ``isinstance()`` and ``issubclass()`` as discussed above (these will *always* fail with ``TypeError`` for subscripted generic protocols, since a reliable answer could not be given at runtime in this case). But together with other introspection tools this give a reasonable perspective for runtime type checking tools. .. _rejected: Rejected/Postponed Ideas ======================== The ideas in this section were previously discussed in [several]_ [discussions]_ [elsewhere]_. Make every class a protocol by default -------------------------------------- Some languages such as Go make structural subtyping the only or the primary form of subtyping. We could achieve a similar result by making all classes protocols by default (or even always). However we believe that it is better to require classes to be explicitly marked as protocols, for the following reasons: * Protocols don't have some properties of regular classes. In particular, ``isinstance()``, as defined for normal classes, is based on the nominal hierarchy. In order to make everything a protocol by default, and have ``isinstance()`` work would require changing its semantics, which won't happen. * Protocol classes should generally not have many method implementations, as they describe an interface, not an implementation. Most classes have many implementations, making them bad protocol classes. * Experience suggests that many classes are not practical as protocols anyway, mainly because their interfaces are too large, complex or implementation-oriented (for example, they may include de facto private attributes and methods without a ``__`` prefix). * Most actually useful protocols in existing Python code seem to be implicit. The ABCs in ``typing`` and ``collections.abc`` are rather an exception, but even they are recent additions to Python and most programmers do not use them yet. * Many built-in functions only accept concrete instances of ``int`` (and subclass instances), and similarly for other built-in classes. Making ``int`` a structural type wouldn't be safe without major changes to the Python runtime, which won't happen. Support optional protocol members --------------------------------- We can come up with examples where it would be handy to be able to say that a method or data attribute does not need to be present in a class implementing a protocol, but if it is present, it must conform to a specific signature or type. One could use a ``hasattr()`` check to determine whether they can use the attribute on a particular instance. Languages such as TypeScript have similar features and apparently they are pretty commonly used. The current realistic potential use cases for protocols in Python don't require these. In the interest of simplicity, we propose to not support optional methods or attributes. We can always revisit this later if there is an actual need. Make protocols interoperable with other approaches -------------------------------------------------- The protocols as described here are basically a minimal extension to the existing concept of ABCs. We argue that this is the way they should be understood, instead of as something that *replaces* Zope interfaces, for example. Attempting such interoperabilities will significantly complicate both the concept and the implementation. On the other hand, Zope interfaces are conceptually a superset of protocols defined here, but using an incompatible syntax to define them, because before PEP 526 there was no straightforward way to annotate attributes. In the 3.6+ world, ``zope.interface`` might potentially adopt the ``Protocol`` syntax. In this case, type checkers could be taught to recognize interfaces as protocols and make simple structural checks with respect to them. Use assignments to check explicitly that a class implements a protocol ---------------------------------------------------------------------- In Go language the explicit checks for implementation are performed via dummy assignments [golang]_. Such a way is also possible with the current proposal. Example:: class A: def __len__(self) -> float: return ... _: Sized = A() # Error: A.__len__ doesn't conform to 'Sized' # (Incompatible return type 'float') This approach moves the check away from the class definition and it almost requires a comment as otherwise the code probably would not make any sense to an average reader -- it looks like dead code. Besides, in the simplest form it requires one to construct an instance of ``A``, which could be problematic if this requires accessing or allocating some resources such as files or sockets. We could work around the latter by using a cast, for example, but then the code would be ugly. Therefore we discourage the use of this pattern. Support ``isinstance()`` checks by default ------------------------------------------ The problem with this is instance checks could be unreliable, except for situations where there is a common signature convention such as ``Iterable``. For example:: class P(Protocol): def common_method_name(self, x: int) -> int: ... class X: <a bunch of methods> def common_method_name(self) -> None: ... # Note different signature def do_stuff(o: Union[P, X]) -> int: if isinstance(o, P): return o.common_method_name(1) # oops, what if it's an X instance? Another potentially problematic case is assignment of attributes *after* instantiation:: class P(Protocol): x: int class C: def initialize(self) -> None: self.x = 0 c = C() isinstance(c1, P) # False c.initialize() isinstance(c, P) # True def f(x: Union[P, int]) -> None: if isinstance(x, P): # static type of x is P here ... else: # type of x is "int" here? print(x + 1) f(C()) # oops We argue that requiring an explicit class decorator would be better, since one can then attach warnings about problems like this in the documentation. The user would be able to evaluate whether the benefits outweigh the potential for confusion for each protocol and explicitly opt in -- but the default behavior would be safer. Finally, it will be easy to make this behavior default if necessary, while it might be problematic to make it opt-in after being default. Provide a special intersection type construct --------------------------------------------- There was an idea to allow ``Proto = All[Proto1, Proto2, ...]`` as a shorthand for:: class Proto(Proto1, Proto2, ..., Protocol): pass However, it is not yet clear how popular/useful it will be and implementing this in type checkers for non-protocol classes could be difficult. Finally, it will be very easy to add this later if needed. References ========== .. [typing] https://docs.python.org/3/library/typing.html .. [wiki-structural] https://en.wikipedia.org/wiki/Structural_type_system .. [zope-interfaces] https://zopeinterface.readthedocs.io/en/latest/ .. [abstract-classes] https://docs.python.org/3/library/abc.html .. [collections-abc] https://docs.python.org/3/library/collections.abc.html .. [typescript] https://www.typescriptlang.org/docs/handbook/interfaces.html .. [golang] https://golang.org/doc/effective_go.html#interfaces_and_types .. [typeshed] https://github.com/python/typeshed/ .. [mypy] http://github.com/python/mypy/ .. [several] https://mail.python.org/pipermail/python-ideas/2015-September/thread.html#35... .. [discussions] https://github.com/python/typing/issues/11 .. [elsewhere] https://github.com/python/peps/pull/224 Copyright ========= This document has been placed in the public domain. .. Local Variables: mode: indented-text indent-tabs-mode: nil sentence-end-double-space: t fill-column: 70 coding: utf-8 End:
I'm overall very supportive of seeing something like this make it into Python to further strengthen duck typing in the language. I know I've wanted something something like this since ABCs were introduced. I personally only have one issue/clarification for the PEP. On Mon, 20 Mar 2017 at 05:02 Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
[SNIP] Protocol members ----------------
All methods defined in the protocol class body are protocol members, both normal and decorated with ``@abstractmethod``. If some or all parameters of protocol method are not annotated, then their types are assumed to be ``Any`` (see PEP 484). Bodies of protocol methods are type checked, except for methods decorated with ``@abstractmethod`` with trivial bodies. A trivial body can contain a docstring.
What is a "trivial body"? I don't know of any such definition anywhere in Python so this is too loosely defined. You also don't say what happens if the body isn't trivial. Are tools expected to raise an error?
Example::
from typing import Protocol from abc import abstractmethod
class Example(Protocol): def first(self) -> int: # This is a protocol member return 42
@abstractmethod def second(self) -> int: # Method without a default implementation """Some method."""
Note that although formally the implicit return type of a method with a trivial body is ``None``,
This seems to suggest a trivial body is anything lacking a return statement.
type checker will not warn about above example, such convention is similar to how methods are defined in stub files. Static methods, class methods, and properties are equally allowed in protocols.
Personally, I think even an abstract method should be properly typed. So in the example above, second() should either return a reasonable default value or raise NotImplementedError. My argument is "explicit is better than implicit" and you make errors when people call super() on an abstract method that doesn't return None when it doesn't make sense. I would also argue that you can't expect an abstract method to always be simple. For instance, I might define an abstract method that has horrible complexity characteristics (e.g. O(n**2)), but which might be acceptable in select cases. By making the method abstract you force subclasses to explicitly opt-in to using the potentially horrible implementation. -Brett
To define a protocol variable, one must use PEP 526 variable annotations in the class body. Additional attributes *only* defined in the body of a method by assignment via ``self`` are not allowed. The rationale for this is that the protocol class implementation is often not shared by subtypes, so the interface should not depend on the default implementation. Examples::
from typing import Protocol, List
class Template(Protocol): name: str # This is a protocol member value: int = 0 # This one too (with default)
def method(self) -> None: self.temp: List[int] = [] # Error in type checker
To distinguish between protocol class variables and protocol instance variables, the special ``ClassVar`` annotation should be used as specified by PEP 526.
On 20 March 2017 at 19:07, Brett Cannon <brett@python.org> wrote:
I'm overall very supportive of seeing something like this make it into Python to further strengthen duck typing in the language.
Thanks!
Personally, I think even an abstract method should be properly typed.
[SNIP]
or raise NotImplementedError.
Yes, I think this is a reasonable requirement. (Also assuming unconditional raise is a bottom type, raising body is properly typed). Initially I thought a type checker could warn about invalid calls to super(), but this complicates things, and indeed "explicit is better than implicit". -- Ivan
On Mon, Mar 20, 2017, at 14:07, Brett Cannon wrote:
What is a "trivial body"? I don't know of any such definition anywhere in Python so this is too loosely defined. You also don't say what happens if the body isn't trivial. Are tools expected to raise an error?
My assumption would be that a trivial body is any body consisting of only a docstring and/or a "pass" statement.
I'm a big fan of this. I really want structural subtyping for http://github.com/google/pytype. On Mon, Mar 20, 2017 at 5:00 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
Explicitly declaring implementation -----------------------------------
To explicitly declare that a certain class implements the given protocols,
Why is this necessary? The whole point of ducktyping is that you *don't* have to declare what you implement. I get that it looks convenient to have your protocol A also supply some of the methods you'd expect classes of type A to have. But completing an implementation in that way should be done explicitly (via including a utility class or using a decorator like functools.total_ordering), not as side-effect of an (unnecessary) protocol declaration.
On 20 March 2017 at 22:11, Matthias Kramm <kramm@google.com> wrote:
I'm a big fan of this. I really want structural subtyping for http://github.com/google/pytype.
I am glad you like it.
On Mon, Mar 20, 2017 at 5:00 AM, Ivan Levkivskyi <levkivskyi@gmail.com>
wrote:
Explicitly declaring implementation -----------------------------------
To explicitly declare that a certain class implements the given protocols,
Why is this necessary? The whole point of ducktyping is that you *don't* have to declare what you implement.
I get that it looks convenient to have your protocol A also supply some of the methods you'd expect classes of type A to have. But completing an implementation in that way should be done explicitly (via including a utility class or using a decorator like functools.total_ordering), not as side-effect of an (unnecessary) protocol declaration.
I would put the question backwards: do we need to *prohibit* explicit subclassing? I think we shouldn't. Mostly for two reasons: 1. Backward compatibility: People are already using ABCs, including generic ABCs from typing module. If we prohibit explicit subclassing of these ABCs, then quite a lot of code will break. 2. Convenience: There are existing protocol-like ABCs (that will be turned into protocols) that have many useful "mix-in" (non-abstract) methods. For example in case of Sequence one only needs to implement __getitem__ and __len__ in an explicit subclass, and one gets __iter__, __contains__, __reversed__, index, and count for free. Another example is Mapping, one needs to implement only __getitem__, __len__, and __iter__, and one gets __contains__, keys, items, values, get, __eq__, and __ne__ for free. If you think it makes sense to add a note that implicit subtyping is preferred (especially for user defined protocols), then this certainly could be done. -- Ivan
On Mon, Mar 20, 2017 at 2:42 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
1. Backward compatibility: People are already using ABCs, including generic ABCs from typing module. If we prohibit explicit subclassing of these ABCs, then quite a lot of code will break.
Fair enough. Backwards compatibility is a valid point, and both abc.py and typing.py have classes that lend itself to becoming protocols. The one thing that isn't clear to me is how type checkers will distinguish between 1.) Protocol methods in A that need to implemented in B so that B is considered a structural subclass of A. 2.) Extra methods you get for free when you explicitly inherit from A. To provide a more concrete example: Since Mapping implements __eq__, do I also have to implement __eq__ if I want my class to be (structurally) compatible with Mapping?
If you think it makes sense to add a note that implicit subtyping is preferred (especially for user defined protocols), then this certainly could be done.
Yes, I believe it would be good to mention that.
On 21 March 2017 at 17:09, Matthias Kramm <kramm@google.com> wrote:
The one thing that isn't clear to me is how type checkers will distinguish between 1.) Protocol methods in A that need to implemented in B so that B is considered a structural subclass of A. 2.) Extra methods you get for free when you explicitly inherit from A.
To provide a more concrete example: Since Mapping implements __eq__, do I also have to implement __eq__ if I want my class to be (structurally) compatible with Mapping?
An implicit subtype should implement all methods, so that yes, in this case __eq__ should be implemented for Mapping. There was an idea to make some methods "non-protocol" (i.e. not necessary to implement), but it was rejected, since this complicates things. Briefly, consider this function: def fun(m: Mapping): m.keys() The question is should this be an error? I think most people would expect this to be valid. The same applies to most other methods in Mapping, people expect that they are provided my Mapping. Therefore, to be on the safe side, we need to require these methods to be implemented. If you look at definitions in collections.abc, there are very few methods that could be considered "non-protocol". Therefore, it was decided to not introduce "non-protocol" methods. There is only one downside for this: it will require some boilerplate for implicit subtypes of Mapping etc. But, this applies to few "built-in" protocols (like Mapping and Sequence) and people already subclass them. Also, such style will be discouraged for user defined protocols. It will be recommended to create compact protocols and combine them. (This was discussed, but it looks like we forgot to add an explicit statement about this.) I will add a section on non-protocol methods to rejected/postponed ideas. -- Ivan
Technically, `__eq__` is implemented by `object` so a `Mapping` implementation that didn't implement it would still be considered valid. But probably not very useful (since the default implementation in this case is implemented by comparing object identity). On Tue, Mar 21, 2017 at 9:36 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
On 21 March 2017 at 17:09, Matthias Kramm <kramm@google.com> wrote:
The one thing that isn't clear to me is how type checkers will distinguish between 1.) Protocol methods in A that need to implemented in B so that B is considered a structural subclass of A. 2.) Extra methods you get for free when you explicitly inherit from A.
To provide a more concrete example: Since Mapping implements __eq__, do I also have to implement __eq__ if I want my class to be (structurally) compatible with Mapping?
An implicit subtype should implement all methods, so that yes, in this case __eq__ should be implemented for Mapping.
There was an idea to make some methods "non-protocol" (i.e. not necessary to implement), but it was rejected, since this complicates things. Briefly, consider this function:
def fun(m: Mapping): m.keys()
The question is should this be an error? I think most people would expect this to be valid. The same applies to most other methods in Mapping, people expect that they are provided my Mapping. Therefore, to be on the safe side, we need to require these methods to be implemented. If you look at definitions in collections.abc, there are very few methods that could be considered "non-protocol". Therefore, it was decided to not introduce "non-protocol" methods.
There is only one downside for this: it will require some boilerplate for implicit subtypes of Mapping etc. But, this applies to few "built-in" protocols (like Mapping and Sequence) and people already subclass them. Also, such style will be discouraged for user defined protocols. It will be recommended to create compact protocols and combine them. (This was discussed, but it looks like we forgot to add an explicit statement about this.)
I will add a section on non-protocol methods to rejected/postponed ideas.
-- Ivan
_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/ guido%40python.org
-- --Guido van Rossum (python.org/~guido)
On Tue, 21 Mar 2017 at 09:17 Matthias Kramm via Python-Dev < python-dev@python.org> wrote:
On Mon, Mar 20, 2017 at 2:42 PM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
1. Backward compatibility: People are already using ABCs, including generic ABCs from typing module. If we prohibit explicit subclassing of these ABCs, then quite a lot of code will break.
Fair enough. Backwards compatibility is a valid point, and both abc.py and typing.py have classes that lend itself to becoming protocols.
Another key point is that if you block subclassing then this stops being useful to anyone not using a type checker. While the idea of protocols is to support structural typing, there is nothing wrong with having nominal typing through ABCs help enforce the structural typing of a subclass at the same time. You could argue that if you want that you define the base ABC first and then have a class that literally does nothing but inherit from that base ABC and Protocol, but that's unnecessary duplication in an API to have the structural type and nominal type separate when we have a mechanism that can support both.
The one thing that isn't clear to me is how type checkers will distinguish between 1.) Protocol methods in A that need to implemented in B so that B is considered a structural subclass of A. 2.) Extra methods you get for free when you explicitly inherit from A.
To provide a more concrete example: Since Mapping implements __eq__, do I also have to implement __eq__ if I want my class to be (structurally) compatible with Mapping?
If you think it makes sense to add a note that implicit subtyping is preferred (especially for user defined protocols), then this certainly could be done.
Yes, I believe it would be good to mention that.
I don't think it needs to be explicitly discouraged if you want to make sure you implement the abstract methods (ABCs are useful for a reason). I do think it's fine, though, to make it very clear that whether you subclass or not makes absolutely no difference to tools validating the type soundness of the code.
On 21 March 2017 at 18:03, Brett Cannon <brett@python.org> wrote:
I do think it's fine, though, to make it very clear that whether you subclass or not makes absolutely no difference to tools validating the type soundness of the code.
There are two places where PEP draft says: "Note that there is no conceptual difference between explicit and implicit subtypes" and "The general philosophy is that protocols are mostly like regular ABCs, but a static type checker will handle them specially." Do you want to propose alternative wording for these, or would you rather like an additional statement? -- Ivan
On Tue, Mar 21, 2017 at 11:05 AM, Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
There are two places where PEP draft says:
"Note that there is no conceptual difference between explicit and implicit subtypes"
and
"The general philosophy is that protocols are mostly like regular ABCs, but a static type checker will handle them specially."
Do you want to propose alternative wording for these, or would you rather like an additional statement?
Let's do an additional statement. Something like "Static analysis tools are expected to automatically detect that a class implements a given protocol. So while it's possible to subclass a protocol explicitly, it's not necessary to do so for the sake of type-checking."
On Mar 20, 2017, at 01:00 PM, Ivan Levkivskyi wrote:
from zope.interface import Interface, Attribute, implements
class IEmployee(Interface):
name = Attribute("Name of employee")
def do(work): """Do some work"""
class Employee(object): implements(IEmployee)
IIUC, the Python 3 way to spell this is with a decorator. from zope.interface import implementer @implementer(IEmployee) class Employee: (also, since this is Python 3, do you really need to inherit from object?) Cheers, -Barry
On 21 March 2017 at 00:23, Barry Warsaw <barry@python.org> wrote:
On Mar 20, 2017, at 01:00 PM, Ivan Levkivskyi wrote:
[SNIP]
IIUC, the Python 3 way to spell this is with a decorator.
Thanks, I will update this.
[SNIP] (also, since this is Python 3, do you really need to inherit from object?)
Indeed. -- Ivan
participants (6)
-
Barry Warsaw
-
Brett Cannon
-
Guido van Rossum
-
Ivan Levkivskyi
-
Matthias Kramm
-
Random832