Thank you, John. I agree that adding first-class ADTs support is a major thing. First of all, right now this is a real problem. Some example code I took from my real project (https://github.com/dry-python/returns): ``` from typing import Union class S: def method(self) -> int: ... class F: def method(self) -> int: ... Result1 = Union[S, F]reveal_type(Result1)# note: Revealed type is "builtins.object"reveal_type(Result1.method)# error: "object" has no attribute "method"# note: Revealed type is "Any" Result2: Union[S, F] # error: Variable "ex.Result2" is not valid as a type# note: See https://mypy.readthedocs.io/en/stable/common_issues.html#variables-vs-type-a... some() -> Result2: ... ``` This code creates a base `Result` type with several user-facing methods and only two possible subclasses: Success and Failure. Right now, this is complicated. You can create two different objects: `Result` base class with actual methods and some kind of `ResultType = Union[Success, Failure]` to enable exhaustive pattern matching. This is not friendly at all. See related discussion: https://github.com/dry-python/returns/issues/1361 But, the current proposal is limited to a single submodule. I see this as a major weakness and quite unintuitive thing to teach. I think that `__sealed__: Success | Failure` rejected idea might be actually better. Since we have `from __future__ import annotations` and string-based annotations this works in any case. Cycle imports can be suppressed with `if TYPE_CHECKING:`. Pros: - It will solve the initial problem - We are not limited to any module structure - It is easy to teach: we have similar class-level magic things, like `__slots__` - It will support runtime introspection out-of-the-box - We don't need to implement anything on CPython's side (except some new docs about `__sealed__`) So, my example becomes: ``` from __future__ import annotations class Result: __sealed__: Success | Failure def bind(self, function) -> Result: ... class Success(Result): ... class Failure(Result): ... ``` Best, Nikita сб, 7 мая 2022 г. в 20:24, John Hagen <johnthagen@gmail.com>:
David and I are pleased to share a draft PEP. We welcome any additional feedback.
It can be viewed rendered at: https://github.com/johnthagen/sealed-typing-pep
PEP: <REQUIRED: pep number> Title: Adding a sealed qualifier to typing Author: John Hagen <johnthagen@gmail.com>, David Hagen <david@drhagen.com> Sponsor: Jelle Zijlstra PEP-Delegate: <PEP delegate's real name> Discussions-To: typing-sig@python.org Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 17-Apr-2022 Python-Version: 3.12 Post-History: Resolution: <url>
Abstract ========
This PEP proposes a ``@sealed`` decorator be added to the ``typing`` module to support creating versatile algebraic data types (ADTs) which type checkers can exhaustively pattern match against.
Motivation ==========
Quite often it is desirable to apply exhaustiveness to a set of classes without defining ad-hoc union types, which is itself fragile if a class is missing in the union definition. A design pattern where a group of record-like classes is combined into a union is popular in other languages that support pattern matching [1]_ and is known as a nominal sum type, a key instatiation of algebraic data types [2]_.
We propose adding a special decorator class ``@sealed`` to the ``typing`` module [3]_, that will have no effect at runtime, but will indicate to static type checkers that all direct subclasses of this class should be defined in the same module as the base class.
The idea is that, since all subclasses are known, the type checker can treat the sealed base class as a union of all its subclasses. Together with dataclasses this allows a clean and safe support of algebraic data types in Python. Consider this example,
.. code-block:: python
from dataclasses import dataclass from typing import sealed
@sealed class Node: ...
@sealed class Expression(Node): ...
@sealed class Statement(Node): ...
@dataclass class Name(Expression): name: str
@dataclass class Operation(Expression): left: Expression op: str right: Expression
@dataclass class Assignment(Statement): target: str value: Expression
@dataclass class Print(Statement): value: Expression
With such a definition, a type checker can safely treat ``Node`` as ``Union[Expression, Statement]``, and also safely treat ``Expression`` as ``Union[Name, Operation]`` and ``Statement`` as ``Union[Assignment, Print]``. With these declarations, a type checking error will occur in the below snippet, because ``Name`` is not handled (and the type checker can give a useful error message).
.. code-block:: python
def dump(node: Node) -> str: match node: case Assignment(target, value): return f"{target} = {dump(value)}" case Print(value): return f"print({dump(value)})" case Operation(left, op, right): return f"({dump(left)} {op} {dump(right)})"
Note: This section was largely derived from PEP 622 [4]_.
Rationale =========
Kotlin [5]_, Scala 2 [6]_, and Java 17 [7]_ all support a ``sealed`` keyword that is used to declare algebraic data types. By using the same terminology, the ``@sealed`` decorator will be familiar to developers familiar with those languages.
Specification =============
The ``typing.sealed`` decorator can be applied to the declaration of any class. This decoration indicates to type checkers that all immediate subclasses of the decorated class are defined in the current file.
The exhaustiveness checking features of type checkers should assume that there are no subclasses outside the current file, treating the decorated class as a ``Union`` of all its same-file subclasses.
Type checkers should raise an error if a sealed class is inherited in a file different from where the sealed class is declared.
A sealed class is automatically declared to be abstract. Whatever actions a type checker normally takes with abstract classes should be taken with sealed classes as well. What exactly these behaviors are (e.g. disallowing instantiation) is outside the scope of this PEP.
Similar to the ``typing.final`` decorator [8]_, the only runtime behavior of this decorator is to set the ``__sealed__`` attribute of class to ``True`` so that the sealed property of the class can be introspected. There is no runtime enforcement of sealed class inheritance.
Reference Implementation ========================
[Link to any existing implementation and details about its state, e.g. proof-of-concept.]
Rejected Ideas ==============
Generalize ``Enum`` -------------------
Rust [9]_, Scala 3 [10]_, and Swift [11]_ support algebraic data types using a generalized ``enum`` mechanism.
.. code-block:: rust
enum Message { Quit, Move { x: i32, y: i32 }, Write(String), ChangeColor(i32, i32, i32), }
One could imagine a generalization of the Python ``Enum`` [12]_ to support variants of different shapes. But given that the Python ``Enum`` is more or less a normal class, with some magic internals, this would be a much more invasive change.
.. code-block:: python
from dataclasses import dataclass from enum import Enum
class Message(Enum): @dataclass class Quit: ...
@dataclass class Move: x: int y: int
@dataclass class Write: message: str
@dataclass class ChangeColor: r: int g: int b: int
Explicitly list subclasses --------------------------
Java requires that subclasses be explicitly listed with the base class.
.. code-block:: java
public sealed interface Node permits Leaf, Branch {}
public final class Leaf {} public final class Branch {}
The advantage of this requirement is that subclasses can be defined anywhere, not just in the same file, eliminating the somewhat weird file dependence of this feature. Once disadvantage is that requires that all subclasses to be written twice: once when defined and once in the enumerated list on the base class.
There is also an inherent circular reference when explicitly enumerating the subclasses. The subclass refers to the base class in order to inherit from it, and the base class refers to the subclasses in order to enumerate them. In statically typed languages, these kinds of circular references in the types can be managed, but in Python, it is much harder.
For example, this ``Sealed`` base class that behaves like ``Generic``:
.. code-block:: python
from typing import Sealed
class Node(Sealed[Leaf, Branch]): ...
class Leaf(Node): ... class Branch(Node): ...
This cannot work because ``Leaf`` must be defined before ``Node`` and ``Node`` must be defined before ``Leaf``. This is a not an annotation, so lazy annotations cannot save it. Perhaps, the subclasses in the enumerated list could be strings, but that severely hurts the ergonomics of this feature.
If the enumerated list was in an annotation, it could be made to work, but there is no natural place for the annotation to live. Here is one possibility:
.. code-block:: python
class Node: __sealed__: Leaf | Branch
class Leaf(Node): ... class Branch(Node): ...
``Union`` of independent variants ---------------------------------
Some of the behavior of ``sealed`` can be emulated with ``Union`` today.
.. code-block:: python
class Leaf: ... class Branch: ...
Node = Leaf | Branch
The main problem with this is that the ADT loses all the features of inheritance, which is rather featureful in Python, to put it mildly. There can be no abstract methods, private methods to be reused by the subclasses, public methods to be exposed on all subclasses, ``__init_subclass__``, etc. Even if a specific method is implemented on each subclass, then rename, jump-to-definition, find-usage, and other IDE features are difficult to make work reliably.
Adding a base class in addition to the union type alleviates some of these issues:
.. code-block:: python
class BaseNode: ...
class Leaf(BaseNode): ... class Branch(BaseNode): ...
Node = Leaf | Branch
Despite being possible today, this is quite unergonomic. The base class and the union type are conceptually the same thing, but have to be defined as two separate objects. If this became standard, it seems Python would be first language to separate the definition of an ADT into two different objects.
The base class is not merely passive, either. There are a number of operations that will only work when using the base class instead of the union type. For example, matching only works on the base class, not the union type:
.. code-block:: python
maybe_node: Node | None = ... # must be Node to enforce exhaustiveness
match maybe_node: case Node(): # TypeError: called match pattern must be a type ... case None: ...
match maybe_node: case BaseNode(): # no error ... case None: ...
Having to remember whether to use the base class or the union type in each situation is particularly unfriendly to the user of a sealed class.
Footnotes =========
.. [1] https://en.wikipedia.org/wiki/Pattern_matching
.. [2] https://en.wikipedia.org/wiki/Algebraic_data_type
.. [3] https://docs.python.org/3/library/typing.html
.. [4]
https://peps.python.org/pep-0622/#sealed-classes-as-algebraic-data-types
.. [5] https://kotlinlang.org/docs/sealed-classes.html
.. [6] https://docs.scala-lang.org/tour/pattern-matching.html
.. [7] https://openjdk.java.net/jeps/409
.. [8] https://peps.python.org/pep-0591/
.. [9] https://doc.rust-lang.org/book/ch06-01-defining-an-enum.html
.. [10] https://docs.scala-lang.org/scala3/reference/enums/adts.html
.. [11] https://docs.swift.org/swift-book/LanguageGuide/Enumerations.html
.. [12] https://docs.python.org/3/library/enum.html
Copyright =========
This document is placed in the public domain. _______________________________________________ Typing-sig mailing list -- typing-sig@python.org To unsubscribe send an email to typing-sig-leave@python.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: n.a.sobolev@gmail.com