PEP 484 (Type Hints) -- second draft
Here's an updated draft for PEP 484. There's more to do (especially generic
types need much more thought and writing) but I've done a bunch of editing
and thinking so I think this is ready for another round of review.
Remember, for technical questions it's often best to use the GitHub tracker
for this PEP at https://github.com/ambv/typehinting/issues .
Significant changes in this draft:
- Define stubs.
- Define `@overload`.
- Describe `cast()`.
- Fix description of `Any`.
- Describe `Callable[..., t]`.
- Explain why `List[t]` instead of `List<t>`.
- Add section on rejected alternatives.
- Various other edits for clarity.
Incomplete list of TODOs:
- Define and explain generics.
- Covariance vs. contravariance (https://github.com/ambv/typehinting/issues/2)
- Other edits as indicated by FIXME comments.
- Other edits mentioned in https://github.com/ambv/typehinting/blob/master/README.rst .
Here's the full text of the PEP (hopefully it will appear soon at
https://www.python.org/dev/peps/pep-0484/):
PEP: 484
Title: Type Hints
Version: $Revision$
Last-Modified: $Date$
Author: Guido van Rossum
From the Python parser's perspective, the expression begins with the same four tokens (NAME, LESS, NAME, GREATER) as a chained comparison::

    a < b > c  # I.e., (a < b) and (b > c)

We can even make up an example that could be parsed both ways::

    a < b > [ c ]

Assuming we had angular brackets in the language, this could be interpreted as either of the following two::

    (a<b>)[c]      # I.e., (a<b>).__getitem__(c)
    a < b > ([c])  # I.e., (a < b) and (b > [c])

It would surely be possible to come up with a rule to disambiguate such cases, but to most users the rules would feel arbitrary and complex. It would also require us to dramatically change the CPython parser (and every other parser for Python). It should be noted that Python's current parser is intentionally "dumb" -- a simple grammar is easier for users to reason about.

For all these reasons, square brackets (e.g. ``List[int]``) are (and have long been) the preferred syntax for generic type parameters. They can be implemented by defining the ``__getitem__()`` method on the metaclass, and no new syntax is required at all. This option works in all recent versions of Python (starting with Python 2.2). Python is not alone in this syntactic choice -- generic classes in Scala also use square brackets.

What about existing uses of annotations?
----------------------------------------

One line of argument points out that PEP 3107 explicitly supports the use of arbitrary expressions in function annotations. The new proposal is then considered incompatible with the specification of PEP 3107.

Our response to this is that, first of all, the current proposal does not introduce any direct incompatibilities, so programs using annotations in Python 3.4 will still work correctly and without prejudice in Python 3.5.

We do hope that type hints will eventually become the sole use for annotations, but this will require additional discussion and a deprecation period after the initial roll-out of the typing module with Python 3.5. The current PEP will have provisional status (see PEP 411) until Python 3.6 is released.
The fastest conceivable scheme would introduce silent deprecation of non-type-hint annotations in 3.6, full deprecation in 3.7, and declare type hints as the only allowed use of annotations in Python 3.8. This should give authors of packages that use annotations plenty of time to devise another approach, even if type hints become an overnight success.

Another possible outcome would be that type hints will eventually become the default meaning for annotations, but that there will always remain an option to disable them. For this purpose the current proposal defines a decorator ``@no_type_check`` which disables the default interpretation of annotations as type hints in a given class or function. It also defines a meta-decorator ``@no_type_check_decorator`` which can be used to decorate a decorator (!), causing annotations in any function or class decorated with the latter to be ignored by the type checker.

There are also ``# type: ignore`` comments, and static checkers should support configuration options to disable type checking in selected packages.

Despite all these options, proposals have been circulated to allow type hints and other forms of annotations to coexist for individual arguments. One proposal suggests that if an annotation for a given argument is a dictionary literal, each key represents a different form of annotation, and the key ``'type'`` would be used for type hints. The problem with this idea and its variants is that the notation becomes very "noisy" and hard to read. Also, in most cases where existing libraries use annotations, there would be little need to combine them with type hints. So the simpler approach of selectively disabling type hints appears sufficient.

The problem of forward declarations
-----------------------------------

The current proposal is admittedly sub-optimal when type hints must contain forward references. Python requires all names to be defined by the time they are used.
Apart from circular imports this is rarely a problem: "use" here means "look up at runtime", and with most "forward" references there is no problem in ensuring that a name is defined before the function using it is called.

The problem with type hints is that annotations (per PEP 3107, and similar to default values) are evaluated at the time a function is defined, and thus any names used in an annotation must be already defined when the function is being defined. A common scenario is a class definition whose methods need to reference the class itself in their annotations. (More generally, it can also occur with mutually recursive classes.) This is natural for container types, for example::

    class Node:
        """Binary tree node."""

        def __init__(self, left: Node, right: Node):
            self.left = left
            self.right = right

As written this will not work, because of the peculiarity in Python that class names become defined once the entire body of the class has been executed. Our solution, which isn't particularly elegant, but gets the job done, is to allow using string literals in annotations. Most of the time you won't have to use this though -- most _uses_ of type hints are expected to reference builtin types or types defined in other modules.

A counterproposal would change the semantics of type hints so they aren't evaluated at runtime at all (after all, type checking happens off-line, so why would type hints need to be evaluated at runtime at all). This of course would run afoul of backwards compatibility, since the Python interpreter doesn't actually know whether a particular annotation is meant to be a type hint or something else.

The double colon
----------------

A few creative souls have tried to invent solutions for this problem. For example, it was proposed to use a double colon (``::``) for type hints, solving two problems at once: disambiguating between type hints and other annotations, and changing the semantics to preclude runtime evaluation.
There are several things wrong with this idea, however.

* It's ugly. The single colon in Python has many uses, and all of them look familiar because they resemble the use of the colon in English text. This is a general rule of thumb by which Python abides for most forms of punctuation; the exceptions are typically well known from other programming languages. But this use of ``::`` is unheard of in English, and in other languages (e.g. C++) it is used as a scoping operator, which is a very different beast. In contrast, the single colon for type hints reads natural -- and no wonder, since it was carefully designed for this purpose (the idea long predates PEP 3107 [gvr-artima]_). It is also used in the same fashion in other languages from Pascal to Swift.

* What would you do for return type annotations?

* It's actually a feature that type hints are evaluated at runtime.

  * Making type hints available at runtime allows runtime type checkers to be built on top of type hints.

  * It catches mistakes even when the type checker is not run. Since it is a separate program, users may choose not to run it (or even install it), but might still want to use type hints as a concise form of documentation. Broken type hints are no use even for documentation.

* Because it's new syntax, using the double colon for type hints would limit them to code that works with Python 3.5 only. By using existing syntax, the current proposal can easily work for older versions of Python 3. (And in fact mypy supports Python 3.2 and newer.)

* If type hints become successful we may well decide to add new syntax in the future to declare the type for variables, for example ``var age: int = 42``. If we were to use a double colon for argument type hints, for consistency we'd have to use the same convention for future syntax, perpetuating the ugliness.

Other forms of new syntax
-------------------------

A few other forms of alternative syntax have been proposed, e.g.
the introduction of a ``where`` keyword [roberge]_, and Cobra-inspired ``requires`` clauses. But these all share a problem with the double colon: they won't work for earlier versions of Python 3. The same would apply to a new ``__future__`` import.

Other backwards compatible conventions
--------------------------------------

The ideas put forward include:

* A decorator, e.g. ``@typehints(name=str, returns=str)``. This could work, but it's pretty verbose (an extra line, and the argument names must be repeated), and a far cry in elegance from the PEP 3107 notation.

* Stub files. We do want stub files, but they are primarily useful for adding type hints to existing code that doesn't lend itself to adding type hints, e.g. 3rd party packages, code that needs to support both Python 2 and Python 3, and especially extension modules. For most situations, having the annotations in line with the function definitions makes them much more useful.

* Docstrings. There is an existing convention for docstrings, based on the Sphinx notation (``:type arg1: description``). This is pretty verbose (an extra line per parameter), and not very elegant. We could also make up something new, but the annotation syntax is hard to beat (because it was designed for this very purpose).

It's also been proposed to simply wait another release. But what problem would that solve? It would just be procrastination.

Is Type Hinting Pythonic?
=========================

.. FIXME: Do we really need this section?

Type annotations provide important documentation for how a unit of code should be used. Programmers should therefore provide type hints on public APIs, namely argument and return types on functions and methods considered public. However, because types of local and global variables can often be inferred, they are rarely necessary.

The kind of information that type hints hold has always been possible to achieve by means of docstrings.
In fact, a number of formalized mini-languages for describing accepted arguments have evolved. Moving this information to the function declaration makes it more visible and easier to access both at runtime and by static analysis. Adding to that the notion that “explicit is better than implicit”, type hints are indeed *Pythonic*.

Acknowledgements
================

This document could not be completed without valuable input, encouragement and advice from Jim Baker, Jeremy Siek, Michael Matson Vitousek, Andrey Vlasovskikh, and Radomir Dopieralski.

Influences include existing languages, libraries and frameworks mentioned in PEP 482. Many thanks to their creators, in alphabetical order: Stefan Behnel, William Edwards, Greg Ewing, Larry Hastings, Anders Hejlsberg, Alok Menghrajani, Travis E. Oliphant, Joe Pamer, Raoul-Gabriel Urma, and Julien Verlaguet.

References
==========

.. [mypy] http://mypy-lang.org

.. [pyflakes] https://github.com/pyflakes/pyflakes/

.. [pylint] http://www.pylint.org

.. [gvr-artima] http://www.artima.com/weblogs/viewpost.jsp?thread=85551

.. [roberge] http://aroberge.blogspot.com/2015/01/type-hinting-in-python-focus-on.html

Copyright
=========

This document has been placed in the public domain.

..
   Local Variables:
   mode: indented-text
   indent-tabs-mode: nil
   sentence-end-double-space: t
   fill-column: 70
   coding: utf-8
   End:

--
--Guido van Rossum (python.org/~guido)
On Fri, Mar 20, 2015 at 09:59:32AM -0700, Guido van Rossum wrote:
Union types
-----------
[...]
As a shorthand for ``Union[T1, None]`` you can write ``Optional[T1]``;
That only saves three characters. Is it worth it?
An optional type is also automatically assumed when the default value is ``None``, for example::
def handle_employee(e: Employee = None): ...
Should that apply to all default values or just None? E.g. if I have
def spam(s: str = 23): ...
should that be inferred as Union[str, int] or be flagged as a type error? I think that we want a type error here, and it's only None that actually should be treated as special. Perhaps that should be made explicit in the PEP.
[...]
For the purposes of type hinting, the type checker assumes ``__debug__`` is set to ``True``, in other words the ``-O`` command-line option is not used while type checking.
I'm afraid I don't understand what you are trying to say here. I would have expected that __debug__ and -O and the type checker would be independent of each other. [...]
To mark portions of the program that should not be covered by type hinting, use the following:
* a ``@no_type_check`` decorator on classes and functions
* a ``# type: ignore`` comment on arbitrary lines
.. FIXME: should we have a module-wide comment as well?
I think so, if for no other reason than it will reduce the fear of some people that type checks will be mandatory.
Type Hints on Local and Global Variables
========================================
No first-class syntax support for explicitly marking variables as being of a specific type is added by this PEP. To help with type inference in complex cases, a comment of the following format may be used::
x = [] # type: List[Employee]
In the case where type information for a local variable is needed before it is declared, an ``Undefined`` placeholder might be used::
from typing import Undefined
x = Undefined # type: List[Employee]
y = Undefined(int)
How is that better than just bringing forward the variable declaration?
x = [] # type: List[Employee]
y = 0
Casts
=====
Occasionally the type checker may need a different kind of hint: the programmer may know that an expression is of a more constrained type than the type checker infers. For example::
from typing import List, cast

def find_first_str(a: List[object]) -> str:
    index = next(i for i, x in enumerate(a) if isinstance(x, str))
    # We only get here if there's at least one string in a
    return cast(str, a[index])
The type checker infers the type ``object`` for ``a[index]``, but we know that (if the code gets to that point) it must be a string. The ``cast(t, x)`` call tells the type checker that we are confident that the type of ``x`` is ``t``.
Is the type checker supposed to unconditionally believe the cast, or only if the cast is more constrained than the inferred type (like str and object, or bool and int)?
E.g. if the type checker infers int, and the cast says list, I'm not entirely sure I would trust the programmer more than the type checker.
My feeling here is that some type checkers will unconditionally trust the cast, and some will flag the mismatch, or offer a config option to swap between the two, and that will be a feature for type checkers to compete on.
I'm also going to bike-shed the order of arguments. It seems to me that we want to say:
cast(x, T) # pronounced "cast x to T"
rather than Yoda-speak "cast T x to we shall" *wink*. That also matches the order of isinstance(obj, type) calls and makes it easier to remember.
At runtime a cast always returns the expression unchanged -- it does not check the type, and it does not convert or coerce the value.
I'm a little concerned about cast() being a function. I know that it's a do-nothing function, but there's still the overhead of the name lookup and function call. It saddens me that giving a hint to the type checker has a runtime cost, small as it is.
(I know that *technically* annotations have a runtime cost too, but they're once-only, at function definition time, not every time you call the function.)
Your point below that cast() can be used inside expressions is a valid point, so there has to be a cast() function to support those cases, but for the example given here where the cast occurs in a return statement, wouldn't a type comment do?
return some_expression # type: T
hints that some_expression is to be treated as type T, regardless of what was inferred.
Casts differ from type comments (see the previous section). When using a type comment, the type checker should still verify that the inferred type is consistent with the stated type. When using a cast, the type checker trusts the programmer. Also, casts can be used in expressions, while type comments only apply to assignments.
Stub Files
==========
[...]
Stub files may use the ``.py`` extension or alternatively may use the ``.pyi`` extension. The latter makes it possible to maintain stub files in the same directory as the corresponding real module.
I don't like anything that could cause confusion between stub files and actual Python files. If we allow .py extension on stub files, I'm sure there will be confusing errors where people somehow manage to get the stub file imported instead of the actual module they want.
Is there any advantage to allowing stub files to use a .py extension? If not, then don't allow it.
-- Steve
On Saturday, March 21, 2015, Steven D'Aprano
On Fri, Mar 20, 2015 at 09:59:32AM -0700, Guido van Rossum wrote:
Union types
-----------
[...]
As a shorthand for ``Union[T1, None]`` you can write ``Optional[T1]``;
That only saves three characters. Is it worth it?
I think it is worth it because it will be a very common case, and the shortcut is more readable.
cheers,
Luciano
-- Luciano Ramalho | Author of Fluent Python (O'Reilly, 2015) | http://shop.oreilly.com/product/0636920032519.do | Professor em: http://python.pro.br | Twitter: @ramalhoorg
(This one took longer to respond to because there are so many nits. :-)
On Sat, Mar 21, 2015 at 8:37 AM, Steven D'Aprano
On Fri, Mar 20, 2015 at 09:59:32AM -0700, Guido van Rossum wrote:
Union types
-----------
[...]
As a shorthand for ``Union[T1, None]`` you can write ``Optional[T1]``;
That only saves three characters. Is it worth it?
Yes (see other messages in the thread).
An optional type is also automatically assumed when the default value is ``None``, for example::
def handle_employee(e: Employee = None): ...
Should that apply to all default values or just None? E.g. if I have
def spam(s: str = 23): ...
should that be inferred as Union[str, int] or be flagged as a type error? I think that we want a type error here, and it's only None that actually should be treated as special. Perhaps that should be made explicit in the PEP.
I only want this special treatment for None. The PEP seems to be pretty clear about this.
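To make the distinction concrete, here is a minimal sketch, assuming Python 3.5 or later with the typing module. The `Employee` class is a hypothetical stand-in for the one in the PEP text, and the comments describe what a PEP 484 checker would be expected to report under the rule above, not the output of any particular tool.

```python
from typing import Optional


class Employee:
    """Hypothetical stand-in for the Employee class in the PEP example."""


def handle_employee(e: Optional[Employee] = None) -> str:
    # Writing Optional[...] spells out what the None default implies.
    return "nobody" if e is None else "somebody"


def spam(s: str = 23) -> str:
    # A checker is expected to flag this default as a type error:
    # only a None default gets special treatment, so this is NOT
    # read as Union[str, int].
    # At runtime nothing stops it, since annotations are not enforced.
    return str(s)
```

At runtime both functions can be called without arguments; the difference only shows up when a static checker is run over the module.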
[...]
For the purposes of type hinting, the type checker assumes ``__debug__`` is set to ``True``, in other words the ``-O`` command-line option is not used while type checking.
I'm afraid I don't understand what you are trying to say here. I would have expected that __debug__ and -O and the type checker would be independent of each other.
Well, __debug__ tracks -O (__debug__ is false iff -O is given). But a type checker usually doesn't have the luxury to know whether the code is meant to run with -O or without it. So it assumes -O is *not* given, i.e. __debug__ is True. (You could try to type-check twice, once with -O and once without, but this idea doesn't really scale when you consider all the other flags that might be treated this way. And it's not really worth the trouble.)
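As an illustration of why the assumption matters, here is a small sketch with a branch guarded by `__debug__`. The function name `checks_enabled` is made up for this example. A checker following the rule above would analyze only the first branch as reachable, whichever way the program is eventually run.

```python
import sys


def checks_enabled() -> bool:
    # __debug__ is True in a normal run and False under -O / -OO.
    if __debug__:
        # The type checker assumes this branch is the one that runs.
        return True
    # Only reached when the interpreter was started with -O or -OO.
    return False


# sys.flags.optimize is 0 unless -O or -OO (or PYTHONOPTIMIZE) is in effect,
# so it mirrors the value of __debug__ at runtime.
optimized = sys.flags.optimize > 0
```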
[...]
To mark portions of the program that should not be covered by type hinting, use the following:
* a ``@no_type_check`` decorator on classes and functions
* a ``# type: ignore`` comment on arbitrary lines
.. FIXME: should we have a module-wide comment as well?
I think so, if for no other reason than it will reduce the fear of some people that type checks will be mandatory.
That fear ought to be reduced to zero by words in the PEP; have you got any suggestions? The mere presence of a module-wide comment to disable type checks might actually *increase* the fear that (absent such a comment) type checks might be mandatory. I could imagine other reasons for wanting a file-scoped directive, but I'm not sure -- there's a lot of discussion in https://github.com/ambv/typehinting/issues/35, maybe you can make sense of it.
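For reference, the two opt-outs that the draft already defines can be sketched as follows. The function names and the annotation values are invented for the example; `no_type_check` is the decorator from the typing module, and `# type: ignore` is the per-line comment from the quoted text.

```python
from typing import no_type_check


@no_type_check
def legacy_route(path: "GET /users", retries: 3 = 3):
    # The annotations here are not type hints (a PEP 3107-style use);
    # the decorator tells a checker to leave this function alone.
    return path, retries


def add_one(x: int) -> int:
    return x + 1


# A checker would normally reject passing a float here; the comment
# silences it for this one line. At runtime the call simply works.
result = add_one(2.5)  # type: ignore
```

Neither mechanism changes runtime behavior; both only affect what a static checker reports.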
Type Hints on Local and Global Variables
========================================
No first-class syntax support for explicitly marking variables as being of a specific type is added by this PEP. To help with type inference in complex cases, a comment of the following format may be used::
x = [] # type: List[Employee]
In the case where type information for a local variable is needed before it is declared, an ``Undefined`` placeholder might be used::
from typing import Undefined
x = Undefined # type: List[Employee]
y = Undefined(int)
How is that better than just bringing forward the variable declaration?
x = [] # type: List[Employee]
y = 0
The actual initialization might have to happen later, separately in different branches of an if-statement; or this might be a class variable. Jukka gave some more reasons for having Undefined in https://github.com/ambv/typehinting/issues/20
Casts
=====
Occasionally the type checker may need a different kind of hint: the programmer may know that an expression is of a more constrained type than the type checker infers. For example::
from typing import List, cast

def find_first_str(a: List[object]) -> str:
    index = next(i for i, x in enumerate(a) if isinstance(x, str))
    # We only get here if there's at least one string in a
    return cast(str, a[index])
The type checker infers the type ``object`` for ``a[index]``, but we know that (if the code gets to that point) it must be a string. The ``cast(t, x)`` call tells the type checker that we are confident that the type of ``x`` is ``t``.
Is the type checker supposed to unconditionally believe the cast, or only if the cast is more constrained than the inferred type (like str and object, or bool and int)?
E.g. if the type checker infers int, and the cast says list, I'm not entirely sure I would trust the programmer more than the type checker.
My feeling here is that some type checkers will unconditionally trust the cast, and some will flag the mismatch, or offer a config option to swap between the two, and that will be a feature for type checkers to compete on.
It should unconditionally believe the cast. (Reference: https://github.com/ambv/typehinting/issues/15#issuecomment-69136820)
I'm also going to bike-shed the order of arguments. It seems to me that we want to say:
cast(x, T) # pronounced "cast x to T"
rather than Yoda-speak "cast T x to we shall" *wink*. That also matches the order of isinstance(obj, type) calls and makes it easier to remember.
Seems you're not alone here. :-) I've opened https://github.com/ambv/typehinting/issues/63
At runtime a cast always returns the expression unchanged -- it does not check the type, and it does not convert or coerce the value.
I'm a little concerned about cast() being a function. I know that it's a do-nothing function, but there's still the overhead of the name lookup and function call. It saddens me that giving a hint to the type checker has a runtime cost, small as it is.
(I know that *technically* annotations have a runtime cost too, but they're once-only, at function definition time, not every time you call the function.)
Your point below that cast() can be used inside expressions is a valid point, so there has to be a cast() function to support those cases, but for the example given here where the cast occurs in a return statement, wouldn't a type comment do?
return some_expression # type: T
hints that some_expression is to be treated as type T, regardless of what was inferred.
That has slightly different semantics -- while the type checker should unconditionally believe cast(), # type: comments are required to be consistent. In the example from the PEP (find_first_str()) the cast() is required, since the derived type is object.
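A short sketch of the runtime side of this, using `cast(t, x)` in the argument order of the current draft (which is also what the typing module ships). The second call is deliberately wrong to show that nothing is checked or converted at runtime; the variable name `not_really_a_str` is invented for the example.

```python
from typing import List, cast


def find_first_str(a: List[object]) -> str:
    index = next(i for i, x in enumerate(a) if isinstance(x, str))
    # The checker infers object for a[index]; the cast overrides that.
    return cast(str, a[index])


first = find_first_str([1, 2.0, "three", "four"])

# cast() returns its second argument unchanged: no check, no conversion.
# A checker believes this is a str, but at runtime it is still an int.
not_really_a_str = cast(str, 42)
```

This is why a cast is the stronger tool: a `# type:` comment has to agree with the inferred type, while a cast replaces it.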
Casts differ from type comments (see the previous section). When using a type comment, the type checker should still verify that the inferred type is consistent with the stated type. When using a cast, the type checker trusts the programmer. Also, casts can be used in expressions, while type comments only apply to assignments.
Stub Files
==========
[...]
Stub files may use the ``.py`` extension or alternatively may use the ``.pyi`` extension. The latter makes it possible to maintain stub files in the same directory as the corresponding real module.
I don't like anything that could cause confusion between stub files and actual Python files. If we allow .py extension on stub files, I'm sure there will be confusing errors where people somehow manage to get the stub file imported instead of the actual module they want.
Is there any advantage to allowing stub files to use a .py extension? If not, then don't allow it.
I'm tracking this at https://github.com/ambv/typehinting/issues/64 now.
--
--Guido van Rossum (python.org/~guido)
participants (3)
- Guido van Rossum
- Luciano Ramalho
- Steven D'Aprano