Re: [Python-Dev] Status of PEP 484 and the typing module
Hi Mark,
We're down to the last few items here. I'm CC'ing python-dev so folks can see how close we are. I'll answer point by point.
On Thu, May 21, 2015 at 6:24 AM, Mark Shannon wrote:
Hi,
The PEP itself is looking fairly good.
I hope you'll accept it at least provisionally so we can iterate over the finer points while a prototype of typing.py is in beta 1.
However, I don't think that typing.py is ready yet, for a number of reasons:
1. As I've said before, there needs to be a distinction between classes and types. There is no need for Any, Generic, Generic's subtypes, or Union to subclass builtins.type.
I strongly disagree. They can appear in many positions where real classes are acceptable, in particular annotations can have classes (e.g. int) or types (e.g. Union[int, str]).
Playing around with typing.py, it has also become clear to me that it is also important to distinguish type constructors from types.
What do I mean by a type constructor? A type constructor makes types. "List" is an example of a type constructor. It constructs types such as List[T] and List[int]. Saying that something is a List (as opposed to a list) should be rejected.
The PEP actually says that plain List (etc.) is equivalent to List[Any]. (Well, at least that's the intention; it's implied by the section about the equivalence between Node() and Node[Any]().)
2. Usability of typing as it stands:
Let's try to make a class that implements a mutable mapping.
    >>> import typing as tp
    #Make some variables.
    >>> T = tp.TypeVar('T')
    >>> K = tp.TypeVar('K')
    >>> V = tp.TypeVar('V')
#Then make our class:
    >>> class MM(tp.MutableMapping): pass
    ...
    #Oh that worked, but it shouldn't. MutableMapping is a type constructor.
It means MutableMapping[Any].
#Let's make one
    >>> MM()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 1095, in __new__
        if _gorg(c) is Generic:
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 887, in _gorg
        while a.__origin__ is not None:
    AttributeError: type object 'Sized' has no attribute '__origin__'
# ???
Sorry, that's a bug I introduced in literally the last change to typing.py. I will fix it. The expected behavior is TypeError: Can't instantiate abstract class MM with abstract methods __len__
#Well let's try using type variables.
    >>> class MM2(tp.MutableMapping[K, V]): pass
    ...
    >>> MM2()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 1095, in __new__
        if _gorg(c) is Generic:
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 887, in _gorg
        while a.__origin__ is not None:
    AttributeError: type object 'Sized' has no attribute '__origin__'
Ditto, and sorry.
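A minimal sketch, not from the original thread, of the behavior described as expected above: subclassing the parametrized MutableMapping without implementing the abstract methods should fail at instantiation with a TypeError, while a subclass that supplies them works. The class names `Incomplete` and `DictBacked` are hypothetical, and the sketch assumes a typing module in which the `_gorg` bug shown in the traceback has been fixed.

```python
from typing import Iterator, MutableMapping, TypeVar

K = TypeVar('K')
V = TypeVar('V')


class Incomplete(MutableMapping[K, V]):
    """Declares the generic base but implements none of the abstract methods."""


class DictBacked(MutableMapping[K, V]):
    """Implements the five abstract methods by delegating to a plain dict."""

    def __init__(self) -> None:
        self._data = {}

    def __getitem__(self, key: K) -> V:
        return self._data[key]

    def __setitem__(self, key: K, value: V) -> None:
        self._data[key] = value

    def __delitem__(self, key: K) -> None:
        del self._data[key]

    def __iter__(self) -> Iterator[K]:
        return iter(self._data)

    def __len__(self) -> int:
        return len(self._data)


try:
    Incomplete()
except TypeError as exc:
    # The abstract-method check rejects the incomplete subclass.
    print("as expected:", exc)

m = DictBacked()
m["x"] = 1
print(dict(m))
```

Because `DictBacked` does not inherit from dict, it avoids the metaclass concerns raised about 'Dict' in the next paragraph.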
At this point, we have to resort to using 'Dict', which forces us to subclass 'dict' which may not be what we want as it may cause metaclass conflicts.
3. Memory consumption is also a worry. There is no caching, which means every time I use "List[int]" as an annotation, a new class object is created. Each class may only be a few KB, but collectively this could easily add up to several MBs. This should be easy to fix.
I can work on this after the beta-1 release. Until then, type aliases can be used to avoid redundant type creation (and often they are clearer anyway :-).
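A short sketch of the type-alias workaround mentioned above, under the assumption that each subscription such as List[int] may build a new object: the alias is created once and every annotation then refers to that single object. The names `IntList`, `Scores`, `total` and `longest` are illustrative only.

```python
from typing import Dict, List

# Each alias is evaluated once, at import time.
IntList = List[int]
Scores = Dict[str, IntList]


def total(scores: Scores) -> int:
    """Sum every value across all score lists."""
    return sum(sum(values) for values in scores.values())


def longest(scores: Scores) -> IntList:
    """Return the longest score list, or an empty list if there are none."""
    return max(scores.values(), key=len, default=[])


print(total({"a": [1, 2], "b": [3]}))
# Both functions share the very same annotation object.
print(total.__annotations__["scores"] is longest.__annotations__["scores"])
```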
4. PY2, etc. really need to go. Assuming that this code type checks OK:
    if typing.PY2:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
Is the checker supposed to pass this:
    if sys.hexversion < 0x03000000:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
If it should pass, then why have PY2, etc. at all. If it should fail, well that is just stupid and annoying.
Pylint already understands version checks, as does our (Semmle's) checker. I suspect most IDEs do as well.
I have to negotiate this with Jukka but I think he'll agree.
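For illustration, the ordinary version checks that tools such as Pylint are said above to understand can be written with the sys module alone. This sketch only shows that the common spellings agree on the running interpreter; the function name `describe_branch` is hypothetical.

```python
import sys

# Three common spellings of "is this Python 2?" using only the sys module.
is_py2_by_hexversion = sys.hexversion < 0x03000000
is_py2_by_version_info = sys.version_info < (3,)
is_py2_by_major = sys.version_info[0] < 3


def describe_branch() -> str:
    """Pick a branch the way version-dependent code usually does."""
    if sys.version_info >= (3,):
        return "py3 branch"
    return "py2 branch"


print(is_py2_by_hexversion, is_py2_by_version_info, is_py2_by_major)
print(describe_branch())
```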
5. Removing isinstance() support:
As I said before, this is the job of a checker not typing.py.
It also introduces some strange situations:
    D = tp.Dict[str,int]
    d = {}
    assert isinstance(d, D)
    d["x"] = None
    assert isinstance(d, D)
In the above case the first check passes, and the second fails. But d is either of type D or it isn't. It can't be both, as types are static properties of programs, unlike classes.
Well, isinstance() is a dynamic function. The type checker has no authority over its behavior beyond its signature.
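The disagreement above can be made concrete with a plain runtime check that does not rely on typing.py at all. The helper `looks_like_str_int_dict` is hypothetical and only approximates what an isinstance() check against Dict[str, int] might do: its answer depends on the current contents of the object, so it changes as the dict is mutated.

```python
from typing import Any


def looks_like_str_int_dict(obj: Any) -> bool:
    """Hypothetical dynamic check: a dict whose keys are str and values are int right now."""
    return isinstance(obj, dict) and all(
        isinstance(k, str) and isinstance(v, int) for k, v in obj.items()
    )


d = {}
before = looks_like_str_int_dict(d)  # empty dict: vacuously conforms
d["x"] = None
after = looks_like_str_int_dict(d)   # the value None is not an int
print(before, after)
```

A static checker, by contrast, would assign d one type for the whole program, which is the point being argued from the other side.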
And it's broken anyway:
    >>> D = tp.Dict[str,'D']
    >>> d = {"x": {}}
    >>> isinstance(d, D)
    False
That's because _ForwardRef doesn't implement __instancecheck__ or __subclasscheck__. It's easily fixed.
Realistically, I don't see typing.py being ready in time for 3.5. I'd be happy to be proved wrong.
Cheers, Mark.
P.S. I am worried by the lack of formal specification. It all seems a bit hand-waving. A formal spec reduces the likelihood of some unforeseen corner case being a permanent wart.
Formal specs are not my cup of tea. :-( (I'm not proud of this, but it just is a fact -- see how terrible a job I've done of the Python reference manual.) The best I could come up with is PEP 483.
Take the recursive type above. There is no mention of recursive types in the PEP and they are clearly possible. Are they allowed?
They should be allowed. I imagine you could create one for which a naive isinstance() implementation ends up in an infinite loop. That can be fixed too (we fixed this for printing self-referential lists and dicts).
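One way such a loop could be avoided, sketched here under the assumption that the check is a hand-written structural test rather than anything in typing.py: remember which objects are currently being checked, in the same spirit as the guard used when printing self-referential containers. The function `is_nested_str_dict` is hypothetical and models the recursive type D = Dict[str, 'D'] from the example above.

```python
from typing import Any, Optional, Set


def is_nested_str_dict(obj: Any, _active: Optional[Set[int]] = None) -> bool:
    """Hypothetical check for D = Dict[str, 'D']: str keys, values that are again D."""
    if not isinstance(obj, dict):
        return False
    active = set() if _active is None else _active
    if id(obj) in active:
        # Already being checked further up the call stack: treat the cycle
        # as consistent instead of recursing forever.
        return True
    active.add(id(obj))
    try:
        return all(
            isinstance(key, str) and is_nested_str_dict(value, active)
            for key, value in obj.items()
        )
    finally:
        active.discard(id(obj))


cyclic = {}
cyclic["self"] = cyclic  # a dict that contains itself
print(is_nested_str_dict({"x": {}}), is_nested_str_dict(cyclic))
```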
I'm guessing that Jukka's thesis should cover a lot of this. Has it been published yet?
Hopefully Jukka can answer that. :-) -- --Guido van Rossum (python.org/~guido)
On 21/05/15 16:01, Guido van Rossum wrote:
Hi Mark,
We're down to the last few items here. I'm CC'ing python-dev so folks can see how close we are. I'll answer point by point.
On Thu, May 21, 2015 at 6:24 AM, Mark Shannon <mark@hotpy.org> wrote:
Hi,
The PEP itself is looking fairly good.
I hope you'll accept it at least provisionally so we can iterate over the finer points while a prototype of typing.py is in beta 1.
However, I don't think that typing.py is ready yet, for a number of reasons:
1. As I've said before, there needs to be a distinction between classes and types. There is no need for Any, Generic, Generic's subtypes, or Union to subclass builtins.type.
I strongly disagree. They can appear in many positions where real classes are acceptable, in particular annotations can have classes (e.g. int) or types (e.g. Union[int, str]).
Why does this mean that they have to be classes? Annotations can be any object.
It might help to think, not in terms of types being classes, but of classes being shorthand for the nominal type for that class (from the point of view of the checker and type geeks). So when the checker sees 'int' it treats it as Type(int).
Subtyping is distinct from subclassing; Type(int) <: Union[Type(int), Type(str)] has no parallel in subclassing. There is no class that corresponds to a Union, Any or a Generic.
In order to support the
    class C(ParameterType[T]): pass
syntax, parametric types do indeed need to be classes, but Python has multiple inheritance, so that's not a problem:
    class ParameterType(type, Type): ...
Otherwise typing.Types shouldn't be builtin.types and vice versa.
I think a lot of the issues on the tracker would not have been issues had the distinction been more clearly enforced.
Playing around with typing.py, it has also become clear to me that it is also important to distinguish type constructors from types.
What do I mean by a type constructor? A type constructor makes types. "List" is an example of a type constructor. It constructs types such as List[T] and List[int]. Saying that something is a List (as opposed to a list) should be rejected.
The PEP actually says that plain List (etc.) is equivalent to List[Any]. (Well, at least that's the intention; it's implied by the section about the equivalence between Node() and Node[Any]().)
Perhaps we should change that. Using 'List', rather than 'list' or 'List[Any]' suggests an error, or misunderstanding, to me. Is there a use case where 'List' is needed, and 'list' will not suffice? I'm assuming that the type checker knows that 'list' is a MutableSequence.
2. Usability of typing as it stands:
Let's try to make a class that implements a mutable mapping.
    >>> import typing as tp
    #Make some variables.
    >>> T = tp.TypeVar('T')
    >>> K = tp.TypeVar('K')
    >>> V = tp.TypeVar('V')
#Then make our class:
    >>> class MM(tp.MutableMapping): pass
    ...
    #Oh that worked, but it shouldn't. MutableMapping is a type constructor.
It means MutableMapping[Any].
#Let's make one
    >>> MM()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 1095, in __new__
        if _gorg(c) is Generic:
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 887, in _gorg
        while a.__origin__ is not None:
    AttributeError: type object 'Sized' has no attribute '__origin__'
# ???
Sorry, that's a bug I introduced in literally the last change to typing.py. I will fix it. The expected behavior is
TypeError: Can't instantiate abstract class MM with abstract methods __len__
#Well let's try using type variables.
    >>> class MM2(tp.MutableMapping[K, V]): pass
    ...
    >>> MM2()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 1095, in __new__
        if _gorg(c) is Generic:
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 887, in _gorg
        while a.__origin__ is not None:
    AttributeError: type object 'Sized' has no attribute '__origin__'
Ditto, and sorry.
No need to apologise, I'm just a bit worried about how easy it was for me to expose this sort of bug.
At this point, we have to resort to using 'Dict', which forces us to subclass 'dict' which may not be what we want as it may cause metaclass conflicts.
3. Memory consumption is also a worry. There is no caching, which means every time I use "List[int]" as an annotation, a new class object is created. Each class may only be a few KB, but collectively this could easily add up to several MBs. This should be easy to fix.
I can work on this after the beta-1 release. Until then, type aliases can be used to avoid redundant type creation (and often they are clearer anyway :-).
Sure.
4. PY2, etc. really need to go. Assuming that this code type checks OK:
    if typing.PY2:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
Is the checker supposed to pass this:
    if sys.hexversion < 0x03000000:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
If it should pass, then why have PY2, etc. at all. If it should fail, well that is just stupid and annoying.
Pylint already understands version checks, as does our (Semmle's) checker. I suspect most IDEs do as well.
I have to negotiate this with Jukka but I think he'll agree.
5. Removing isinstance() support:
As I said before, this is the job of a checker not typing.py.
It also introduces some strange situations:
    D = tp.Dict[str,int]
    d = {}
    assert isinstance(d, D)
    d["x"] = None
    assert isinstance(d, D)
In the above case the first check passes, and the second fails. But d is either of type D or it isn't. It can't be both, as types are static properties of programs, unlike classes.
Well, isinstance() is a dynamic function. The type checker has no authority over its behavior beyond its signature.
And it's broken anyway:
    >>> D = tp.Dict[str,'D']
    >>> d = {"x": {}}
    >>> isinstance(d, D)
    False
That's because _ForwardRef doesn't implement __instancecheck__ or __subclasscheck__. It's easily fixed.
Realistically, I don't see typing.py being ready in time for 3.5. I'd be happy to be proved wrong.
Cheers, Mark.
P.S. I am worried by the lack of formal specification. It all seems a bit hand-waving. A formal spec reduces the likelihood of some unforeseen corner case being a permanent wart.
Formal specs are not my cup of tea. :-( (I'm not proud of this, but it just is a fact -- see how terrible a job I've done of the Python reference manual.) The best I could come up with is PEP 483.
Take the recursive type above. There is no mention of recursive types in the PEP and they are clearly possible. Are they allowed?
They should be allowed. I imagine you could create one for which a naive isinstance() implementation ends up in an infinite loop. That can be fixed too (we fixed this for printing self-referential lists and dicts).
I'm guessing that Jukka's thesis should cover a lot of this. Has it been published yet?
Hopefully Jukka can answer that. :-)
-- --Guido van Rossum (python.org/~guido)
Things are looking up. I think we're down to a very small number of issues where we still disagree -- hopefully you'll allow me some leeway. :-)
On Thu, May 21, 2015 at 8:45 AM, Mark Shannon wrote:
On 21/05/15 16:01, Guido van Rossum wrote:
Hi Mark,
We're down to the last few items here. I'm CC'ing python-dev so folks can see how close we are. I'll answer point by point.
On Thu, May 21, 2015 at 6:24 AM, Mark Shannon <mark@hotpy.org> wrote:
Hi,
The PEP itself is looking fairly good.
I hope you'll accept it at least provisionally so we can iterate over the finer points while a prototype of typing.py is in beta 1.
However, I don't think that typing.py is ready yet, for a number of reasons:
1. As I've said before, there needs to be a distinction between classes and types. There is no need for Any, Generic, Generic's subtypes, or Union to subclass builtins.type.
I strongly disagree. They can appear in many positions where real classes are acceptable, in particular annotations can have classes (e.g. int) or types (e.g. Union[int, str]).
Why does this mean that they have to be classes? Annotations can be any object.
I want to encourage users to think about annotations as types, and for most users the distinction between type and class is too subtle, so a simpler rule is to say they are classes. This works out nicely when the annotations are simple types such as 'int' or 'str' or user-defined classes (e.g. 'Employee').
It might help to think, not in terms of types being classes, but of classes being shorthand for the nominal type for that class (from the point of view of the checker and type geeks). So when the checker sees 'int' it treats it as Type(int).
I'm fine with that being the formal interpretation (except that I don't want to introduce a function named Type()). But it's too subtle for most users.
Subtyping is distinct from subclassing; Type(int) <: Union[Type(int), Type(str)] has no parallel in subclassing. There is no class that corresponds to a Union, Any or a Generic.
Again, for most people the distinction is too subtle. People expect to be able to play around with things interactively. I think it will be helpful if they can experiment with the objects exported by typing too: experimenting with things like isinstance(42, Union[int, str]) or issubclass(Any, Employee) and issubclass(Employee, Any) is a useful way to explore how these things work (always with the caveat that when Any is involved, issubclass is not transitive). Of course it won't work when they advance to type variables -- at that point you just *have* to understand the theory and switch from using the interactive interpreter to writing small test programs and seeing how mypy (or some other checker) responds to them.
In order to support the class C(ParameterType[T]): pass
I presume you mean class C(Generic[T])?
syntax, parametric types do indeed need to be classes, but Python has multiple inheritance, so that's not a problem:
    class ParameterType(type, Type): ...
Otherwise typing.Types shouldn't be builtin.types and vice versa.
There's one thing here that Jukka has convinced me of. While I really want Union[...] to act like a class (though not subclassable!), plain Union (without the [...]) needn't. The same is true for Callable and Tuple without [...]. I've filed https://github.com/ambv/typehinting/issues/133 for this. I'm not sure how much work it will be to fix this but I don't think it absolutely needs to be done in beta 1 -- there's not much you can do with them anyway.
I think a lot of the issues on the tracker would not have been issues had the distinction been more clearly enforced.
Playing around with typing.py, it has also become clear to me that it is also important to distinguish type constructors from types.
What do I mean by a type constructor? A type constructor makes types. "List" is an example of a type constructor. It constructs types such as List[T] and List[int]. Saying that something is a List (as opposed to a list) should be rejected.
The PEP actually says that plain List (etc.) is equivalent to List[Any]. (Well, at least that's the intention; it's implied by the section about the equivalence between Node() and Node[Any]().)
Perhaps we should change that. Using 'List', rather than 'list' or 'List[Any]' suggests an error, or misunderstanding, to me.
Is there a use case where 'List' is needed, and 'list' will not suffice? I'm assuming that the type checker knows that 'list' is a MutableSequence.
I think it's easier if we ask people to always write 'List' rather than 'list' when they are talking about types, and 'List[Any]' will probably be a popular type (lots of people don't want to think about exactly what the item type is, but they are sure that the container is a list). There's also an argument from consistency with the collection ABCs. As you know, typing defines a bunch of types that act as "stand ins" for the corresponding ABCs defined in collections.abc (Iterable, Sequence, Sized, etc.). The intention here is that anywhere one of the collection ABCs is valid it should be okay to use the corresponding class imported from typing -- so that if you have code that currently uses "from collections.abc import Sequence, Mapping" you can just replace that with "from typing import Sequence, Mapping" and your code will still work. (You can then iterate at leisure on parametrizing the types.) So we can use e.g. Sequence as a base class and it means the same as Sequence[Any]. Given this rule, it would be somewhat surprising if you couldn't use List but were forced to write List[Any] in other places. (Neither Sequence[Any] nor List[Any] can be instantiated so that's not a concern.)
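A small sketch of the "stand in" rule described above, assuming a typing module in which the bare Sequence can be used as a base class: the only change from an ABC-based version is the import line, and the resulting class is still recognized by the collections.abc ABC. The class `Countdown` is a hypothetical example.

```python
import collections.abc
from typing import Sequence  # was: from collections.abc import Sequence


class Countdown(Sequence):
    """Read-only sequence n-1, ..., 0; the base supplies __iter__, __contains__, index, count."""

    def __init__(self, n: int) -> None:
        self._n = n

    def __len__(self) -> int:
        return self._n

    def __getitem__(self, index: int) -> int:
        # Only non-negative integer indexes are supported in this sketch.
        if not 0 <= index < self._n:
            raise IndexError(index)
        return self._n - 1 - index


c = Countdown(3)
print(list(c), len(c), 0 in c)
print(isinstance(c, collections.abc.Sequence))
```

Parametrizing later would mean changing only the base to something like Sequence[int], which matches the "iterate at leisure" remark.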
2. Usability of typing as it stands:
Let's try to make a class that implements a mutable mapping.
    >>> import typing as tp
    #Make some variables.
    >>> T = tp.TypeVar('T')
    >>> K = tp.TypeVar('K')
    >>> V = tp.TypeVar('V')
#Then make our class:
    >>> class MM(tp.MutableMapping): pass
    ...
    #Oh that worked, but it shouldn't. MutableMapping is a type constructor.
It means MutableMapping[Any].
#Let's make one
    >>> MM()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 1095, in __new__
        if _gorg(c) is Generic:
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 887, in _gorg
        while a.__origin__ is not None:
    AttributeError: type object 'Sized' has no attribute '__origin__'
# ???
Sorry, that's a bug I introduced in literally the last change to typing.py. I will fix it. The expected behavior is
TypeError: Can't instantiate abstract class MM with abstract methods __len__
#Well let's try using type variables.
    >>> class MM2(tp.MutableMapping[K, V]): pass
    ...
    >>> MM2()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 1095, in __new__
        if _gorg(c) is Generic:
      File "/home/mark/repositories/typehinting/prototyping/typing.py", line 887, in _gorg
        while a.__origin__ is not None:
    AttributeError: type object 'Sized' has no attribute '__origin__'
Ditto, and sorry.
No need to apologise, I'm just a bit worried about how easy it was for me to expose this sort of bug.
Well, I'm just glad you exposed it so soon after I introduced it. :-)
At this point, we have to resort to using 'Dict', which forces us to subclass 'dict' which may not be what we want as it may cause metaclass conflicts.
3. Memory consumption is also a worry. There is no caching, which means every time I use "List[int]" as an annotation, a new class object is created. Each class may only be a few KB, but collectively this could easily add up to several MBs. This should be easy to fix.
I can work on this after the beta-1 release. Until then, type aliases can be used to avoid redundant type creation (and often they are clearer anyway :-).
Sure.
OK. Tracking this in https://github.com/ambv/typehinting/issues/130 [...] -- --Guido van Rossum (python.org/~guido)
At Thu May 21 22:27:50 CEST 2015, Guido wrote:
I want to encourage users to think about annotations as types, and for most users the distinction between type and class is too subtle,
So what is the distinction that you are trying to make? That a type refers to a variable (name), and a class refers to a piece of data (object) that might be bound to that name?
Whatever the intended distinction is, please be explicit in the PEP, even if you decide to paper it over in normal code. For example, the above distinction would help to explain why the typing types can't be directly instantiated, since they aren't meant to refer to specific data. (They can still be used as superclasses because practicality beats purity, and using them as a marker base class is practical.)
-jJ
--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ
On Fri, May 22, 2015 at 10:23 AM, Jim J. Jewett wrote:
At Thu May 21 22:27:50 CEST 2015, Guido wrote:
I want to encourage users to think about annotations as types, and for most users the distinction between type and class is too subtle,
So what is the distinction that you are trying to make?
That a type refers to a variable (name), and a class refers to a piece of data (object) that might be bound to that name?
Sort of. But really a type is something in the mind of the type checker (or the programmer) while the class is a concept that can be inspected at runtime.
Whatever the intended distinction is, please be explicit in the PEP, even if you decide to paper it over in normal code. For example, the above distinction would help to explain why the typing types can't be directly instantiated, since they aren't meant to refer to specific data. (They can still be used as superclasses because practicality beats purity, and using them as a marker base class is practical.)
There will have to be documentation and tutorials beyond the PEP. The PEP mostly defines a standard to be used by people implementing type checkers. -- --Guido van Rossum (python.org/~guido)
Mark Shannon wrote:
PY2, etc. really need to go. Assuming that this code type checks OK:
    if typing.PY2:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
Is the checker supposed to pass this:
    if sys.hexversion < 0x03000000:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
If it should pass, then why have PY2, etc. at all.
My immediate response was that there really is a difference, when doing the equivalent of cross-compilation. It would help to make this explicit in the PEP. But ...
If it should fail, well that is just stupid and annoying.
so I'm not sure regular authors (as opposed to typing tools) would ever have reason to use it, and making stub files more different from regular python creates an attractive nuisance bigger than the clarification.
So in the end, I believe PY2 should merely be part of the calling convention for type tools, and that may not be worth standardizing yet. It *is* worth explaining why they were taken out, though.
And it is worth saying explicitly that typing tools should override the sys module when checking for non-native environments.
-jJ
--
If there are still threading problems with my replies, please email me with details, so that I can try to resolve them. -jJ
On Fri, May 22, 2015 at 9:45 AM, Jim J. Jewett wrote:
Mark Shannon wrote:
PY2, etc. really need to go. Assuming that this code type checks OK:
    if typing.PY2:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
Is the checker supposed to pass this:
    if sys.hexversion < 0x03000000:
        type_safe_under_py2_only()
    else:
        type_safe_under_py3_only()
If it should pass, then why have PY2, etc. at all.
My immediate response was that there really is a difference, when doing the equivalent of cross-compilation. It would help to make this explicit in the PEP.
That seems obvious. There's no reason why a type checker should care about what sys.*version* is in the process that runs the type checker (that process may not even be a Python interpreter).
But ...
If it should fail, well that is just stupid and annoying.
so I'm not sure regular authors (as opposed to typing tools) would ever have reason to use it, and making stub files more different from regular python creates an attractive nuisance bigger than the clarification.
So in the end, I believe PY2 should merely be part of the calling convention for type tools, and that may not be worth standardizing yet. It *is* worth explaining why they were taken out, though.
Because there is no advantage (either to the user or to the type checker) of using e.g. typing.WINDOWS instead of using sys.platform == "win32".
And it is worth saying explicitly that typing tools should override the sys module when checking for non-native environments.
OK, I am saying it here. People writing type checkers can decide for themselves what they want to support. (It is already the case that mypy can check code for conformance with various Python versions, but mypy itself must always run in Python 3.4 or later.) -- --Guido van Rossum (python.org/~guido)
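To illustrate what "overriding the sys module" could look like inside a checker, here is a rough sketch that evaluates a guard expression against a described target environment instead of the interpreter running the tool. The helpers `fake_sys` and `branch_taken` are hypothetical and are not taken from mypy or any other checker; eval() is used only to keep the sketch short, where a real tool would more likely inspect the syntax tree.

```python
from types import SimpleNamespace
from typing import Tuple


def fake_sys(version: Tuple[int, int, int], platform: str) -> SimpleNamespace:
    """Build a stand-in for the sys module describing the target environment."""
    major, minor, micro = version
    # Same bit layout as sys.hexversion; 0xF0 marks a final release, serial 0.
    hexversion = (major << 24) | (minor << 16) | (micro << 8) | 0xF0
    return SimpleNamespace(version_info=version, hexversion=hexversion, platform=platform)


def branch_taken(condition: str, target: SimpleNamespace) -> bool:
    """Evaluate a guard such as 'sys.hexversion < 0x03000000' against the target."""
    return bool(eval(condition, {"__builtins__": {}}, {"sys": target}))


py27_linux = fake_sys((2, 7, 18), "linux2")
py34_windows = fake_sys((3, 4, 3), "win32")

print(branch_taken("sys.hexversion < 0x03000000", py27_linux))
print(branch_taken("sys.version_info >= (3,)", py34_windows))
print(branch_taken('sys.platform == "win32"', py34_windows))
```

With this approach the same source file can be checked once per target, and ordinary sys-based guards carry all the information that flags like typing.PY2 or typing.WINDOWS would have provided.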
participants (3)
- Guido van Rossum
- Jim J. Jewett
- Mark Shannon