Implementation work for PEP 585 (Type Hinting Generics In Standard Collections)
I'm beginning implementation work for PEP 585. I've got a little pure Python class that I'll use to guide me. (See attached proxy.py.) Then I'll start the work, roughly like this:
- Create a cpython fork (gvanrossum/pep585) in which to do the work.
- Write a simple _Py_Class_GetItem(origin, parameters) function that just returns origin (after INCREF).
- Add that function to list, tuple and dict to show it works.
- Write some tests.
- Write a simple Proxy object to hold origin and parameters and return it from _Py_Class_GetItem.
- Update tests.
- Add __repr__ implementation and getattr forwarding functionality.
- Update tests.
- Deploy more widely to other builtin classes (set, frozenset, type).
- Update tests.
- Add specific methods that are supposed to fail (__class_getitem__, __instancecheck__, __subclasscheck__).
- Update tests.
At this point we're ready to start using the proxy class in typing.py, instead of the current implementation of _GenericAlias. We'll need to debate whether we *really* need to keep all the error checks we currently have in typing.py (usually these will also be detected by running a static type checker, so there's little harm in allowing nonsense like `list[1, 2, 3]`). Alternatively I'll need volunteers to implement those checks.
I expect the implementation work to result in proposed improvements to PEP 585. (I already have some -- I don't think `isinstance(x, list[int])` should work, and I don't think we should strive for `list[int] == list`.)
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
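The plan above can be sketched in pure Python. This is only an illustration of the described steps (hold origin and parameters, provide a repr, forward attribute access, reject isinstance/issubclass); it is not the attached proxy.py, and the class body and helper names here are assumptions.

```python
class Proxy:
    """Hypothetical stand-in for the object that list[int] would return."""

    def __init__(self, origin, parameters):
        # Normalize a single parameter to a 1-tuple, as subscription does.
        if not isinstance(parameters, tuple):
            parameters = (parameters,)
        self.__origin__ = origin
        self.__parameters__ = parameters

    def __repr__(self):
        def name(p):
            # Classes print by name; anything else falls back to repr().
            return p.__name__ if isinstance(p, type) else repr(p)
        args = ", ".join(name(p) for p in self.__parameters__)
        return f"{self.__origin__.__name__}[{args}]"

    def __call__(self, *args, **kwargs):
        # Type erasure: calling the proxy creates a plain origin instance.
        return self.__origin__(*args, **kwargs)

    def __getattr__(self, attr):
        # Only called when normal lookup fails; dunders are not forwarded.
        if attr.startswith("__") and attr.endswith("__"):
            raise AttributeError(attr)
        return getattr(self.__origin__, attr)

    def __instancecheck__(self, instance):
        raise TypeError("isinstance() is not supported for parameterized generics")

    def __subclasscheck__(self, cls):
        raise TypeError("issubclass() is not supported for parameterized generics")


p = Proxy(dict, (str, int))
print(p)                   # dict[str, int]
print(type(p()) is dict)   # True
```

Forwarding through `getattr(self.__origin__, attr)` returns class methods already bound to the origin class, which is one simple way to make calls such as `p.fromkeys(...)` behave like `dict.fromkeys(...)`.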
1. What is type(list[str]())? list or list[str]?
2. Would it work with list subclasses? I.e.
class L(list): pass
x: L[str]
3. Don't forget about __reduce__().
4. Will there be a cache for parametrized generics?
On Mon, Jan 27, 2020 at 9:08 AM Serhiy Storchaka <storchaka@gmail.com> wrote:
1. What is type(list[str]())? list or list[str]?
PEP 585 says it should be list. Type erasure happens at object instantiation.
2. Would it work with list subclasses? I.e.
class L(list): pass
x: L[str]
I think this will work without any extra code. But I'll make sure to test it.
3. Don't forget about __reduce__().
Good point, I'll add it to the list of tasks.
4. Will there be a cache for parametrized generics?
Oh, we probably need that (it proved essential for typing.py's List[int] etc.). I'll put it in the plan.
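For reference, the four questions can be checked against a released interpreter. The snippet below assumes Python 3.9 or later, where PEP 585 shipped; it shows released behavior, not the state of the prototype discussed here. Note that the released attribute holding the type arguments is `__args__`, whereas this thread calls it `__parameters__`.

```python
import pickle
import typing

# 1. Type erasure: the instance is a plain list.
assert type(list[str]()) is list

# 2. Subclasses inherit subscription from list without extra code.
class L(list):
    pass

alias = L[str]
assert alias.__origin__ is L
assert alias.__args__ == (str,)
assert type(alias()) is L

# 3. Aliases support pickling (this relies on __reduce__).
restored = pickle.loads(pickle.dumps(list[int]))
assert restored == list[int]

# 4. typing.py caches its aliases, so identity holds there; for the
#    builtin form this sketch only relies on equality.
assert typing.List[int] is typing.List[int]
assert list[int] == list[int]
```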
Hi Guido,
I actually was playing around with how to implement this yesterday, and I put together the attached as an implementation. I realized that simply passing attributes through via `__getattr__` likely won't work, as class methods and variables won't be passed through correctly, so in your example, I believe dict_str_pair.fromkeys(range(10)) will fail. That is why I created an ad-hoc subclass of the subscripted type, so that it will inherit all of that, and we can just override what we need. Perhaps there is a simpler solution to this problem that I am unaware of. I copied your repr since it was better than the one I threw together :)
I didn't do much error checking in __new__/__init__ because I mostly wanted to find a system that seemed robust enough and to have something to show.
Best, Ethan
Okay -- I have now moved to a prototype written in C, which you can see emerge here: https://github.com/gvanrossum/cpython/tree/pep585
I don't think there's much more that we can learn from Python prototypes at this point -- so many details are different.
On Mon, Jan 27, 2020 at 4:32 PM I wrote:
Okay -- I have now moved to a prototype written in C, which you can see emerge here:
https://github.com/gvanrossum/cpython/tree/pep585
I don't think there's much more that we can learn from Python prototypes at this point -- so many details are different.
I've now got this branch to the point where it defines a GenericAlias type object in C that does (mostly?) correct pass-through and allows subclassing.
I need a breather so if you want to contribute a C implementation of my __repr__ you're welcome to give it a try.
This also provides a (rare!) example of a situation where type(t) and t.__class__ are legitimately different!
>>> t = list[int]
>>> type(t)
<class 'GenericAlias'>
>>> t.__class__
<class 'type'>
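The same effect can be reproduced in pure Python with a property named `__class__`. The sketch below is an assumption-laden illustration: `ForwardingAlias` is a hypothetical class, not the C type from the branch, and later CPython releases may report a different `__class__` for `list[int]` than the session above, so the only check made on the builtin alias is its `type()` (assuming Python 3.9 or later).

```python
import types


class ForwardingAlias:
    """Hypothetical proxy that forwards __class__ to its origin."""

    def __init__(self, origin):
        self._origin = origin

    @property
    def __class__(self):
        # Report the class of the origin instead of the proxy's own class.
        return type(self._origin)


t = ForwardingAlias(list)
assert type(t) is ForwardingAlias   # the real type of the object
assert t.__class__ is type          # what the forwarded attribute claims

# The builtin alias object itself is an instance of types.GenericAlias.
assert type(list[int]) is types.GenericAlias
```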
On Mon, Jan 27, 2020 at 2:37 PM Ethan Smith <ethan@ethanhs.me> wrote:
Hi Guido,
I actually was playing around with how to implement this yesterday, and I put together the attached as an implementation. I realized that simply passing attributes through via `__getattr__` likely won't work, as class methods and variables won't be passed through correctly, so in your example, I believe dict_str_pair.fromkeys(range(10)) will fail.
Oh, this fails in my C implementation:
>>> t = dict[int, str]
>>> t.fromkeys(range(10))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor 'fromkeys' for type 'dict' needs a type, not a 'range' as arg 2
>>> dict.fromkeys(range(10))
{0: None, 1: None, 2: None, 3: None, 4: None, 5: None, 6: None, 7: None, 8: None, 9: None}
But I don't think your solution would help:
That is why I created an ad-hoc subclass of the subscripted type, so that it will inherit all of that, and we can just override what we need. Perhaps there is a simpler solution to solve this problem that I am unaware of.
I don't know, but I don't think your solution is right: it doesn't erase the generics for the created instance. PEP 585 specifies that the *class* produced by list[int] has to know its __origin__ and __parameters__, but it also specifies that the *instance* produced by calling that -- i.e., a = list[int]() -- should *not* know that it was created from a parameterized class -- its type() and .__class__ should just be list. But in your case:
$ python3 -i pep585.py
...
>>> t = list[int]  # That is your clever subclass of list
>>> t
list[int]
>>> a = t()
>>> a
[]
>>> a.__class__  # Expect list, got list[int]
list[int]
>>> type(a)  # Expect list, got error
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: type.__new__() takes exactly 3 arguments (1 given)
I copied your repr since it was better than the one I threw together :)
Thanks -- now I need to rewrite it in C. :-) Or if you want to contribute a C implementation you're welcome to give it a try.
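The erasure objection can be illustrated with a small sketch. The helper `subclass_alias` below is hypothetical and is not the attached pep585.py; it only shows why any approach that builds an ad-hoc subclass leaves the parameterization visible on instances, in contrast with the behavior PEP 585 asks for (checked here on Python 3.9 or later).

```python
def subclass_alias(origin, params):
    """Hypothetical: model list[int] as an ad-hoc subclass of list."""
    names = ", ".join(p.__name__ for p in params)
    namespace = {"__origin__": origin, "__parameters__": params}
    return type(f"{origin.__name__}[{names}]", (origin,), namespace)


t = subclass_alias(list, (int,))
a = t()

assert a == []                           # behaves like an empty list...
assert type(a) is not list               # ...but its type is the subclass
assert type(a).__name__ == "list[int]"   # the parameters leak into the instance

# PEP 585 behavior: the instance is a plain list.
assert type(list[int]()) is list
```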
Well, I fixed the problem with class methods in one line. :-) https://github.com/gvanrossum/cpython/commit/ce78557085b0755ae3e4b1bf6080930...
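For comparison, the failing call from earlier in the thread works in released interpreters. This check assumes Python 3.9 or later and says nothing about how the one-line fix in the linked commit was written.

```python
t = dict[int, str]

# The class method is resolved against the origin class, dict.
d = t.fromkeys(range(3))

assert d == {0: None, 1: None, 2: None}
assert type(d) is dict  # the result carries no trace of [int, str]
```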
Nice to see this work! One thing that would be nice would be to be able to get access to the `Proxy` instance in classmethods, so that `cls` will be that instance instead of the `__origin__` class.
Why? This allows `MyCustomType[int].create(...)` to have different runtime behavior than `MyCustomType[str].create(...)`.
Currently, to do this, I have to monkeypatch `_GenericAlias.__getattr__`:
```
import typing


def generic_getattr(self, attr):
    """
    Allows classmethods to get generic types by checking if we are getting
    a descriptor type and if we are, we pass in the generic type as the
    class instead of the origin type.
    Modified from https://github.com/python/cpython/blob/aa73841a8fdded4a462d045d1eb03899cbeec...
    """
    if "__origin__" in self.__dict__ and attr not in (
        "__wrapped__",
        "__union_params__",
    ):
        # If the attribute is a descriptor, pass in the generic class
        try:
            descr = self.__origin__.__getattribute__(self.__origin__, attr)
        except Exception:
            return
        if hasattr(descr, "__get__"):
            return descr.__get__(None, self)
        # Otherwise, just resolve it normally
        return getattr(self.__origin__, attr)
    raise AttributeError(attr)


typing._GenericAlias.__getattr__ = generic_getattr  # type: ignore
```
Ah, I seem to have not added a name to my account, it should be fixed now.
I don't see how you would be able to do the same work-around for `__init__`, though, to allow `MyCustomClass[int]()` to be differentiated at runtime from `MyCustomClass[str]()`.
Was it discussed somewhere why generic types are not implemented as metaclasses? It seems like if they were, then you would be able to do this, by simply taking `type(self)` to give you back the generic type. Also, from a conceptual perspective, if I squint my eyes, generic classes seem like they should be metaclasses, since they are basically functions that return classes, just with a different syntax than normal function calls for stylistic purposes.
IMHO there are important differences:
A generic class is a "constructor" of a class given a complete class, to be instantiated at some later point; also, in Python the type parameter cannot be directly inspected.
A metaclass is a "constructor" of a class given a set of definitions, to be instantiated immediately; inspection is allowed and usually necessary.
- Elazar
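The distinction can be made concrete with a short sketch. `Registry` and `Point` are hypothetical names used only for illustration, and the alias behavior shown assumes Python 3.9 or later.

```python
class Registry(type):
    """Hypothetical metaclass: builds a class from a set of definitions."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # A metaclass can, and usually does, inspect the definitions.
        cls.fields = [key for key in namespace if not key.startswith("_")]
        return cls


class Point(metaclass=Registry):
    x = 0
    y = 0


# The metaclass ran immediately, when the class statement executed.
assert Point.fields == ["x", "y"]

# Subscripting an already complete class only records the argument;
# nothing new is built until the alias is called, and then the result
# is an ordinary list.
alias = list[int]
assert alias.__origin__ is list
assert type(alias()) is list
```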
Are you saying that generic classes are not able to see the type parameters passed in at runtime? If so, I agree that in the current implementation they cannot, but it would be useful if they could, and it is unclear why you wouldn't want to allow this.
On Mon, Jan 27, 2020 at 3:05 PM Saul Shanabrook via Typing-sig < typing-sig@python.org> wrote:
Ah, I seem to have not added a name to my account, it should be fixed now.
Your email is still weird (b9f0h8l2g1i9f8g9@quansight.slack.com).
I don't see how you would be able to do the same work-around though for `__init__`, to allow the `MyCustomClass[int]()` to be differentiated at runtime from `MyCustomClass[str]()`
This (what you wrote in your first post) is an interesting feature request, but I'm not sure how to implement it in C. Also there's the issue that PEP 585 argues (IMO convincingly) that for instances the generic parameters are erased, so that list[int]() returns a plain list instance, indistinguishable from list(). Now, class methods can be invoked from instances too -- if I write {}.fromkeys(range(10)) that's the same as dict.fromkeys(range(10)). But in the first call the __origin__ and __parameters__ attributes are unavailable (due to the type erasure). And I don't think we should reconsider the type erasure.
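A small check of the point about class methods and erasure, assuming Python 3.9 or later. `Box` is a hypothetical subclass used only for illustration.

```python
# Class methods can be invoked from instances as well as from the class.
assert {}.fromkeys(range(3)) == dict.fromkeys(range(3))


class Box(list):
    @classmethod
    def describe(cls):
        # Returns whatever class the method was bound to.
        return cls


# Called through the alias, cls is the origin class, not the alias.
assert Box[int].describe() is Box

# Called through an instance, the parameters are gone entirely.
box = Box[int]()
assert box.describe() is Box
assert not hasattr(box, "__origin__")
```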
Was it discussed somewhere why generic types are not implemented as metaclasses? It seems like if they were, then you would be able to do this, by simply taking `type(self)` to give you back the generic type. Also, from a conceptual perspective, if I squint my eyes, generic classes seem like they should be metaclasses, since they are basically functions that return classes, just with a different syntax than normal function calls for stylistic purposes.
This was answered by Elazar already, and to reiterate, if list isn't a metaclass, then list[int] shouldn't be either -- instantiating either returns a list object.
Hey Guido,
I might have been misunderstanding this change. This `Proxy` class you are making, is that used only for built-in collections or for all generic classes?
For example, I would like to know the type parameters passed in at runtime to my custom generic class:
```
from __future__ import annotations

import typing

T = typing.TypeVar("T")


class Maybe(typing.Generic[T]):
    @classmethod
    def nothing(cls) -> Maybe[T]:
        # Should be able to tell in here if `cls` had generic parameters added
        ...

    @classmethod
    def just(cls, value: T) -> Maybe[T]:
        ...


# Desired behavior (not what happens today):
# so that we can create different instances based on type
assert Maybe[int].nothing() != Maybe[str].nothing()
```
Another use case here is for collections that do require knowing the inner type value at runtime, like NumPy's arrays. Although NumPy might not change its syntax, it would be nice for more modern array libraries to be able to specify the inner types as type parameters, instead of as arguments, so they can be checked statically as well as used at runtime. So something like this: `numpy.ndarray[numpy.float64].arange(10)`. Ideally, we would also be able to do the same in the constructor, to allow something like `numpy.ndarray[numpy.float64](1, 2, 3)` to create an array of floats.
On Tue, Jan 28, 2020 at 9:12 AM Saul Shanabrook via Typing-sig < typing-sig@python.org> wrote:
I might have been misunderstanding this change. This `Proxy` class you are making, is that used only for built in collection or for all generic classes?
That's still open. PEP 585 hasn't been sufficiently discussed. But I note that none of your examples work with the current typing.py module, so in a sense we're not breaking anything if we were to use the much simpler class I'm implementing in C (tentative name GenericAlias -- though PEP 585 doesn't name it). Also note that the Python code I attached earlier is irrelevant at this point.
For example, I would like to know the type parameters passed in at runtime to my custom generic class:
```
from __future__ import annotations

import typing

T = typing.TypeVar("T")


class Maybe(typing.Generic[T]):
    @classmethod
    def nothing(cls) -> Maybe[T]:
        # Should be able to tell in here if `cls` had generic parameters added
        ...

    @classmethod
    def just(cls, value: T) -> Maybe[T]:
        ...


# Desired behavior (not what happens today):
# so that we can create different instances based on type
assert Maybe[int].nothing() != Maybe[str].nothing()
```
Another use case here is for collections that do require knowing the inner type value at runtime, like NumPy's arrays. Although NumPy might not change its syntax, it would be nice for more modern array libraries to be able to specify the inner types as type parameters, instead of as arguments, so they can be checked statically as well as used at runtime. So something like this: `numpy.ndarray[numpy.float64].arange(10)`. Ideally, we would also be able to do the same in the constructor, to allow something like `numpy.ndarray[numpy.float64](1, 2, 3)` to create an array of floats.
This should be a separate discussion, since you are very much proposing a new feature. It will require a separate PEP. I do appreciate the desire! We just need to get other people to agree that it's desirable *and* can be implemented reasonably. Then we can work on a PEP and reference implementation.
participants (6)
- b9f0h8l2g1i9f8g9@quansight.slack.com
- Elazar
- Ethan Smith
- Guido van Rossum
- Saul Shanabrook
- Serhiy Storchaka