Review of PEP 646 (Variadic Generics)
I have read the proposed PEP about variadic generics (PEP 646) and I like it enough that I want to sponsor it and help get it over the finish line (we have to get the Steering Council to understand enough of it that they'll delegate approval to me :-).

For reference, here's the PR that proposes to add PEP 646: https://github.com/python/peps/pull/1740
And here's the original Google Doc: https://docs.google.com/document/d/1oXWyAtnv0-pbyJud8H5wkpIk8aajbkX-leJ8JXsE...

A good review starts by briefly summarizing the proposal being reviewed, so here's the proposal in my own words.

**Motivation A:** We want to create generic types that take an arbitrary number of type parameters, like Tuple. For example, Tensors where each dimension is a "type". There is a demonstration of this without variadics, but it requires defining types `Tensor1[T1]`, `Tensor2[T1, T2]`, etc.: https://github.com/deepmind/tensor_annotations. We want just `Tensor[T1]`, `Tensor[T1, T2]`, etc., for any number of parameters.

**Motivation B:** The type of functions like map() and zip() cannot be expressed using the existing type system. The simplest example would be the type of
```
def foo(*args):
    return args

a = foo(42, "abc")  # Should have type Tuple[int, str]
```

**Proposal:** Introduce a new kind of type variable that can be instantiated with an arbitrary number of types, some new syntax, and a new type operator:
```
Ts = TypeVarTuple("Ts")  # NEW
T = TypeVar("T")

def f(*args: *Ts) -> Tuple[*Ts]: ...
class C(Generic[*Ts]): ...
Callable[[*Ts], T]
Tuple[*Ts]

Map[SomeType, Ts]  # SomeType is a generic of one parameter
```
In most cases the form `*Ts` may be preceded and/or followed by any number of non-variadic types, e.g., `Tuple[int, int, *Ts, str]`. In cases where it's unambiguous, multiple variadic type variables are also allowed, e.g., `Tuple[*Ts1, *Ts2]`. For older Python versions, `Expand[Ts]` would mean the same as `*Ts`.

So now let me go on with my (generally favorable) review. (I left many detailed editorial comments in the Google Doc -- I will not repeat those here.)

I like the proposal a lot, and I am glad that we now have (apparently) a working prototype in Pyre. This has been on our wish list since at least 2016 -- much early discussion happened in https://github.com/python/typing/issues/193 and at various meetings at PyCon and at the Bay Area Typing Meetups (links in the PEP). The proposed syntax has cycled through endless variations, and I am fine with the current proposal, even though it is still slightly clunky. There's also https://github.com/python/typing/issues/513, which is specifically about array types.

There are probably other motivating applications that the PEP doesn't mention, for example certain decorator types (I doubt that all of these are taken care of by PEP 612, ParamSpec).

I wonder why the proposal left out `Union[*Ts]`. This would seem useful, e.g. to type this function:
```
def f(*args):
    return random.choice(args)
```
which could be typed naturally as follows:
```
def f(*args: *Ts) -> Union[*Ts]:
    return random.choice(args)
```

I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though). There are also proposals for integer generics, which deserve their own PEP (presumably aiming at Python 3.11).

Eric Traut proposed an extension that would allow defining variadic subtypes of Sequence which behave similarly to Tuple (where `Tuple[int, int]` is a subtype of `Tuple[int, ...]` which is a subtype of `Sequence[int]`), but I'm not sure we would need that a lot -- we could always add that later.

The introduction of a prefix `*` operator requires new syntax in a few cases. While `Callable[[*Ts], T]` is already valid (the parser interprets this as sequence unpacking), `Tuple[*Ts]` is not, and neither is `def f(*a: *Ts)`. For `Tuple[*Ts]` we can piggy-back on PEP 637 (keyword indexing, which adds this as well), but for the `def` example we'll need to add something new specifically for this PEP. I think that's fine -- we can give it runtime semantics that turns `*Ts` into `(*Ts,)`, which is similar to the other places: at runtime it iterates over the argument, producing a tuple. In all cases we need to support `Expand[Ts]` as well for backwards compatibility with Python 3.9 and before.

The `Map[]` operator is, as I said, fairly clunky. In the past various other syntaxes have been proposed. In particular, @sixolet proposed (https://github.com/python/typing/issues/513) a syntax that would allow defining `zip()` as follows:
```
def zip(*args: Iterable[Ts]) -> Iterator[Tuple[Ts, ...]]: ...
```
Compare this to what it would look like using the current proposal:
```
def zip(*args: *Map[Iterable, Ts]) -> Iterator[Ts]: ...
# Note that Iterator[Ts] is the same as Iterator[Tuple[*Ts]]
```
Sixolet's syntax made the iteration over the elements of Ts implicit, which is slightly shorter, and doesn't require "higher-order type functions" (is there an official name for that?), but it is also slightly more cryptic, and created yet another use for the ellipsis: `Tuple[Ts, ...]` is not quite analogous to `Tuple[T, ...]`, since the latter is *homogeneous* while the former is still heterogeneous. The new notation uses an explicit `Map[]` operator, which is similar to the choice we made in PEP 612 for `Concatenate[]`. (Speaking of this choice, we could drop the `*` prefix and rely purely on `Expand[]`, but that feels unnecessarily verbose, and we'll get most of the needed syntax for free with PEP 637, assuming it's accepted.)

All in all my recommendation for this PEP is: clean up the text based on the GDoc feedback, add `Union[*Ts]`, and submit to the Steering Council.
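As a rough illustration of what the proposed `Map[Iterable, Ts]` operator means, the sketch below applies a one-parameter generic to each element of a tuple of types at runtime. The helper `map_types` is hypothetical and is not part of `typing` or of the PEP; it only mimics what a type checker would compute once `Ts` is bound to `(int, str)`.

```python
from typing import Iterable, Tuple


def map_types(generic, ts):
    """Hypothetical runtime stand-in for Map[generic, Ts]: subscript the
    one-parameter generic with each type in ts, keeping the order."""
    return tuple(generic[t] for t in ts)


# Pretend Ts was bound to (int, str), as in zip(some_ints, some_strs).
ts = (int, str)
mapped = map_types(Iterable, ts)

# Each argument type becomes Iterable[...] of the matching element type.
assert mapped == (Iterable[int], Iterable[str])

# Subscripting Tuple with the mapped tuple gives the heterogeneous tuple type.
assert Tuple[mapped] == Tuple[Iterable[int], Iterable[str]]
```

The point of the explicit operator is visible here: the "iteration" over the elements of `Ts` is spelled out rather than implied by the annotation.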
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him* **(why is my pronoun here?)** <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
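The review's point about which forms need new syntax can be probed on whatever interpreter is at hand. This is a minimal sketch: `parses` is a hypothetical helper, and the results for the starred subscript and the starred `*args` annotation depend on the Python version (the grammar change eventually shipped in Python 3.11, so older interpreters reject both).

```python
def parses(source: str) -> bool:
    """Return True if this interpreter accepts source as valid syntax."""
    try:
        # compile() only checks syntax here; the names need not exist.
        compile(source, "<sketch>", "exec")
    except SyntaxError:
        return False
    return True


# A star inside a list display is ordinary sequence unpacking.
print(parses("Callable[[*Ts], T]"))

# A star directly inside a subscript, or as the annotation of *args,
# is only accepted by interpreters that include the new grammar.
print(parses("Tuple[*Ts]"))
print(parses("def f(*a: *Ts): ..."))
```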
Thank you for sponsoring this, Guido, and for the thorough review!
> I wonder why the proposal left out `Union[*Ts]`.
Ah, yes, great point. I'll add a section on that.
> I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though).
This has been on the back of my mind too. Adding a single additional non-variadic type variable is how I was imagining it would work too, though there are still some details to work out (e.g. ideally it should be optional so that people can choose what level of type verbosity they want to go with). I'll add a section trying to figure this out.

The other thing that's still unresolved is how we handle access to individual types - needed so that we can provide overloads of shape-manipulating operations. (I'm assuming that overloads are the way to go here, at least for the time being. In an ideal world we would be able to express the resulting shapes directly as a function of the arguments, but I don't think that'll be possible without fully dependent typing.) My initial idea was to do this using "class overloads":

```python
class Tensor(Generic[*Shape]): ...

@overload
class Tensor(Generic[Axis1, Axis2]):
    def transpose(self) -> Tensor[Axis2, Axis1]: ...

@overload
class Tensor(Generic[Axis1, Axis2, Axis3]):
    def transpose(self) -> Tensor[Axis3, Axis2, Axis1]: ...
```

But you're right in calling this out in the draft doc as non-trivial. It's also very verbose, requiring a whole separate class for each possible instantiation.

Instead, perhaps the following would suffice?

```python
class Tensor(Generic[*Shape]):

    @overload
    def transpose(self: Tensor[Axis1, Axis2]) -> Tensor[Axis2, Axis1]: ...

    @overload
    def transpose(self: Tensor[Axis1, Axis2, Axis3]) -> Tensor[Axis3, Axis2, Axis1]: ...
```

This is similar to the following example, which already seems to type-check properly in mypy:

```python
class C(Generic[T]):

    @overload
    def f(self: C[int], x) -> int:
        return x

    @overload
    def f(self: C[str], x) -> str:
        return x
```

I'd welcome other suggestions, though!

In any case, I'll continue cleaning up the doc as suggested, moving discussion of meatier issues to this thread for posterity, and post here once I think the doc is done.
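For reference, the self-type overload pattern from the last snippet can be written out as a small runnable program. This is a sketch under assumptions: the constructor, the `value` attribute, and the single untyped implementation of `f` are additions for illustration (mypy expects overloads outside stub files to be followed by an implementation), and the quoted annotations are only there so the class can refer to itself inside its own body.

```python
from typing import Generic, TypeVar, overload

T = TypeVar("T")


class C(Generic[T]):
    def __init__(self, value: T) -> None:
        # Hypothetical attribute, present only so T is bound by the constructor.
        self.value = value

    # Each overload constrains the type of `self`, so the signature that
    # applies depends on how the class was parameterised.
    @overload
    def f(self: "C[int]", x: int) -> int: ...
    @overload
    def f(self: "C[str]", x: str) -> str: ...

    def f(self, x):
        # Single runtime implementation shared by both overloads.
        return x


ints = C(1)
strs = C("a")
print(ints.f(2))    # a type checker would select the C[int] overload here
print(strs.f("b"))  # and the C[str] overload here
```

The same shape would carry over to `Tensor.transpose`, with one overload per supported rank.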
On Tue, 22 Dec 2020 at 23:46, Guido van Rossum <guido@python.org> wrote:
[...]
_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
Member address: mrahtz@google.com
(Moving some discussion here from the doc)

# Concatenating type variable tuples with other types
> In all cases we should also support extra single types before and after, e.g. `Tuple[int, *Ts, str]`.
In cases where there's only a single type variable tuple, it should be fine to allow an *arbitrary* number of concrete types before and after, shouldn't it?

```python
def foo(t: Tuple[int, str, *Ts, double]) -> Tuple[*Ts]: ...

t: Tuple[int, str, float, bool, double]
foo(t)  # Return has type Tuple[float, bool]
```

# Multiple type variable tuples

Thinking out loud - let's get straight about all the places this could occur.

## Function arguments

Example 1:
```python
def func(spam: Tuple[*Ts1, *Ts2]): ...

spam: Tuple[int, str, bool]
func(spam)
```

This wouldn't work: how would we decide which types were bound to `Ts1` and which were bound to `Ts2`? (Ignore the fact that type variables are only used once in the signature here.)

On the other hand, it *would* work if there were extra constraints - say, from other arguments whose type was unambiguous:

Example 2:
```python
def func(ham: Tuple[*Ts1], spam: Tuple[*Ts1, *Ts2]): ...

ham: Tuple[int, float]
spam: Tuple[int, float, double, str]
func(ham, spam)
```

**Conclusion: sometimes alright, sometimes not.**

## Function returns

Can this work?

Example 3:
```python
def foo() -> Tuple[*Ts1, *Ts2]:
    return 0, 0.0, '0'
```

On the face of it, we have the same problem. But in practice, we'd never encounter this example, because `Ts1` and `Ts2` would have had to occur somewhere else in the signature, which would have nailed them down:

Example 4:
```python
def foo(ham: Tuple[*Ts1], spam: Tuple[*Ts2]) -> Tuple[*Ts1, *Ts2]: ...

ham: Tuple[int, str]
spam: Tuple[float, double]
foo(ham, spam)  # Inferred type is Tuple[int, str, float, double]
```

**Conclusion: always fine.**

## Classes

Example 5:
```python
class C(Generic[*Ts1, *Ts2]): ...

c: C[int, str, float] = C()
```

Same problem as function arguments. And this time, I don't think there's any way to add extra constraints to disambiguate.
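The binding questions raised in the concatenation example and in Examples 1, 2 and 5 above can be sketched at runtime by treating each annotation as a plain tuple of types. This is only an illustration, not how any type checker is implemented: `bind_middle` and `candidate_splits` are hypothetical helpers, and `bytes` stands in for the `double` placeholder used in the thread.

```python
from typing import List, Optional, Tuple

Types = Tuple[type, ...]


def bind_middle(actual: Types, prefix: Types, suffix: Types) -> Types:
    """Bind a single *Ts sitting between fixed prefix and suffix types."""
    end = len(actual) - len(suffix)
    if end < len(prefix) or actual[:len(prefix)] != prefix or actual[end:] != suffix:
        raise TypeError("argument does not match the annotated tuple shape")
    return actual[len(prefix):end]


def candidate_splits(
    actual: Types, known_ts1: Optional[Types] = None
) -> List[Tuple[Types, Types]]:
    """List every way to bind (Ts1, Ts2) so that (*Ts1, *Ts2) equals actual."""
    splits = [(actual[:i], actual[i:]) for i in range(len(actual) + 1)]
    if known_ts1 is not None:
        # Another argument has already pinned Ts1, as `ham` does in Example 2.
        splits = [s for s in splits if s[0] == known_ts1]
    return splits


# One *Ts between concrete types has exactly one solution.
assert bind_middle((int, str, float, bool, bytes), (int, str), (bytes,)) == (float, bool)

# Examples 1 and 5: three types and two unconstrained variadics give four
# candidate bindings, so there is nothing to choose between them.
assert len(candidate_splits((int, str, bool))) == 4

# Example 2: `ham` pins Ts1 to (int, float), leaving a single binding for Ts2.
assert candidate_splits((int, float, bytes, str), known_ts1=(int, float)) == [
    ((int, float), (bytes, str))
]
```

For a class parameter list such as `C[int, str, float]` there is no second argument that could play the role of `known_ts1`, which is why the class case has no way out.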
**Conclusion: never alright.**

If, for some reason, we did want a class that was generic in multiple type tuple variables, the current proposal in the PEP is:

Example 6:
```python
class C(Generic[Ts1, Ts2]): ...

c: C[Tuple[int, str], Tuple[float]] = C()  # Great!
c: C[int, str, float]  # Not allowed
```

---

OK, so the example that Pradeep suggested...

```python
def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ...
```

...is similar to Example 2: the ``Callable`` is ambiguous on its own, but there's extra context in the rest of the signature which disambiguates it.

---

So overall, the three options I see are:

* Option 1: Disallow multiple expanded type variable tuples everywhere, for consistency and ease of understanding.
* Option 2: Only allow multiple expanded type variable tuples in contexts where it's *always* unambiguous - i.e. only in return types.
* Option 3: Allow multiple expanded type variable tuples in general, but have the type checker produce an error when the types cannot be solved for.

Thoughts?

On Wed, 23 Dec 2020 at 11:02, Matthew Rahtz <mrahtz@google.com> wrote:
[...]
P.S. If we did go for Option 3, we could still make classes generic in multiple type tuple variables by using the explicit syntax:

```python
class C(Generic[Tuple[*Ts1], Tuple[*Ts2]]): ...

c: C[Tuple[int, str], Tuple[float]]
```

(Aside: this was previously written in the PEP using 'unexpanded' type tuple variables:

```python
class C(Generic[Ts1, Ts2]): ...
```

But having thought on it for a week, I think allowing unexpanded type tuple variables would be a mistake - it would mean there are two ways to write the following:

```python
def identity(x: Tuple[*Ts]) -> Tuple[*Ts]: ...
# could also be written as
def identity(x: Ts) -> Ts: ...
```

This is a) un-Pythonic, and b) undermines one of the reasons we thought of using the star operator here in the first place: to make it clear when the thing in question is a type variable *tuple* rather than just a plain type variable.)

On Wed, 23 Dec 2020 at 14:33, Matthew Rahtz <mrahtz@google.com> wrote:
(Moving some discussion here from the doc)
# Concatenating type variable tuples with other types
In all cases we should also support extra single types before and after, e.g. Tuple[int, *Ts, str].
In cases where there's only a single type variable tuple, it should be fine to allow an *arbitrary* number of concrete types before and after, shouldn't it?
```python def foo(t: Tuple[int, str, *Ts, double]) -> Tuple[*Ts]: ... t: Tuple[int, str, float, bool, double] foo(t) # Return has type Tuple[float, bool] ```
# Multiple type variable tuples
Thinking out loud - let's get straight about all the places this could occur.
## Function arguments
Example 1: ```python def func(spam: Tuple[*Ts1, *Ts2]): ... spam: Tuple[int, str, bool] func(spam) ```
This wouldn't work: how would we decide which types were bound to `Ts1` and which were bound to `Ts2`? (Ignore the fact that type variables are only used once in the signature here.)
On the other hand, it *would* work if there were extra constraints - say, from other arguments whose type was unambiguous:
Example 2: ```python def func(ham: Tuple[*Ts1], spam: Tuple[*Ts1, *Ts2]): ... ham: Tuple[int, float] spam: Tuple[int, float, double, str] func(ham, spam) ```
**Conclusion: sometimes alright, sometimes not.**
## Function returns
Can this work?
Example 3: ```python def foo() -> Tuple[*Ts1, *Ts]: return 0, 0.0, '0' ```
On the face of it, we have the same problem. But in practice, we'd never encounter this example, because `Ts1` and `Ts2` would have had to occur somewhere else in the signature, which would have nailed them down:
Example 4: ```python def foo(ham: Tuple[*Ts1], spam: Tuple[*Ts2]) -> Tuple[*Ts1, *Ts2]: ... ham: Tuple[int, str] spam: Tuple[float, double] foo(ham, spam) # Inferred type is Tuple[int, str, float, double] ```
**Conclusion: always fine.**
## Classes
Example 5: ```python class C(Generic[*Ts1, *Ts2]): ... c: C[int, str, float] = C() ```
Same problem as function arguments. And this time, I don't think there's any way to add extra constraints to disambiguate.
**Conclusion: never alright.**
If, for some reason, we did want a class that was generic in multiple type tuple variables, the current proposal in the PEP is:
Example 6: ```python class C(Generic[Ts1, Ts2]): ... c: C[Tuple[int, str], Tuple[float]] = C() # Great! c: C[int, str, float] # Not allowed ```
---
OK, so the example that Pradeep suggested...
```python def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ... ```
...is similar to Example 2: the ``Callable`` is ambiguous on its own, but there's extra context in the rest of the signature which disambiguates it.
---
So overall, the three options I see are:
* Option 1: Disallow multiple expanded type variables tuples everywhere, for consistency and ease-of-understanding * Option 2: Only allow multiple expanded type variable tuples in contexts where it's *always* unambiguous - i.e. only in return types. * Option 3: Allow multiple expanded type variable tuples in general, but have the type checker produce an error when the types cannot be solved for.
Thoughts?
On Wed, 23 Dec 2020 at 11:02, Matthew Rahtz <mrahtz@google.com> wrote:
Thank you for sponsoring this, Guido, and for the thorough review!
I wonder why the proposal left out `Union[*Ts]`.
Ah, yes, great point. I'll add a section on that.
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though).
This has been on the back of my mind too. Adding a single additional non-variadic type variable is how I was imagining it would work too, though there are still some details to work out (e.g. ideally it should be optional so that people can choose what level of type verbosity they want to go with). I'll add a section trying to figure this out.
The other thing that's still unresolved is how we handle access to individual types - needed so that we can provide overloads of shape-manipulating operations. (I'm assuming that overloads are the way to go here, at least for the time being. In an ideal world we would be able to express the resulting shapes directly as a function of the arguments, but I don't think that'll be possible without fully dependent typing). My initial idea was to do this using "class overloads":
```python
class Tensor(Generic[*Shape]): ...

@overload
class Tensor(Generic[Axis1, Axis2]):
    def transpose(self) -> Tensor[Axis2, Axis1]: ...

@overload
class Tensor(Generic[Axis1, Axis2, Axis3]):
    def transpose(self) -> Tensor[Axis3, Axis2, Axis1]: ...
```
But you're right in calling this out in the draft doc as non-trivial. It's also very verbose, requiring a whole separate class for each possible instantiation.
Instead, perhaps the following would suffice?
```python
class Tensor(Generic[*Shape]):

    @overload
    def transpose(self: Tensor[Axis1, Axis2]) -> Tensor[Axis2, Axis1]: ...

    @overload
    def transpose(self: Tensor[Axis1, Axis2, Axis3]) -> Tensor[Axis3, Axis2, Axis1]: ...
```
This is similar to the following example, which already seems to type-check properly in mypy:
```python
class C(Generic[T]):

    @overload
    def f(self: C[int], x) -> int:
        return x

    @overload
    def f(self: C[str], x) -> str:
        return x
```
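For anyone who wants to try the self-type overload pattern, here is a minimal runnable sketch of the same idea. The stub-style overload bodies, the single runtime implementation, and the annotations on `x` are my additions for illustration; they are not part of the example above, and I am not claiming any particular checker output for it.

```python
from typing import Generic, TypeVar, Union, overload

T = TypeVar("T")


class C(Generic[T]):
    # Each overload narrows `self` to one specialisation of C.
    @overload
    def f(self: "C[int]", x: int) -> int: ...

    @overload
    def f(self: "C[str]", x: str) -> str: ...

    # The single runtime implementation backing both overloads.
    def f(self, x: Union[int, str]) -> Union[int, str]:
        return x


c_int: C[int] = C()
c_str: C[str] = C()
result_int = c_int.f(1)    # a checker is expected to select the first overload
result_str = c_str.f("a")  # ...and the second overload here
```

The `Tensor.transpose` overloads would follow the same shape, with one overload per supported rank.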
I'd welcome other suggestions, though!
In any case, I'll continue cleaning up the doc as suggested, moving discussion of meatier issues to this thread for posterity, and post here once I think the doc is done.
On Tue, 22 Dec 2020 at 23:46, Guido van Rossum <guido@python.org> wrote:
I have read the proposed PEP about variadic generics (PEP 646) and I like it enough that I want to sponsor it and want to help getting it over the finish line (we have to get the Steering Council to understand enough of it that they'll delegate approval to me :-).
For reference, here's the PR that proposes to add PEP 646: https://github.com/python/peps/pull/1740 And here's the original Google Doc: https://docs.google.com/document/d/1oXWyAtnv0-pbyJud8H5wkpIk8aajbkX-leJ8JXsE...
A good review starts by briefly summarizing the proposal being reviewed, so here's the proposal in my own words.
**Motivation A:** We want to create generic types that take an arbitrary number of type parameters, like Tuple. For example, Tensors where each dimension is a "type". There is a demonstration of this without variadics, but it requires defining types `Tensor1[T1]`, `Tensor2[T1, T2]`, etc.: https://github.com/deepmind/tensor_annotations. We want just `Tensor[T1]`, `Tensor[T1, T2]`, etc., for any number of parameters.
**Motivation B:** The type of functions like map() and zip() cannot be expressed using the existing type system. The simplest example would be the type of

```
def foo(*args):
    return args

a = foo(42, "abc")  # Should have type Tuple[int, str]
```
**Proposal:** Introduce a new kind of type variable that can be instantiated with an arbitrary number of types, some new syntax, and a new type operator:

```
Ts = TypeVarTuple("Ts")  # NEW
T = TypeVar("T")

def f(*args: *Ts) -> Tuple[*Ts]: ...
class C(Generic[*Ts]): ...
Callable[[*Ts], T]
Tuple[*Ts]

Map[SomeType, Ts]  # SomeType is a generic of one parameter
```

In most cases the form `*Ts` may be preceded and/or followed by any number of non-variadic types, e.g., `Tuple[int, int, *Ts, str]`. In cases where it's unambiguous, multiple variadic type variables are also allowed, e.g., `Tuple[*Ts1, *Ts2]`. For older Python versions, `Expand[Ts]` would mean the same as `*Ts`.
So now let me go on with my (generally favorable) review. (I left many detailed editorial comments in the Google Doc -- I will not repeat those here.)
I like the proposal a lot, and I am glad that we now have (apparently) a working prototype in Pyre. This has been on our wish list since at least 2016 -- much early discussion happened in https://github.com/python/typing/issues/193 and at various meetings at PyCon and at the Bay Area Typing Meetups (links in the PEP). The proposed syntax has cycled through endless variations, and I am fine with the current proposal, even though it is still slightly clunky. There's also https://github.com/python/typing/issues/513, which is specifically about array types.
There are probably other motivating applications that the PEP doesn't mention, for example certain decorator types (I doubt that all of these are taken care of by PEP 612, ParamSpec).
I wonder why the proposal left out `Union[*Ts]`. This would seem useful, e.g. to type this function:

```
def f(*args):
    return random.choice(args)
```

which could be typed naturally as follows:

```
def f(*args: *Ts) -> Union[*Ts]:
    return random.choice(args)
```
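To make the intended meaning of `Union[*Ts]` concrete, here is a small runtime sketch that builds the corresponding `Union` from the arguments actually passed. The helper name `union_of_args` is hypothetical, and this only mimics at runtime what a type checker would be expected to infer statically.

```python
import random
from typing import Any, Union


def union_of_args(*args: Any) -> Any:
    """Build, at runtime, the Union that `Union[*Ts]` would denote for these args."""
    # Subscripting Union with a tuple is the same as listing the members one by one.
    return Union[tuple(type(a) for a in args)]


def f(*args: Any) -> Any:
    return random.choice(args)


expected_type = union_of_args(42, "abc")  # the Union of int and str
value = f(42, "abc")                      # one of the arguments, chosen at random
```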
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though). There are also proposals for integer generics, which deserve their own PEP (presumably aiming at Python 3.11).
Eric Traut proposed an extension that would allow defining variadic subtypes of Sequence which behave similar to Tuple (where `Tuple[int, int]` is a subtype of `Tuple[int, ...]` which is a subtype of `Sequence[int]`), but I'm not sure we would need that a lot -- we could always add that later.
The introduction of a prefix `*` operator requires new syntax in a few cases. While `Callable[[*Ts], T]` is already valid (the parser interprets this as sequence unpacking), `Tuple[*Ts]` is not, and neither is `def f(*a: *Ts)`. For `Tuple[*Ts]` we can piggy-back on PEP 637 (keyword indexing, which adds this as well), but for the `def` example we'll need to add something new specifically for this PEP. I think that's fine -- we can give it runtime semantics that turns `*Ts` into `(*Ts,)`, which is similar to the other places: at runtime it iterates over the argument, producing a tuple. In all cases we need to support `Expand[Ts]` as well for backwards compatibility with Python 3.9 and before.
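The "iterates over the argument, producing a tuple" behaviour can be sketched in plain Python that already runs today. Both classes below are hypothetical stand-ins invented for this illustration; they are not the proposed `TypeVarTuple` implementation.

```python
class Unpacked:
    """Hypothetical marker standing for one expanded type variable tuple."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __eq__(self, other: object) -> bool:
        return isinstance(other, Unpacked) and other.name == self.name

    def __repr__(self) -> str:
        return f"*{self.name}"


class FakeTypeVarTuple:
    """Hypothetical stand-in for TypeVarTuple; not the real implementation."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __iter__(self):
        # Star-unpacking iterates over the object, so `*Ts` yields one marker.
        yield Unpacked(self.name)


Ts = FakeTypeVarTuple("Ts")
in_list = [*Ts]    # the form already accepted inside Callable[[...], T]
as_tuple = (*Ts,)  # the tuple form suggested for the new contexts
```

With this shape, the new syntactic positions would only need to evaluate `*Ts` the way a tuple display already does.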
The `Map[]` operator is, as I said, fairly clunky. In the past various other syntaxes have been proposed. In particular, @sixolet proposed <https://github.com/python/typing/issues/513> a syntax that would allow defining `zip()` as follows:

```
def zip(*args: Iterable[Ts]) -> Iterator[Tuple[Ts, ...]]: ...
```

Compare this to what it would look like using the current proposal:

```
def zip(*args: *Map[Iterable, Ts]) -> Iterator[Ts]: ...
# Note that Iterator[Ts] is the same as Iterator[Tuple[*Ts]]
```

Sixolet's syntax made the iteration over the elements of Ts implicit, which is slightly shorter, and doesn't require "higher-order type functions" (is there an official name for that?), but also slightly more cryptic, and created yet another use for the ellipsis: `Tuple[Ts, ...]` is not quite analogous to `Tuple[T, ...]`, since the latter is *homogeneous* while the former is still heterogeneous. The new notation uses an explicit `Map[]` operator, which is similar to the choice we made in PEP 612 for `Concatenate[]`. (Speaking of this choice, we could drop the `*` prefix and rely purely on `Expand[]`, but that feels unnecessarily verbose, and we'll get most of the needed syntax for free with PEP 637, assuming it's accepted.)
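As a rough illustration of what `Map[Iterable, Ts]` is meant to denote, the following runtime sketch applies a one-parameter generic to each member of a concrete tuple of types. The helper `map_generic` is hypothetical and only models the element-wise expansion; it is not how a checker would implement `Map`.

```python
from typing import Iterable, List, Tuple


def map_generic(generic, types: Tuple[type, ...]) -> tuple:
    """Apply a one-parameter generic to each member of a tuple of types."""
    return tuple(generic[t] for t in types)


# With Ts bound to (int, str), Map[Iterable, Ts] would stand for this tuple:
zip_arg_types = map_generic(Iterable, (int, str))
list_types = map_generic(List, (int, str))
```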
All in all my recommendation for this PEP is: clean up the text based on the GDoc feedback, add `Union[*Ts]`, and submit to the Steering Council.
--Guido van Rossum (python.org/~guido)
P.P.S. Pradeep - did you say you had an example of where a class generic in multiple type variable tuples *would* be necessary?
On Wed, 23 Dec 2020 at 15:01, Matthew Rahtz <mrahtz@google.com> wrote:
P.S. If we did go for Option 3, we could still make classes generic in multiple type tuple variables by using the explicit syntax:
```python
class C(Generic[Tuple[*Ts1], Tuple[*Ts2]]): ...

c: C[Tuple[int, str], Tuple[float]]
```
(Aside: this was previously written in the PEP using 'unexpanded' type tuple variables:
```python
class C(Generic[Ts1, Ts2]): ...
```
But having thought on it for a week, I think allowing unexpanded type tuple variables would be a mistake - it would mean there are two ways to write the following:
```python
def identity(x: Tuple[*Ts]) -> Tuple[*Ts]: ...
# could also be written as
def identity(x: Ts) -> Ts: ...
```
This is a) un-Pythonic, and b) undermines one of the reasons we thought of using the star operator here in the first place: to make it clear when the thing in question is a type variable *tuple* rather than just a plain type variable.)
On Wed, 23 Dec 2020 at 14:33, Matthew Rahtz <mrahtz@google.com> wrote:
(Moving some discussion here from the doc)
# Concatenating type variable tuples with other types
In all cases we should also support extra single types before and after, e.g. Tuple[int, *Ts, str].
In cases where there's only a single type variable tuple, it should be fine to allow an *arbitrary* number of concrete types before and after, shouldn't it?
```python
def foo(t: Tuple[int, str, *Ts, double]) -> Tuple[*Ts]: ...

t: Tuple[int, str, float, bool, double]
foo(t)  # Return has type Tuple[float, bool]
```
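The reason a single type variable tuple with an arbitrary prefix and suffix stays solvable can be sketched as follows: strip the fixed types off both ends and whatever is left is bound to `Ts`. This is a toy model with types represented as strings; `solve_single_variadic` is a hypothetical name, not anything from the PEP or from a checker.

```python
from typing import Optional, Sequence, Tuple


def solve_single_variadic(
    prefix: Sequence[str], suffix: Sequence[str], actual: Sequence[str]
) -> Optional[Tuple[str, ...]]:
    """Return the types bound to *Ts, or None if `actual` does not match."""
    n_pre, n_suf = len(prefix), len(suffix)
    if len(actual) < n_pre + n_suf:
        return None  # too short to contain the fixed prefix and suffix
    if tuple(actual[:n_pre]) != tuple(prefix):
        return None
    if tuple(actual[len(actual) - n_suf:]) != tuple(suffix):
        return None
    # Everything between the fixed prefix and suffix belongs to Ts.
    return tuple(actual[n_pre:len(actual) - n_suf])


# Tuple[int, str, *Ts, double] matched against Tuple[int, str, float, bool, double]
bound = solve_single_variadic(
    ["int", "str"], ["double"], ["int", "str", "float", "bool", "double"]
)
```

Because the prefix and suffix lengths are fixed, there is at most one answer, which is what distinguishes this case from the multiple-tuple case discussed next.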
# Multiple type variable tuples
Thinking out loud - let's get straight about all the places this could occur.
## Function arguments
Example 1:

```python
def func(spam: Tuple[*Ts1, *Ts2]): ...

spam: Tuple[int, str, bool]
func(spam)
```
This wouldn't work: how would we decide which types were bound to `Ts1` and which were bound to `Ts2`? (Ignore the fact that type variables are only used once in the signature here.)
On the other hand, it *would* work if there were extra constraints - say, from other arguments whose type was unambiguous:
Example 2:

```python
def func(ham: Tuple[*Ts1], spam: Tuple[*Ts1, *Ts2]): ...

ham: Tuple[int, float]
spam: Tuple[int, float, double, str]
func(ham, spam)
```
**Conclusion: sometimes alright, sometimes not.**
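The difference between Examples 1 and 2 can be made concrete by enumerating every way of splitting the argument's types between `Ts1` and `Ts2`. This is a toy model with types represented as strings; `candidate_splits` is a hypothetical helper, not a description of how any real checker solves constraints.

```python
from typing import List, Optional, Sequence, Tuple

Split = Tuple[Tuple[str, ...], Tuple[str, ...]]


def candidate_splits(
    actual: Sequence[str], known_ts1: Optional[Sequence[str]] = None
) -> List[Split]:
    """List every (Ts1, Ts2) binding that makes Tuple[*Ts1, *Ts2] equal `actual`."""
    actual = tuple(actual)
    splits = [(actual[:i], actual[i:]) for i in range(len(actual) + 1)]
    if known_ts1 is not None:
        # Another argument has already pinned Ts1 down, as `ham` does in Example 2.
        splits = [s for s in splits if s[0] == tuple(known_ts1)]
    return splits


example_1 = candidate_splits(["int", "str", "bool"])
example_2 = candidate_splits(
    ["int", "float", "double", "str"], known_ts1=["int", "float"]
)
```

Example 1 yields four candidate bindings, so there is nothing to choose between them, while Example 2 yields exactly one.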
## Function returns
Can this work?
Example 3:

```python
def foo() -> Tuple[*Ts1, *Ts2]:
    return 0, 0.0, '0'
```
On the face of it, we have the same problem. But in practice, we'd never encounter this example, because `Ts1` and `Ts2` would have had to occur somewhere else in the signature, which would have nailed them down:
Example 4:

```python
def foo(ham: Tuple[*Ts1], spam: Tuple[*Ts2]) -> Tuple[*Ts1, *Ts2]: ...

ham: Tuple[int, str]
spam: Tuple[float, double]
foo(ham, spam)  # Inferred type is Tuple[int, str, float, double]
```
**Conclusion: always fine.**
## Classes
Example 5:

```python
class C(Generic[*Ts1, *Ts2]): ...

c: C[int, str, float] = C()
```
Same problem as function arguments. And this time, I don't think there's any way to add extra constraints to disambiguate.
**Conclusion: never alright.**
If, for some reason, we did want a class that was generic in multiple type tuple variables, the current proposal in the PEP is:
Example 6:

```python
class C(Generic[Ts1, Ts2]): ...

c: C[Tuple[int, str], Tuple[float]] = C()  # Great!
c: C[int, str, float]  # Not allowed
```
---
OK, so the example that Pradeep suggested...
```python
def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ...
```
...is similar to Example 2: the ``Callable`` is ambiguous on its own, but there's extra context in the rest of the signature which disambiguates it.
---
So overall, the three options I see are:
* Option 1: Disallow multiple expanded type variable tuples everywhere, for consistency and ease-of-understanding.
* Option 2: Only allow multiple expanded type variable tuples in contexts where it's *always* unambiguous - i.e. only in return types.
* Option 3: Allow multiple expanded type variable tuples in general, but have the type checker produce an error when the types cannot be solved for.
Thoughts?
Ugh, sorry to flip-flop - it's only in finding a big enough chunk of time to really sit down and work on this again that things are becoming clearer...
I think allowing unexpanded type tuple variables would be a mistake
I disagree, past Matthew! :)

First, it makes the mental model really confusing. If we were to say that `Tuple[*Ts]` is valid, but `Ts` on its own isn't, then `Ts` is somehow a tuple of types that is nonetheless not an actual `Tuple`. We would be saying that `Ts` behaves like a parameterised tuple, but can only be used exactly like one in certain cases.

Relatedly, we do still have to allow unexpanded type variable tuples *sometimes* - e.g. `Map`:

```python
Map[List, Ts]
```

If we *were* to disallow unexpanded type variable tuples in general, then the rule would have to be something like "Type variable tuples should always be used expanded, except in `Map`", which seems horribly kludgy.
a) un-Pythonic
If the issue is that there should only be one way to do it, then fine - we encourage users to use plain `Ts` in the first place, instead of `Tuple[*Ts]`.
b) undermines one of the reasons we thought of using the star operator here in the first place: to make it clear when the thing in question is a type variable *tuple* rather than just a plain type variable.
I think this is worth giving up for the sake of a more consistent mental model about what kind of a thing a type variable tuple is. Plus, we'll still be using the star in *most* cases.

A final argument in favour of allowing unexpanded type variable tuples is that it saves on keystrokes and verbosity. `-> Ts` is nicer than `-> Tuple[*Ts]`. :)

So in summary, we'd write:

```python
class Tensor(Generic[*Shape]): ...
t: Tensor[Height, Width]

class Tensor2(Generic[Shape]): ...
t2: Tensor2[Tuple[Height, Width]]

class MultiTensor(Generic[Shape1, Shape2]): ...
mt: MultiTensor[Tuple[Time, Batch], Tuple[Height, Width]]

def args_to_tuples(*args: *Ts) -> Ts: ...

class Process:
    def __init__(self, target: Callable[[*Ts], Any], args: Ts): ...
```

On Wed, 23 Dec 2020 at 15:02, Matthew Rahtz <mrahtz@google.com> wrote:
P.P.S. Pradeep - did you say you had an example of where a class generic in multiple type variable tuples *would* be necessary?
On Wed, 23 Dec 2020 at 15:01, Matthew Rahtz <mrahtz@google.com> wrote:
P.S. If we did go for Option 3, we could still make classes generic in multiple type tuple variables by using the explicit syntax:
```python class C(Generic[Tuple[*Ts1], Tuple[*Ts2]): ...
c: C[Tuple[int, str], Tuple[float]] ```
(Aside: this was previously written in the PEP using 'unexpanded' type tuple variables:
```python class C(Generic[Ts1, Ts2]): ... ```
But having thought on it for a week, I think allowing unexpanded type tuple variables would be a mistake - it would mean there are two ways to write the following:
```python def identity(x: Tuple[*Ts]) -> Tuple[*Ts]: ... # could also be written as def identity(x: Ts) -> Ts: ... ```
This is a) un-Pythonic, and b) undermines one of the reasons we thought of using the star operator here in the first place: to make it clear when the thing in question is a type variable *tuple* rather than just a plain type variable.)
On Wed, 23 Dec 2020 at 14:33, Matthew Rahtz <mrahtz@google.com> wrote:
(Moving some discussion here from the doc)
# Concatenating type variable tuples with other types
In all cases we should also support extra single types before and after, e.g. Tuple[int, *Ts, str].
In cases where there's only a single type variable tuple, it should be fine to allow an *arbitrary* number of concrete types before and after, shouldn't it?
```python def foo(t: Tuple[int, str, *Ts, double]) -> Tuple[*Ts]: ... t: Tuple[int, str, float, bool, double] foo(t) # Return has type Tuple[float, bool] ```
# Multiple type variable tuples
Thinking out loud - let's get straight about all the places this could occur.
## Function arguments
Example 1: ```python def func(spam: Tuple[*Ts1, *Ts2]): ... spam: Tuple[int, str, bool] func(spam) ```
This wouldn't work: how would we decide which types were bound to `Ts1` and which were bound to `Ts2`? (Ignore the fact that type variables are only used once in the signature here.)
On the other hand, it *would* work if there were extra constraints - say, from other arguments whose type was unambiguous:
Example 2: ```python def func(ham: Tuple[*Ts1], spam: Tuple[*Ts1, *Ts2]): ... ham: Tuple[int, float] spam: Tuple[int, float, double, str] func(ham, spam) ```
**Conclusion: sometimes alright, sometimes not.**
## Function returns
Can this work?
Example 3: ```python def foo() -> Tuple[*Ts1, *Ts]: return 0, 0.0, '0' ```
On the face of it, we have the same problem. But in practice, we'd never encounter this example, because `Ts1` and `Ts2` would have had to occur somewhere else in the signature, which would have nailed them down:
Example 4: ```python def foo(ham: Tuple[*Ts1], spam: Tuple[*Ts2]) -> Tuple[*Ts1, *Ts2]: ... ham: Tuple[int, str] spam: Tuple[float, double] foo(ham, spam) # Inferred type is Tuple[int, str, float, double] ```
**Conclusion: always fine.**
## Classes
Example 5: ```python class C(Generic[*Ts1, *Ts2]): ... c: C[int, str, float] = C() ```
Same problem as function arguments. And this time, I don't think there's any way to add extra constraints to disambiguate.
**Conclusion: never alright.**
If, for some reason, we did want a class that was generic in multiple type tuple variables, the current proposal in the PEP is:
Example 6: ```python class C(Generic[Ts1, Ts2]): ... c: C[Tuple[int, str], Tuple[float]] = C() # Great! c: C[int, str, float] # Not allowed ```
---
OK, so the example that Pradeep suggested...
```python def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ... ```
...is similar to Example 2: the ``Callable`` is ambiguous on its own, but there's extra context in the rest of the signature which disambiguates it.
---
So overall, the three options I see are:
* Option 1: Disallow multiple expanded type variables tuples everywhere, for consistency and ease-of-understanding * Option 2: Only allow multiple expanded type variable tuples in contexts where it's *always* unambiguous - i.e. only in return types. * Option 3: Allow multiple expanded type variable tuples in general, but have the type checker produce an error when the types cannot be solved for.
Thoughts?
On Wed, 23 Dec 2020 at 11:02, Matthew Rahtz <mrahtz@google.com> wrote:
Thank you for sponsoring this, Guido, and for the thorough review!
I wonder why the proposal left out `Union[*Ts]`.
Ah, yes, great point. I'll add a section on that.
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though).
This has been on the back of my mind too. Adding a single additional non-variadic type variable is how I was imagining it would work too, though there are still some details to work out (e.g. ideally it should be optional so that people can choose what level of type verbosity they want to go with). I'll add a section trying to figure this out.
The other thing that's still unresolved is how we handle access to individual types - needed so that we can provide overloads of shape-manipulating operations. (I'm assuming that overloads are the way to go here, at least for the time being. In an ideal world we would be able to express the resulting shapes directly as a function of the arguments, but I don't think that'll be possible without fully dependent typing). My initial idea was to do this using "class overloads":
```python class Tensor(Generic[*Shape]): ...
@overload class Tensor(Generic[Axis1, Axis2]): def transpose(self) -> Tensor[Axis2, Axis1]: ...
@overload class Tensor(Generic[Axis1, Axis2, Axis3]): def transpose(self) -> Tensor[Axis3, Axis2, Axis1]: ... ```
But you're right in calling this out in the draft doc as non-trivial. It's also very verbose, requiring a whole separate class for each possible instantiation.
Instead, perhaps the following would suffice?
```python class Tensor(Generic[*Shape]):
@overload def transpose(self: Tensor[Axis1, Axis2]) -> Tensor[Axis2, Axis1]: ...
@overload def transpose(self: Tensor[Axis1, Axis2, Axis3]) -> Tensor[Axis3, Axis2, Axis1]: ... ```
This is similar to the following example, which already seems to type-check properly in mypy:
```python class C(Generic[T]):
@overload def f(self: C[int], x) -> int: return x
@overload def f(self: C[str], x) -> str: return x ```
I'd welcome other suggestions, though!
In any case, I'll continue cleaning up the doc as suggested, moving discussion of meatier issues to this thread for posterity, and post here once I think the doc is done.
On Tue, 22 Dec 2020 at 23:46, Guido van Rossum <guido@python.org> wrote:
I have read the proposed PEP about variadic generics (PEP 646) and I like it enough that I want to sponsor it and want to help getting it over the finish line (we have to get the Steering Council to understand enough of it that they'll delegate approval to me :-).
For reference, here's the PR that proposes to add PEP 646: https://github.com/python/peps/pull/1740 And here's the original Google Doc: https://docs.google.com/document/d/1oXWyAtnv0-pbyJud8H5wkpIk8aajbkX-leJ8JXsE...
A good review starts by briefly summarizing the proposal being reviewed, so here's the proposal in my own words.
**Motivation A:** We want to create generic types that take an arbitrary number of type parameters, like Tuple. For example, Tensors where each dimension is a "type". There is a demonstration of this without variadics, but it requires defining types `Tensor1[T1]`, `Tensor2[T1, T2]`, etc.: https://github.com/deepmind/tensor_annotations. We want just `Tensor[T1]`, `Tensor[T1, T2]`, etc., for any number of parameters.
**Motivation B:** The type of functions like map() and zip() cannot be expressed using the existing type system. The simplest example would be the type of ``` def foo(*args): return args a = foo(42, "abc") # Should have type Tuple[int, str] ```
**Proposal:** Introduce a new kind of type variable that can be instantiated with an arbitrary number of types, some new syntax, and a new type operator: ``` Ts = TypeVarTuple("Ts") # NEW T = TypeVar("T")
def f(*args: *Ts) -> Tuple[*Ts]: ... class C(Generic[*Ts]): ... Callable[[*Ts], T] Tuple[*Ts]
Map[SomeType, Ts] # SomeType is a generic of one parameter ``` In most cases the form `*Ts` may be preceded and/or followed by any number of non-variadic types, e.g., `Tuple[int, int, *Ts, str]`. In cases where it's unambiguous, multiple variadic type variables are also allowed, e.g., `Tuple[*Ts1, *Ts2]`. For older Python versions, `Expand[Ts]` would mean the same as `*Ts`.
So now let me go on with my (generally favorable) review. (I left many detailed editorial comments in the Google Doc -- I will not repeat those here.)
I like the proposal a lot, and I am glad that we now have (apparently) a working prototype in Pyre. This has been on our wish list since at least 2016 -- much early discussion happened in https://github.com/python/typing/issues/193 and at various meetings at PyCon and at the Bay Area Typing Meetups (links in the PEP). The proposed syntax has cycled through endless variations, and I am fine with the current proposal, even though it is still slightly clunky. There's also https://github.com/python/typing/issues/513, which is specifically about array types.
There are probably other motivating applications that the PEP doesn't mention, for example certain decorator types (I doubt that all of these are taken care by PEP 612, ParamSpec).
I wonder why the proposal left out `Union[*Ts]`. This would seem useful, e.g. to type this function:
```
def f(*args):
    return random.choice(args)
```
which could be typed naturally as follows:
```
def f(*args: *Ts) -> Union[*Ts]:
    return random.choice(args)
```
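To make the intended result concrete, here is a small runtime sketch of what `Union[*Ts]` would resolve to for a given call. It only mimics the idea with today's `typing.Union`; the helper name `runtime_union` is made up for illustration and is not part of the proposal, and a real type checker would do this statically.

```python
import random
from typing import Union


def choose(*args):
    # Same body as the function under discussion.
    return random.choice(args)


def runtime_union(*args):
    # Hypothetical helper: build the Union of the argument types,
    # i.e. what Union[*Ts] would stand for when Ts is bound to these types.
    # Subscripting Union with a tuple is the same as Union[A, B, ...].
    return Union[tuple(type(a) for a in args)]


picked = choose(42, "abc")
assert isinstance(picked, (int, str))
```

Note that `Union` collapses duplicates, so `runtime_union(1, 2)` is just `int`.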
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though). There are also proposals for integer generics, which deserve their own PEP (presumably aiming at Python 3.11).
Eric Traut proposed an extension that would allow defining variadic subtypes of Sequence which behave similarly to Tuple (where `Tuple[int, int]` is a subtype of `Tuple[int, ...]` which is a subtype of `Sequence[int]`), but I'm not sure we would need that a lot -- we could always add that later.
The introduction of a prefix `*` operator requires new syntax in a few cases. While `Callable[[*Ts], T]` is already valid (the parser interprets this as sequence unpacking), `Tuple[*Ts]` is not, and neither is `def f(*a: *Ts)`. For `Tuple[*Ts]` we can piggy-back on PEP 637 (keyword indexing, which adds this as well), but for the `def` example we'll need to add something new specifically for this PEP. I think that's fine -- we can give it runtime semantics that turns `*Ts` into `(*Ts,)`, which is similar to the other places: at runtime it iterates over the argument, producing a tuple. In all cases we need to support `Expand[Ts]` as well for backwards compatibility with Python 3.9 and before.
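As a rough illustration of the "at runtime it iterates over the argument" idea, here is a toy stand-in. `ToyTypeVarTuple` and `Unpacked` are hypothetical names invented for this sketch, not the `typing` API; the only point is that ordinary sequence unpacking already calls `__iter__`, so a starred type variable tuple can present itself as a single marker element.

```python
class Unpacked:
    """Hypothetical marker standing for one starred type variable tuple."""

    def __init__(self, source):
        self.source = source

    def __repr__(self):
        return f"*{self.source.name}"


class ToyTypeVarTuple:
    """Toy stand-in for a type variable tuple; not the real typing class."""

    def __init__(self, name):
        self.name = name

    def __iter__(self):
        # Unpacking yields exactly one marker that remembers its origin.
        yield Unpacked(self)


Ts = ToyTypeVarTuple("Ts")
as_tuple = (*Ts,)            # what `*Ts` would turn into at runtime
as_list = [int, *Ts, str]    # the already-valid Callable[[...], T] position
print(as_tuple, as_list)
```

Under these assumptions `Callable[[*Ts], T]` needs no new grammar, while the `def f(*a: *Ts)` position still does.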
The `Map[]` operator is, as I said, fairly clunky. In the past various other syntaxes have been proposed. In particular, @sixolet proposed <https://github.com/python/typing/issues/513> a syntax that would allow defining `zip()` as follows:
```
def zip(*args: Iterable[Ts]) -> Iterator[Tuple[Ts, ...]]: ...
```
Compare this to what it would look like using the current proposal:
```
def zip(*args: *Map[Iterable, Ts]) -> Iterator[Ts]: ...
# Note that Iterator[Ts] is the same as Iterator[Tuple[*Ts]]
```
Sixolet's syntax made the iteration over the elements of Ts implicit, which is slightly shorter, and doesn't require "higher-order type functions" (is there an official name for that?), but also slightly more cryptic, and created yet another use for the ellipsis: `Tuple[Ts, ...]` is not quite analogous to `Tuple[T, ...]`, since the latter is *homogeneous* while the former is still heterogeneous. The new notation uses an explicit `Map[]` operator, which is similar to the choice we made in PEP 612 for `Concatenate[]`. (Speaking of this choice, we could drop the `*` prefix and rely purely on `Expand[]`, but that feels unnecessarily verbose, and we'll get most of the needed syntax for free with PEP 637, assuming it's accepted.)
All in all my recommendation for this PEP is: clean up the text based on the GDoc feedback, add `Union[*Ts]`, and submit to the Steering Council.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/> _______________________________________________ Typing-sig mailing list -- typing-sig@python.org To unsubscribe send an email to typing-sig-leave@python.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: mrahtz@google.com
# Concatenation of multiple TypeVar tuples
I suggest allowing a concatenated type (Tuple[*Ts, *Ts2]) when the *length* of at least one is unambiguously implied by other arguments.
Copying over my comment from GDoc: TypeScript allows this for `curry` (`partial` in Python):
```
def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ...

def foo(x: int, y: str, z: bool) -> int: ...

# We know the length of Ts based on the number of arguments.
partial(foo, 1)  # => Callable[[str, bool], int]
partial(foo, 1, "hello")  # => Callable[[bool], int]

# Note that this doesn't allow for keyword arguments,
# but it's still a "partial" improvement over the current signature :)
```
Link: TypeScript's variadic tuple types have a detailed summary in https://github.com/microsoft/TypeScript/pull/39094
In cases I've seen, the length is usually inferable. In perverse cases where it is not, we would raise an error (same as TypeScript):
```
def foo(f: Callable[[], Tuple[*Ts, *Ts2]]) -> Ts: ...

def bar() -> Tuple[int, str, bool]: ...

foo(bar)  # => Error: Unable to infer types of Ts and Ts2 when assigning `bar` to `f`.

# The same goes for class Protocols or callable Protocols that resemble the above.
# For example, a callable Protocol may be generic in its return type like `f`.
```
Things like `concat` would be straightforward:
```
def concat(t1: Tensor[*Ts], t2: Tensor[*Ts2]) -> Tensor[*Ts, *Ts2]: ...
```
Overall, this approach would keep concatenation intuitive while still being sound.
On Wed, Dec 23, 2020 at 9:51 AM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
Ugh, sorry to flip-flop - it's only in finding a big enough chunk of time to really sit down and work on this again that things are becoming clearer...
I think allowing unexpanded type tuple variables would be a mistake
I disagree, past Matthew! :)
First, it makes the mental model really confusing. If we were to say that `Tuple[*Ts]` is valid, but `Ts` on its own isn't, then `Ts` is somehow a tuple of types that is nonetheless not an actual `Tuple`. We would be saying that `Ts` behaves like a parameterised tuple, but can only be used exactly like one in certain cases.
Relatedly, we do still have to allow unexpanded type variable tuples *sometimes* - e.g. `Map`:
```python Map[List, Ts] ```
If we *were* to disallow unexpanded type variable tuples in general, then the rule would have to be something like "Type variable tuples should always be used expanded, except in `Map`", which seems horribly kludgy.
a) un-Pythonic
If the issue is that there should only be one way to do it, then fine - we encourage users to use plain `Ts` in the first place, instead of `Tuple[*Ts]`.
b) undermines one of the reasons we thought of using the star operator here in the first place: to make it clear when the thing in question is a type variable *tuple* rather than just a plain type variable.
I think this is worth giving up for the sake of a more consistent mental model about what kind of a thing a type variable tuple is. Plus, we'll still be using the star in *most* cases.
A final argument in favour of allowing unexpanded type variables is that it saves on keystrokes and verbosity. `-> Ts` is nicer than `-> Tuple[*Ts]`. :)
So in summary, we'd write:
```python
class Tensor(Generic[*Shape]): ...
t: Tensor[Height, Width]

class Tensor2(Generic[Shape]): ...
t2: Tensor2[Tuple[Height, Width]]

class MultiTensor(Generic[Shape1, Shape]): ...
mt: MultiTensor[Tuple[Time, Batch], Tuple[Height, Width]]

def args_to_tuples(*args: *Ts) -> Ts: ...

class Process:
    def __init__(self, target: Callable[[*Ts], Any], args: Ts): ...
```
On Wed, 23 Dec 2020 at 15:02, Matthew Rahtz <mrahtz@google.com> wrote:
P.P.S. Pradeep - did you say you had an example of where a class generic in multiple type variable tuples *would* be necessary?
On Wed, 23 Dec 2020 at 15:01, Matthew Rahtz <mrahtz@google.com> wrote:
P.S. If we did go for Option 3, we could still make classes generic in multiple type tuple variables by using the explicit syntax:
```python
class C(Generic[Tuple[*Ts1], Tuple[*Ts2]]): ...

c: C[Tuple[int, str], Tuple[float]]
```
(Aside: this was previously written in the PEP using 'unexpanded' type tuple variables:
```python class C(Generic[Ts1, Ts2]): ... ```
But having thought on it for a week, I think allowing unexpanded type tuple variables would be a mistake - it would mean there are two ways to write the following:
```python
def identity(x: Tuple[*Ts]) -> Tuple[*Ts]: ...
# could also be written as
def identity(x: Ts) -> Ts: ...
```
This is a) un-Pythonic, and b) undermines one of the reasons we thought of using the star operator here in the first place: to make it clear when the thing in question is a type variable *tuple* rather than just a plain type variable.)
On Wed, 23 Dec 2020 at 14:33, Matthew Rahtz <mrahtz@google.com> wrote:
(Moving some discussion here from the doc)
# Concatenating type variable tuples with other types
In all cases we should also support extra single types before and after, e.g. Tuple[int, *Ts, str].
In cases where there's only a single type variable tuple, it should be fine to allow an *arbitrary* number of concrete types before and after, shouldn't it?
```python
def foo(t: Tuple[int, str, *Ts, double]) -> Tuple[*Ts]: ...

t: Tuple[int, str, float, bool, double]
foo(t)  # Return has type Tuple[float, bool]
```
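To see why a single type variable tuple with fixed types on either side is always solvable, here is a toy runtime matcher. It is only a sketch: `bind_middle` is a made-up helper, it compares types by equality rather than by subtyping, and `complex` stands in for the `double` used in the example above.

```python
def bind_middle(prefix, suffix, actual):
    """Return the types bound to the lone *Ts in Tuple[*prefix, *Ts, *suffix]."""
    n_pre, n_suf = len(prefix), len(suffix)
    if len(actual) < n_pre + n_suf:
        raise TypeError("too few element types for the fixed prefix and suffix")
    end = len(actual) - n_suf
    if tuple(actual[:n_pre]) != tuple(prefix):
        raise TypeError("prefix types do not match")
    if tuple(actual[end:]) != tuple(suffix):
        raise TypeError("suffix types do not match")
    # Whatever is left in the middle is the unique binding for Ts.
    return tuple(actual[n_pre:end])


bound = bind_middle((int, str), (complex,), (int, str, float, bool, complex))
print(bound)
```

The lengths of the prefix and suffix are known up front, so the split point is never in question, however many concrete types there are.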
# Multiple type variable tuples
Thinking out loud - let's get straight about all the places this could occur.
## Function arguments
Example 1:
```python
def func(spam: Tuple[*Ts1, *Ts2]): ...

spam: Tuple[int, str, bool]
func(spam)
```
This wouldn't work: how would we decide which types were bound to `Ts1` and which were bound to `Ts2`? (Ignore the fact that type variables are only used once in the signature here.)
On the other hand, it *would* work if there were extra constraints - say, from other arguments whose type was unambiguous:
Example 2:
```python
def func(ham: Tuple[*Ts1], spam: Tuple[*Ts1, *Ts2]): ...

ham: Tuple[int, float]
spam: Tuple[int, float, double, str]
func(ham, spam)
```
**Conclusion: sometimes alright, sometimes not.**
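A toy enumeration makes the difference between Examples 1 and 2 concrete. This is just a sketch with invented helper names (`all_splits`, `split_with_known_prefix`), comparing types by equality, and using `complex` in place of `double`.

```python
def all_splits(types):
    """Every way to divide `types` between Ts1 and Ts2 in Tuple[*Ts1, *Ts2]."""
    return [(tuple(types[:i]), tuple(types[i:])) for i in range(len(types) + 1)]


def split_with_known_prefix(ts1, types):
    """Solve for Ts2 once Ts1 has been fixed by another argument."""
    ts1 = tuple(ts1)
    if tuple(types[: len(ts1)]) != ts1:
        raise TypeError("spam does not start with the types bound to Ts1")
    return ts1, tuple(types[len(ts1):])


# Example 1: nothing else constrains Ts1, so every split is a candidate.
candidates = all_splits((int, str, bool))

# Example 2: `ham` pins Ts1 down, leaving exactly one answer for Ts2.
solved = split_with_known_prefix((int, float), (int, float, complex, str))
print(len(candidates), solved)
```

A tuple of n types has n + 1 possible splits, which is why the unconstrained case cannot be solved and the constrained one can.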
## Function returns
Can this work?
Example 3:
```python
def foo() -> Tuple[*Ts1, *Ts2]:
    return 0, 0.0, '0'
```
On the face of it, we have the same problem. But in practice, we'd never encounter this example, because `Ts1` and `Ts2` would have had to occur somewhere else in the signature, which would have nailed them down:
Example 4:
```python
def foo(ham: Tuple[*Ts1], spam: Tuple[*Ts2]) -> Tuple[*Ts1, *Ts2]: ...

ham: Tuple[int, str]
spam: Tuple[float, double]
foo(ham, spam)  # Inferred type is Tuple[int, str, float, double]
```
**Conclusion: always fine.**
## Classes
Example 5:
```python
class C(Generic[*Ts1, *Ts2]): ...

c: C[int, str, float] = C()
```
Same problem as function arguments. And this time, I don't think there's any way to add extra constraints to disambiguate.
**Conclusion: never alright.**
If, for some reason, we did want a class that was generic in multiple type tuple variables, the current proposal in the PEP is:
Example 6:
```python
class C(Generic[Ts1, Ts2]): ...

c: C[Tuple[int, str], Tuple[float]] = C()  # Great!
c: C[int, str, float]  # Not allowed
```
---
OK, so the example that Pradeep suggested...
```python
def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ...
```
...is similar to Example 2: the ``Callable`` is ambiguous on its own, but there's extra context in the rest of the signature which disambiguates it.
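For comparison, the runtime already does the equivalent bookkeeping: `inspect.signature` on a `functools.partial` object drops the positionally bound parameters. The sketch below only shows that runtime behaviour (it says nothing about what a static checker infers), reusing the `foo` from Pradeep's example with a trivial body added.

```python
import functools
import inspect


def foo(x: int, y: str, z: bool) -> int:
    return x


# Binding one positional argument removes `x`; binding two removes `x` and `y`.
remaining_one = inspect.signature(functools.partial(foo, 1))
remaining_two = inspect.signature(functools.partial(foo, 1, "hello"))

print(remaining_one)
print(remaining_two)
```

The number of bound arguments fixes the length of `Ts`, and whatever parameters are left over correspond to `Rs`.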
---
So overall, the three options I see are:
* Option 1: Disallow multiple expanded type variable tuples everywhere, for consistency and ease-of-understanding
* Option 2: Only allow multiple expanded type variable tuples in contexts where it's *always* unambiguous - i.e. only in return types.
* Option 3: Allow multiple expanded type variable tuples in general, but have the type checker produce an error when the types cannot be solved for.
Thoughts?
On Wed, 23 Dec 2020 at 11:02, Matthew Rahtz <mrahtz@google.com> wrote:
Thank you for sponsoring this, Guido, and for the thorough review!
I wonder why the proposal left out `Union[*Ts]`.
Ah, yes, great point. I'll add a section on that.
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though).
This has been on the back of my mind too. Adding a single additional non-variadic type variable is how I was imagining it would work too, though there are still some details to work out (e.g. ideally it should be optional so that people can choose what level of type verbosity they want to go with). I'll add a section trying to figure this out.
The other thing that's still unresolved is how we handle access to individual types - needed so that we can provide overloads of shape-manipulating operations. (I'm assuming that overloads are the way to go here, at least for the time being. In an ideal world we would be able to express the resulting shapes directly as a function of the arguments, but I don't think that'll be possible without fully dependent typing). My initial idea was to do this using "class overloads":
```python
class Tensor(Generic[*Shape]): ...

@overload
class Tensor(Generic[Axis1, Axis2]):
    def transpose(self) -> Tensor[Axis2, Axis1]: ...

@overload
class Tensor(Generic[Axis1, Axis2, Axis3]):
    def transpose(self) -> Tensor[Axis3, Axis2, Axis1]: ...
```
But you're right in calling this out in the draft doc as non-trivial. It's also very verbose, requiring a whole separate class for each possible instantiation.
Instead, perhaps the following would suffice?
```python
class Tensor(Generic[*Shape]):

    @overload
    def transpose(self: Tensor[Axis1, Axis2]) -> Tensor[Axis2, Axis1]: ...

    @overload
    def transpose(self: Tensor[Axis1, Axis2, Axis3]) -> Tensor[Axis3, Axis2, Axis1]: ...
```
This is similar to the following example, which already seems to type-check properly in mypy:
```python
class C(Generic[T]):

    @overload
    def f(self: C[int], x) -> int: return x

    @overload
    def f(self: C[str], x) -> str: return x
```
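For anyone who wants to try the self-type overload pattern, here is a self-contained variant that also runs. It is a sketch and differs from the snippet above in three ways that are my own choices: the overload bodies are `...`, a separate non-overloaded implementation is added (needed for the method to be callable at runtime), and the self annotations are quoted because `C` is not yet defined inside its own class body.

```python
from typing import Generic, TypeVar, Union, overload

T = TypeVar("T")


class C(Generic[T]):
    # Each overload applies only when `self` has the given parameterisation.
    @overload
    def f(self: "C[int]", x: int) -> int: ...

    @overload
    def f(self: "C[str]", x: str) -> str: ...

    def f(self, x: Union[int, str]) -> Union[int, str]:
        # Single runtime implementation shared by both overloads.
        return x


c_int: C[int] = C()
c_str: C[str] = C()
print(c_int.f(3), c_str.f("a"))
```

The `Tensor.transpose` overloads would follow the same shape, with one overload per supported number of axes.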
I'd welcome other suggestions, though!
In any case, I'll continue cleaning up the doc as suggested, moving discussion of meatier issues to this thread for posterity, and post here once I think the doc is done.
On Tue, 22 Dec 2020 at 23:46, Guido van Rossum <guido@python.org> wrote:
I have read the proposed PEP about variadic generics (PEP 646) and I like it enough that I want to sponsor it and want to help getting it over the finish line (we have to get the Steering Council to understand enough of it that they'll delegate approval to me :-).
For reference, here's the PR that proposes to add PEP 646: https://github.com/python/peps/pull/1740 And here's the original Google Doc: https://docs.google.com/document/d/1oXWyAtnv0-pbyJud8H5wkpIk8aajbkX-leJ8JXsE...
A good review starts by briefly summarizing the proposal being reviewed, so here's the proposal in my own words.
**Motivation A:** We want to create generic types that take an arbitrary number of type parameters, like Tuple. For example, Tensors where each dimension is a "type". There is a demonstration of this without variadics, but it requires defining types `Tensor1[T1]`, `Tensor2[T1, T2]`, etc.: https://github.com/deepmind/tensor_annotations. We want just `Tensor[T1]`, `Tensor[T1, T2]`, etc., for any number of parameters.
**Motivation B:** The type of functions like map() and zip() cannot be expressed using the existing type system. The simplest example would be the type of
```
def foo(*args):
    return args

a = foo(42, "abc")  # Should have type Tuple[int, str]
```
**Proposal:** Introduce a new kind of type variable that can be instantiated with an arbitrary number of types, some new syntax, and a new type operator:
```
Ts = TypeVarTuple("Ts")  # NEW
T = TypeVar("T")

def f(*args: *Ts) -> Tuple[*Ts]: ...
class C(Generic[*Ts]): ...
Callable[[*Ts], T]
Tuple[*Ts]

Map[SomeType, Ts]  # SomeType is a generic of one parameter
```
In most cases the form `*Ts` may be preceded and/or followed by any number of non-variadic types, e.g., `Tuple[int, int, *Ts, str]`. In cases where it's unambiguous, multiple variadic type variables are also allowed, e.g., `Tuple[*Ts1, *Ts2]`. For older Python versions, `Expand[Ts]` would mean the same as `*Ts`.
So now let me go on with my (generally favorable) review. (I left many detailed editorial comments in the Google Doc -- I will not repeat those here.)
I like the proposal a lot, and I am glad that we now have (apparently) a working prototype in Pyre. This has been on our wish list since at least 2016 -- much early discussion happened in https://github.com/python/typing/issues/193 and at various meetings at PyCon and at the Bay Area Typing Meetups (links in the PEP). The proposed syntax has cycled through endless variations, and I am fine with the current proposal, even though it is still slightly clunky. There's also https://github.com/python/typing/issues/513, which is specifically about array types.
There are probably other motivating applications that the PEP doesn't mention, for example certain decorator types (I doubt that all of these are taken care of by PEP 612, ParamSpec).
I wonder why the proposal left out `Union[*Ts]`. This would seem useful, e.g. to type this function:
```
def f(*args):
    return random.choice(args)
```
which could be typed naturally as follows:
```
def f(*args: *Ts) -> Union[*Ts]:
    return random.choice(args)
```
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though). There are also proposals for integer generics, which deserve their own PEP (presumably aiming at Python 3.11).
Eric Traut proposed an extension that would allow defining variadic subtypes of Sequence which behave similarly to Tuple (where `Tuple[int, int]` is a subtype of `Tuple[int, ...]` which is a subtype of `Sequence[int]`), but I'm not sure we would need that a lot -- we could always add that later.
The introduction of a prefix `*` operator requires new syntax in a few cases. While `Callable[[*Ts], T]` is already valid (the parser interprets this as sequence unpacking), `Tuple[*Ts]` is not, and neither is `def f(*a: *Ts)`. For `Tuple[*Ts]` we can piggy-back on PEP 637 (keyword indexing, which adds this as well), but for the `def` example we'll need to add something new specifically for this PEP. I think that's fine -- we can give it runtime semantics that turns `*Ts` into `(*Ts,)`, which is similar to the other places: at runtime it iterates over the argument, producing a tuple. In all cases we need to support `Expand[Ts]` as well for backwards compatibility with Python 3.9 and before.
The `Map[]` operator is, as I said, fairly clunky. In the past various other syntaxes have been proposed. In particular, @sixolet proposed <https://github.com/python/typing/issues/513> a syntax that would allow defining `zip()` as follows:
```
def zip(*args: Iterable[Ts]) -> Iterator[Tuple[Ts, ...]]: ...
```
Compare this to what it would look like using the current proposal:
```
def zip(*args: *Map[Iterable, Ts]) -> Iterator[Ts]: ...
# Note that Iterator[Ts] is the same as Iterator[Tuple[*Ts]]
```
Sixolet's syntax made the iteration over the elements of Ts implicit, which is slightly shorter, and doesn't require "higher-order type functions" (is there an official name for that?), but also slightly more cryptic, and created yet another use for the ellipsis: `Tuple[Ts, ...]` is not quite analogous to `Tuple[T, ...]`, since the latter is *homogeneous* while the former is still heterogeneous. The new notation uses an explicit `Map[]` operator, which is similar to the choice we made in PEP 612 for `Concatenate[]`. (Speaking of this choice, we could drop the `*` prefix and rely purely on `Expand[]`, but that feels unnecessarily verbose, and we'll get most of the needed syntax for free with PEP 637, assuming it's accepted.)
All in all my recommendation for this PEP is: clean up the text based on the GDoc feedback, add `Union[*Ts]`, and submit to the Steering Council.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
-- S Pradeep Kumar
Matthew Rahtz wrote:
## Function arguments ... Conclusion: sometimes alright, sometimes not. ## Function returns ... Conclusion: always fine. ## Classes ... Conclusion: never alright.
While I agree with most of the analysis here, I think some of the conclusions can be further amended. With higher-order functions, one can easily move function returns into parameter position and function parameters into return position. So whether the concatenation is fine or not is not really tied to whether it occurs at parameter or return position. For example:
```
def higher_order() -> Callable[[Tuple[*Ts1, *Ts2]], None]: ...
```
This function puts the concatenation in the return type, but it runs into the same problem as your Example 1: we have no way to decide which types are bound to Ts1 and which are bound to Ts2 inside the function. A similar trick can be played to flip the parameters/returns in your Examples 2, 3, and 4. As a result, I think it would be really hard to implement your Option 2, since there is no easy way of telling when it's always unambiguous by only looking at the parameter/return positions.
Regarding concatenation on classes, I do think there's a way to add extra context via method calls:
```
class C(Generic[*Ts1, *Ts2]):
    def __init__(self, t0: Tuple[*Ts1], t1: Tuple[*Ts2]) -> None: ...

def test() -> None:
    reveal_type(C((1, "a"), ("b", 2)))  # should be C[int, str, str, int]
```
It is true that for class `C` you will never be able to spell out any explicit type annotations (e.g. one can't just write `c: C[int, str, int]` since there's no way to disambiguate Ts1 and Ts2), but other than that things shouldn't work too differently from variadic functions. That said, I suspect allowing this is going to make type inference on the caller side a bit trickier, so I have no strong objection to banning it either. Another possibility is to make this an optional feature so each type checker can choose to implement it or not.
Overall I would champion either your Option 1 or Option 3, slightly in favor of Option 3 for more flexibility.
If, for some reason, we did want a class that was generic in multiple type tuple variables, the current proposal in the PEP is: Example 6: class C(Generic[Ts1, Ts2]): ... c: C[Tuple[int, str], Tuple[float]] = C() # Great! c: C[int, str, float] # Not allowed
OK, so the example that Pradeep suggested... def partial(f: Callable[[*Ts, *Rs], T], *some_args: *Ts) -> Callable[[*Rs], T]: ...
...is similar to Example 2: the Callable is ambiguous on its own, but there's extra context in the rest of the signature which disambiguates it.
So overall, the three options I see are:
Option 1: Disallow multiple expanded type variables tuples everywhere, for consistency and ease-of-understanding Option 2: Only allow multiple expanded type variable tuples in contexts where it's always unambiguous - i.e. only in return types. Option 3: Allow multiple expanded type variable tuples in general, but have the type checker produce an error when the types cannot be solved for.
Thoughts?
On Wed, Dec 23, 2020 at 3:02 AM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
Thank you for sponsoring this, Guido, and for the thorough review!
You're welcome.
I wonder why the proposal left out `Union[*Ts]`.
Ah, yes, great point. I'll add a section on that.
Looking forward to that.
I'm not sure that `Tensor[T1, T2, ...]` is the be-all and end-all of tensor types (e.g. where would you put the data type of numpy arrays?) but maybe that can be handled by just adding one non-variadic type variable (a complete example would be nice though).
This has been on the back of my mind too. Adding a single additional non-variadic type variable is how I was imagining it would work too, though there are still some details to work out (e.g. ideally it should be optional so that people can choose what level of type verbosity they want to go with). I'll add a section trying to figure this out.
Making it optional would be complicated, given that the rest is variadic. Maybe a future PEP could use keyword indexing (PEP 637) but for now that's not an option yet.
The other thing that's still unresolved is how we handle access to individual types - needed so that we can provide overloads of shape-manipulating operations. (I'm assuming that overloads are the way to go here, at least for the time being. In an ideal world we would be able to express the resulting shapes directly as a function of the arguments, but I don't think that'll be possible without fully dependent typing).
Actually, Alfonso Castaño has quietly been working on a way to express this. I believe you know him?
My initial idea was to do this using "class overloads":
```python
class Tensor(Generic[*Shape]): ...

@overload
class Tensor(Generic[Axis1, Axis2]):
    def transpose(self) -> Tensor[Axis2, Axis1]: ...

@overload
class Tensor(Generic[Axis1, Axis2, Axis3]):
    def transpose(self) -> Tensor[Axis3, Axis2, Axis1]: ...
```
But you're right in calling this out in the draft doc as non-trivial. It's also very verbose, requiring a whole separate class for each possible instantiation.
I think it would be best to drop that idea from this PEP.
Instead, perhaps the following would suffice?
```python
class Tensor(Generic[*Shape]):

    @overload
    def transpose(self: Tensor[Axis1, Axis2]) -> Tensor[Axis2, Axis1]: ...

    @overload
    def transpose(self: Tensor[Axis1, Axis2, Axis3]) -> Tensor[Axis3, Axis2, Axis1]: ...
```
This is similar to the following example, which already seems to type-check properly in mypy:
```python
class C(Generic[T]):

    @overload
    def f(self: C[int], x) -> int: return x

    @overload
    def f(self: C[str], x) -> str: return x
```
Yes, that sounds manageable.
I'd welcome other suggestions, though!
In any case, I'll continue cleaning up the doc as suggested, moving discussion of meatier issues to this thread for posterity, and post here once I think the doc is done.
Let me know how I can help. I am subscribed to the peps repo so when you send a PR there it will automatically notify me. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
As Guido said, while working on type arithmetic I also explored some ideas about how to deal with the exact length of a variadic, as well as dimension-specific operations. The idea does not need any sort of dependent typing, just to take advantage of literal types. I did not comment about this earlier since I thought that this PEP should be as minimal as possible. The main idea is that, since we may need a custom operator on variadics for each specific use case, it would be better to have something more abstract.
https://docs.google.com/document/d/1IlByrIjZPPxTa_1ZWHLZDaQrxdGrAbL71plzmXi0...
TLDR: `def f(x : Tensor[Ts], y : |Ts|): ...` or `def f(x : Tensor[Ts[N]], y : N): ...`
My initial motivation was to leverage the length operator on variadics (|_|) that I wanted to introduce to also constrain the length of the variadics, so that no more operators would be needed for that task (apart from that operator of course). However, nowadays I think that there is no need to rely on the length operator for that and we could use one of the alternative syntaxes.
Regarding the current PEP, congratulations on the progress so far! However, I might have missed something, but where are the constraints that a variadic can have (variance, bound...) specified? And the subtyping rules for variadics?
I suggest allowing a concatenated type (Tuple[*Ts, *Ts2]) when the *length* of at least one is unambiguously implied by other arguments.
I think that's reasonable, if no one has any objections on the grounds that we make things too hard for static checkers. I've tentatively reworked the section in question to reflect this.
Regarding concatenation on classes, I do think there's a way to add extra contexts via method calls:
Oh, huh, interesting. But not being able to actually write down unambiguously exactly what type it is seems like a major downside. Having to consult the constructor arguments to figure out the type also seems really complicated. At the moment I'm leaning towards not allowing it - unless anyone has any suggestions of a concrete use case for `class C(Generic[*Ts1, *Ts2])`?
The idea does not need any sort of dependent typing, just to take advantage of literal types.
Oh, super cool! Looking forward to seeing more of this :)
However, I might have missed something, but where are the constraints that a variadic can have (variance, bound...) specified?
Ah, yeah - variance and bound are currently not supported. I think figuring out how they should work is going to be tricky, so best left for a future PEP, once we have more experience with using variadic generics in practice. I've added a short section stating this explicitly.
And the subtyping rules for variadics?
Do you mean something like this?
```python
DType = TypeVar('DType')
Shape = TypeVarTuple('Shape')

class Array(Generic[DType, *Shape]): ...

class Float32Array(Array[np.float32, *Shape]): ...
```
If so, hopefully this should be clear from the new section on 'An Ideal Array Type'. If you mean something different, could you clarify?
---
Latest changes at https://github.com/python/peps/pull/1751. I think this incorporates most of the feedback, but let me know if I've missed anything.
One significant change I've (tentatively) made is renaming `Expand` to `Unpack`, to reflect the terminology we use with regular tuples. I'm surprised no one else has suggested this, so I might be missing something - are there any arguments against calling what we're doing 'unpacking'?
On Mon, 28 Dec 2020 at 19:27, Alfonso L. Castaño <alfonsoluis.castanom@um.es> wrote:
As Guido said, while working on type arithmetic I also explored some ideas about how to deal with the exact length of a variadic, as well as dimension specific operations. The idea does not need any sort of dependent typing, just to take advantage of literal types. I did not comment about this earlier since I thought that this PEP should be as minimal as possible. The main idea is that, since we may need a custom operator on variadics for each specific use case, it would be better to have something more abstract.
https://docs.google.com/document/d/1IlByrIjZPPxTa_1ZWHLZDaQrxdGrAbL71plzmXi0...
TLDR: def f(x : Tensor[Ts], y : |Ts|): ... or def f(x : Tensor[Ts[N]], y : N): ...
My initial motivation was to leverage the length operator on variadics (|_|) that I wanted to introduce to also constrain the length of the variadics, so that no more operators would be needed for that task (apart from that operator of course). However, nowadays I think that there is no need to rely on the length operator for that and we could use one of the alternative syntaxes.
Regarding the current PEP, congratulations on the progress so far! However, I might have missed something, but where are the constraints that a variadic can have (variance, bound...) specified? And the subtyping rules for variadics?
@Alfonso: That seems like an interesting and feasible idea! Overall, I'd like to defer this for now since it would be fully backward-compatible. It probably belongs in the future type arithmetic PEP.
We could use the `Length` operator from type arithmetic, where Length[Ts] resolves to a Literal. For example, if Ts = `Tuple[int, str]`, `Length[Ts] == Literal[2]`.
Concrete example (translating your doc example to the PEP syntax):
```python
def mean(t: Tensor[*Ts, T, *Ts2], dim: Length[Ts]) -> Tensor[*Ts, *Ts2]: ...

L = Literal
t: Tensor[L[3], L[4], L[5]]
mean(t, dim=1)  # => Tensor[L[3], L[5]]
```
We could also look at alternative syntax in the future, but would like to punt on this for now.
@Matthew:
At the moment I'm leaning towards not allowing it - unless anyone has any suggestions of a concrete use case for `class C(Generic[*Ts1, *Ts2])`?
Yeah, I strongly prefer forbidding such a syntax since an explicit declaration `C[int, str]` is inherently ambiguous.
One significant change I've (tentatively) made is renaming `Expand` to `Unpack`, to reflect the terminology we use with regular tuples. I'm surprised no one else has suggested this, so I might be missing something - are there any arguments against calling what we're doing 'unpacking'?
I think that's reasonable.

On Tue, Dec 29, 2020 at 7:26 AM Matthew Rahtz via Typing-sig <typing-sig@python.org> wrote:
I suggest allowing a concatenated type (Tuple[*Ts, *Ts2]) when the *length* of at least one is unambiguously implied by other arguments.
I think that's reasonable, if no one has any objections on the grounds that we make things too hard for static checkers. I've tentatively reworked the section in question to reflect this.
Regarding concatenation on classes, I do think there's a way to add extra contexts via method calls:
Oh, huh, interesting. But not being able to actually write down, unambiguously, exactly what type it is seems like a major downside. Having to consult the constructor arguments to figure out the type also seems really complicated. At the moment I'm leaning towards not allowing it - unless anyone has any suggestions of a concrete use case for `class C(Generic[*Ts1, *Ts2])`?
The idea does not need any sort of dependent typing, just to take advantage of literal types.
Oh, super cool! Looking forward to seeing more of this :)
However, I might have missed something, but where are the constraints that a variadic can have (variance, bound...) specified?
Ah, yeah - variance and bound are currently not supported. I think figuring out how they should work is going to be tricky, so best left for a future PEP, once we have more experience with using variadic generics in practice. I've added a short section stating this explicitly.
And the subtyping rules for variadics?
Do you mean something like this?
```python
DType = TypeVar('DType')
Shape = TypeVarTuple('Shape')

class Array(Generic[DType, *Shape]): ...

class Float32Array(Array[np.float32, *Shape]): ...
```
If so, hopefully this should be clear from the new section on 'An Ideal Array Type'. If you mean something different, could you clarify?
---
Latest changes at https://github.com/python/peps/pull/1751. I think this incorporates most of the feedback, but let me know if I've missed anything.
One significant change I've (tentatively) made is renaming `Expand` to `Unpack`, to reflect the terminology we use with regular tuples. I'm surprised no one else has suggested this, so I might be missing something - are there any arguments against calling what we're doing 'unpacking'?
On Mon, 28 Dec 2020 at 19:27, Alfonso L. Castaño < alfonsoluis.castanom@um.es> wrote:
[...]
-- S Pradeep Kumar
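As a rough runtime illustration of the `Length`-based `mean` example in the message above (not PEP syntax, and not how a type checker would implement it), the sketch below models a tensor shape as a plain tuple of ints. The names `length` and `mean_shape` are hypothetical helpers introduced only for this illustration; the assumption is that `dim` equals the length of the prefix `Ts`, so the axis at index `dim` is the one that is reduced away.

```python
from typing import Tuple


def length(ts: Tuple[int, ...]) -> int:
    """Runtime stand-in for the proposed Length[Ts] operator (hypothetical)."""
    return len(ts)


def mean_shape(shape: Tuple[int, ...], dim: int) -> Tuple[int, ...]:
    """Return the shape left after reducing over axis `dim`.

    Mirrors Tensor[*Ts, T, *Ts2] -> Tensor[*Ts, *Ts2] with dim == Length[Ts].
    """
    if not 0 <= dim < len(shape):
        raise ValueError("dim must index an existing axis")
    prefix, suffix = shape[:dim], shape[dim + 1:]  # Ts and Ts2
    assert length(prefix) == dim  # the constraint dim: Length[Ts]
    return prefix + suffix


print(mean_shape((3, 4, 5), dim=1))  # (3, 5)
```

The point of the static version is that a checker could compute the same result from `Literal` types alone, without running any code.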
On 12/29/20 12:24 PM, S Pradeep Kumar wrote:
@Alfonso:
That seems like an interesting and feasible idea!
Overall, I'd like to defer this for now since it would be fully backward-compatible. It probably belongs in the future type arithmetic PEP.
Agreed. The existing PEP is already fairly complex.

On 12/29/20 12:24 PM, S Pradeep Kumar wrote:
One significant change I've (tentatively) made is renaming `Expand` to `Unpack`, to reflect the terminology we use with regular tuples. I'm surprised no one else has suggested this, so I might be missing something - are there any arguments against calling what we're doing 'unpacking'?
I think that's reasonable.
It looks like there is precedent for use of the term "unpacking" in the existing Python documentation:

- "iterable unpacking", when talking about `*args`: https://docs.python.org/3/reference/expressions.html#expression-lists
- "unpacking argument lists", when talking about `*args`: https://docs.python.org/3/tutorial/controlflow.html#unpacking-argument-lists
- "dictionary unpacking", when talking about `**kwargs`: https://docs.python.org/3/reference/expressions.html#dictionary-displays

So even though I personally still like Expand, I agree that Unpack would probably be more consistent with existing documentation (and search engine keywords).

Best,
--
David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
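The documented uses of "unpacking" listed above can be seen side by side in a short sketch. The function `describe` and the sample values are made up for illustration; only the `*` and `**` syntax itself comes from the Python documentation.

```python
def describe(*args, **kwargs):
    """Return the positional and keyword arguments that were received."""
    return args, kwargs


# Iterable unpacking in an assignment target.
first, *rest = [1, 2, 3]

# Iterable unpacking inside an expression list.
numbers = [*rest, 4]

# Unpacking argument lists (and keyword arguments) at a call site.
args, kwargs = describe(*numbers, **{"sep": "-"})

# Dictionary unpacking inside a dict display.
merged = {**{"a": 1}, **{"b": 2}}

print(first, rest, numbers, args, kwargs, merged)
```

In each case the star spreads the contents of a container into the surrounding context, which is the same intuition `Unpack[Ts]` (or `*Ts`) is meant to convey for type variable tuples.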
**Warning: long stream-of-consciousness email. Tl; dr: `TypeVar(bound=Tuple)` is extremely elegant, but might limit our options later on because of how the variance of `Tuple` works. I'm still leaning towards `TypeVarTuple`, but only just.**

Guido had a really interesting suggestion in the last tensor typing meeting: what if, instead of creating a new constructor `TypeVarTuple`, we just did `TypeVar(bound=Tuple)`?

And reordering some examples in the PEP, it's just fully hit me that this would actually make a whole bunch of sense. The simplest example I could come up with to introduce usage of `TypeVarTuple` was:

```python
def identity(x: Ts) -> Ts: ...

x: Tuple[int, str]
y = identity(x)
```

But I was debating with myself: "OK, but we're going to have to be explicit about this being an example for illustrative purposes only - because otherwise, readers are going to be wondering, 'Why not just use a regular `TypeVar` for `Ts`?'"

And that's the whole point! `Ts` probably _should_ just be a regular `TypeVar`. The only thing special about the example is that we're assuming it's a `TypeVar` that definitely does contain some other types, such that we can potentially use `Unpack` or `Map` later on.

Looking back through the conversation we had about this last time, I think the main doubt we had about reusing a regular `TypeVar` was that it could be confusing if the variadic nature of a particular instance was only _implied_ rather than marked specifically. But if we _are_ being explicit by using `bound=Tuple`, I personally feel a lot better about it.

Thinking out loud: one consideration is how this would interact with other arguments to `TypeVar`. For example, suppose we wanted to set up `Tensor` so that things worked like this:

```python
class Tensor(Generic[*Shape]): ...

class Batch: pass
class TrainBatch(Batch): pass
class TestBatch(Batch): pass
class Time: pass

def only_accepts_batch(t: Tensor[Batch]): ...

t1: Tensor[TrainBatch]
only_accepts_batch(t1)  # Valid
t2: Tensor[Time]
only_accepts_batch(t2)  # Error
```

This would work if `Shape` was somehow set up to be covariant in a way such that, since `TrainBatch` is a subclass of `Batch`, `Tensor[TrainBatch]` would be considered a subclass of `Tensor[Batch]`.

How would this work if we reused `TypeVar`? It seems like it wouldn't work: if we did `Shape = TypeVar('Shape', bound=Tuple, covariant=True)`...well, the question is: is `Tuple[TrainBatch]` a subclass of `Tuple[Batch]`? That is, is `Tuple` covariant?

Hmm, I actually can't find solid information on this in PEP 483 or 484. https://github.com/python/typing/issues/2 suggests the answer is 'yes', but I guess this is only true in the `Tuple[SomeType, ...]` form; a `Tuple[Child]` doesn't seem like it should automatically be a subclass of `Tuple[Parent]`.

We _could_ set this up so that, if a variadic type variable is unpacked, the types we check aren't `Tuple[Batch]` and `Tuple[TrainBatch]` but `Batch` and `TrainBatch` directly. But that would create an inconsistency: this wouldn't be possible if, for some reason, the user wishes to use the variadic type variable without unpacking - a use-case that feels like it should have equal rights to using a variadic type variable _with_ unpacking.

Another option would be to change the behaviour of `covariant=True` and `contravariant=True` when `bound=Tuple`. That seems less than ideal; it might not be backwards-compatible.

A third option would be to introduce an extra argument to `TypeVar` which explicitly changed the behaviour - e.g. `tuplevariance=True`. But it would have to only be valid with `bound=Tuple`, so in that case we may as well just go back to `TypeVar('Shape', tuple=True)`.

So far this was all about variance. But also, what about `bound`? What if we wanted to do:

```python
class Batch: pass

BatchShape = TypeVar('BatchShape', bound=Tuple[Batch])

class BatchTensor(Generic[BatchShape]): ...

t1: BatchTensor[Batch]  # Valid
t2: BatchTensor[Time]  # Error
```

I guess this one would also hinge on whether `Tuple[TrainBatch]` were considered a subclass of `Tuple[Batch]`. Hmm.

To zoom out, though: on the other hand, we could also argue, "Let's not tie ourselves in knots about hypothetical future features. Let's choose the approach in the present which seems simplest and most elegant. Let's not overcomplicate the solution for the sake of all the things we might hypothetically want to do in the future."

I guess the crux of that debate would be: how likely is it that the features I've sketched above are going to be ones we want? I'm not sure about that yet...

So far I'm still leaning slightly towards creating a new constructor, in order to leave our options open later on. But I do feel mighty conflicted - the elegance of `TypeVar(bound=Tuple)` is undeniable. I'll mull this over a bit more.

Thoughts?

On Fri, 1 Jan 2021 at 15:58, David Foster <davidfstr@gmail.com> wrote:
[...]
On 1/12/21 1:00 PM, Matthew Rahtz via Typing-sig wrote:
Guido had a really interesting suggestion in the last tensor typing meeting: what if, instead of creating a new constructor `TypeVarTuple`, we just did `TypeVar(bound=Tuple)`?
If I understand correctly, the assertion is that `TypeVarTuple` is effectively the same as `TypeVar(bound=Tuple[Type, ...])`, which is a bit more specific than `TypeVar(bound=Tuple)` (the phrasing you used).

If in fact the shorter alternative form "TypeVar(bound=Tuple)" is incorrect and only the longer form "TypeVar(bound=Tuple[Type, ...])" is correct, I'd advocate continuing to use the existing proposed syntax "TypeVarTuple" over the (much) longer form.

On 1/12/21 1:00 PM, Matthew Rahtz via Typing-sig wrote:
[...] well, the question is: is `Tuple[TrainBatch]` a subclass of `Tuple[Batch]`? That is, is `Tuple` covariant?
The answer here is also not obvious to me. (I'm pretty sure it *is* covariant, but only after remembering the [non-obvious] rule of thumb that "immutable usually means covariant" vs. "mutable usually means invariant".) Yet my understanding is that this (non-obvious) answer is critical to whether a syntax like "TypeVar(bound=Tuple...)" makes intuitive sense to the reader. Depending on a non-obvious answer seems risky.

On 1/12/21 1:00 PM, Matthew Rahtz via Typing-sig wrote:
To zoom out, though: on the other hand, we could also argue, "Let's not tie ourselves in knots about hypothetical future features. Let's choose the approach in the present which seems simplest and most elegant. Let's not overcomplicate the solution for the sake of all the things we might hypothetically want to do in the future."
I generally like this principle.

On 1/12/21 1:00 PM, Matthew Rahtz via Typing-sig wrote:
So far I'm still leaning slightly towards creating a new constructor, in order to leave our options open later on. But I do feel mighty conflicted - the elegance of `TypeVar(bound=Tuple)` is undeniable. I'll mull this over a bit more.
Long story short, I'm generally in favor of staying with "TypeVarTuple".

Best,
--
David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
On Sat, Jan 16, 2021 at 8:39 PM David Foster <davidfstr@gmail.com> wrote:
On 1/12/21 1:00 PM, Matthew Rahtz via Typing-sig wrote:
Guido had a really interesting suggestion in the last tensor typing meeting: what if, instead of creating a new constructor `TypeVarTuple`, we just did `TypeVar(bound=Tuple)`?
If I understand correctly the assertion is that: TypeVarTuple is effectively the same as: TypeVar(bound=Tuple[Type, ...]) which is a bit more specific than: TypeVar(bound=Tuple) (which is the phrasing you used).
I don't think that's correct though. The bound is not a tuple of types -- since that would imply that the values described by the typevar would themselves be types. Even without generics, we can write things like

```
A = Tuple[int, int]
B = Tuple[str, str, str]
```

If we generalize this to tuples of any length and any type we get `Tuple[object, ...]`, or perhaps `Tuple[Any, ...]`, but definitely not `Tuple[Type, ...]`. Usually `Type` is only needed when we are manipulating class objects.

--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
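The distinction drawn above between tuples of ordinary values and tuples of class objects can be checked at runtime with a small sketch. The helper `conforms` and the sample values are hypothetical and only handle fixed-length `Tuple[...]` aliases of plain classes; `typing.get_args` is the standard-library function used to read the element types.

```python
from typing import Tuple, get_args

A = Tuple[int, int]
B = Tuple[str, str, str]


def conforms(value: tuple, tuple_alias) -> bool:
    """Check a value against a fixed-length Tuple[...] alias (illustrative only)."""
    element_types = get_args(tuple_alias)
    return len(value) == len(element_types) and all(
        isinstance(item, expected) for item, expected in zip(value, element_types)
    )


a_value = (1, 2)
b_value = ("x", "y", "z")
class_objects = (int, str)  # what a Tuple[Type, ...] value would look like

print(conforms(a_value, A))  # True
print(conforms(b_value, B))  # True
# The values described by A and B hold ordinary objects, not classes...
print(any(isinstance(item, type) for item in a_value + b_value))  # False
# ...whereas a Tuple[Type, ...] value holds class objects themselves.
print(all(isinstance(item, type) for item in class_objects))  # True
```

This is why the natural generalization of `A` and `B` is `Tuple[object, ...]` (or `Tuple[Any, ...]`) rather than `Tuple[Type, ...]`.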
Would it make sense to move away from making `Tuple` mean inhomogeneous in Python? For example, lists can be inhomogeneous, except in typed Python.

```python
my_list = [1, 'a', None]
reveal_type(my_list)  # builtins.list[builtins.object*]
my_tuple = (1, 'a', None)
reveal_type(my_tuple)  # Tuple[builtins.int, builtins.str, None]
```

Then when I see `args_to_lists` in PEP 646 I'm wondering how I'm meant to build this tuple so a type checker doesn't error. Currently both map and list comprehensions error. https://www.python.org/dev/peps/pep-0646/#map

```python
def args_to_lists(args: Tuple[int, str]) -> Tuple[List[int], List[str]]:
    return tuple(map(lambda a: [a], args))  # Incompatible return value type (got "Tuple[List[object], ...]", expected "Tuple[List[int], List[str]]")

def args_to_lists(args: Tuple[int, str]) -> Tuple[List[int], List[str]]:
    return tuple([[a] for a in args])  # Incompatible return value type (got "Tuple[List[object], ...]", expected "Tuple[List[int], List[str]]")
```

This leaves me with two questions:

- What should `map`'s type be to be able to match `Map`?
- Is this going to promote calling `tuple` just to silence the type checker?

Could adding an `Inhomogeneous` / `Heterogeneous` 'type', in the future, be a sensible solution?

```python
my_list = [1, 'a', None]
reveal_type(my_list)  # builtins.list[typing.Inhomogeneous[builtins.int, builtins.str, None]]
my_tuple = (1, 'a', None)
reveal_type(my_tuple)  # Tuple[builtins.int, builtins.str, None]
# This could be sugar for
# Tuple[Inhomogeneous[int, str, None]]

def args_to_lists(args: Tuple[int, str]) -> Iterator[Inhomogeneous[List[int], List[str]]]:
    return map(lambda a: [a], args)

def args_to_lists(*args: *Ts) -> Iterator[Map[List, Ts]]:
    return map(lambda a: [a], args)

def args_to_lists(args: Tuple[int, str]) -> List[Inhomogeneous[List[int], List[str]]]:
    return [[a] for a in args]

def args_to_lists(*args: *Ts) -> List[Map[List, Ts]]:
    return [[a] for a in args]

def args_to_lists(*args: *Ts) -> Map[List, Ts]:  # Could assume `Tuple[Map[...]]` if no type is specified.
    return tuple([[a] for a in args])
```

To my naive eyes this fits and also shows how `TypeVar(bound=Tuple)` may not make sense in the future. Would `List[Ts]` mean `List[List[int], List[str]]` or `List[Tuple[List[int], List[str]]]`? It'd be strange to mean the former, but then how could you get the former? However, suppose we decide to add `Inhomogeneous` in the future: what would stop us from just switching over to using `TypeVar(bound=Inhomogeneous)`?

Additionally, will the name `TypeVarTuple` lead to people conflating inhomogeneity with tuples? At this time it makes sense because tuple is the only inhomogeneous type in Python. But will this always be the case? If not, would it make sense to add a `TypeVarX` for the new type(s), `X`? Or will we only have `TypeVarTuple` for all inhomogeneous types?

Note: To be clear, I don't think `Inhomogeneous` should be added to PEP 646. It would add a lot of complexity around mutability. How would `list.remove` work? Should `list.remove` result in a type error? How could you specify which methods work with `Inhomogeneous`, etc.?

On 12/01/2021 21:00, Matthew Rahtz via Typing-sig wrote:
[...]
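For reference, the runtime behaviour of the `args_to_lists` function discussed in the message above is easy to sketch; the difficulty raised there is purely about what a type checker can infer. The version below is a minimal sketch that deliberately uses the loose annotation `Tuple[List[Any], ...]`, on the assumption that the precise `Map[List, Ts]` return type is not available.

```python
from typing import Any, List, Tuple


def args_to_lists(*args: Any) -> Tuple[List[Any], ...]:
    """Wrap each positional argument in its own single-element list."""
    # A generator expression passed to tuple() builds the heterogeneous tuple
    # at runtime; per-position types are lost in this annotation.
    return tuple([arg] for arg in args)


result = args_to_lists(1, "a")
print(result)  # ([1], ['a'])
```

At runtime the result is exactly the tuple the precise signature describes; only the static type is less informative, which is the gap `Map` was proposed to close.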
On Tue, Jan 12, 2021 at 1:00 PM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
[...] That is, is `Tuple` covariant?
Hmm, I actually can't find solid information on this in PEP 483 or 484. https://github.com/python/typing/issues/2 suggests the answer is 'yes', but I guess this is only true in the `Tuple[SomeType, ...]` form; a `Tuple[Child]` doesn't seem like it should automatically be a subclass of `Tuple[Parent]`.
To me, PEP 483 answers this clearly (if you remember that the dots here are *not* the ellipsis token but just mean "and so on through" :-):
`Tuple[u1, u2, ..., um]` is a subtype of `Tuple[t1, t2, ..., tn]` if they have the same length `n==m` and each `ui` is a subtype of `ti`.
I don't think we should consider any of the other suggestions you offered related to variance.

If you're at all sensitive to Eric's suggestion of only supporting variadics when used with the `*Ts` notation (or the `Expand[Ts]` equivalent) then I think we have a very solid case for using `TypeVar(bound=Tuple)`.

Another argument is that TypeScript does all the same things without explicitly needing to mark the type variables involved as being special -- they just write things like

```
type Foo<T extends unknown[]> = [string, ...T, number];
```

which to me feels like the moral equivalent to

```
T = TypeVar("T", bound=Tuple)
Foo = Tuple[str, *T, float]
```

See e.g. TypeScript: Variadic Tuple Types Preview (fettblog.eu) <https://fettblog.eu/variadic-tuple-types-preview/>

--
--Guido van Rossum (python.org/~guido)
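The PEP 483 rule quoted in this message (same length, element-wise subtyping) can be mirrored by a small runtime check. This is only a sketch: the function name `is_tuple_subtype` is hypothetical, it handles fixed-length `Tuple[...]` aliases whose elements are plain classes, and it uses `issubclass` as a stand-in for the static subtype relation. The `Batch` classes follow the earlier example in the thread.

```python
from typing import Tuple, get_args


class Batch: ...
class TrainBatch(Batch): ...
class Time: ...


def is_tuple_subtype(sub, sup) -> bool:
    """Tuple[u1, ..., um] <: Tuple[t1, ..., tn] iff m == n and each ui <: ti."""
    us, ts = get_args(sub), get_args(sup)
    return len(us) == len(ts) and all(issubclass(u, t) for u, t in zip(us, ts))


print(is_tuple_subtype(Tuple[TrainBatch], Tuple[Batch]))  # True
print(is_tuple_subtype(Tuple[Time], Tuple[Batch]))  # False
print(is_tuple_subtype(Tuple[TrainBatch, TrainBatch], Tuple[Batch]))  # False
```

Under this rule `Tuple[TrainBatch]` is a subtype of `Tuple[Batch]`, which is the covariance property the variance discussion was unsure about.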
I recommend against using `bound=Tuple` for the following reasons:

1. If we drop support for packed (non-unpacked?) usage, it's entirely unnecessary and is just extra verbosity that provides no value.

2. I think it will confuse many users because it means that the `bound` has a different semantic meaning if the TypeVar is used in both a variadic and non-variadic context.

3. It eliminates the possibility of using `bound` for variadic TypeVars in the future. I think there are real use cases that we will want to enable here. For example, a class that accepts an arbitrary number of generic types where all of those types need to be sized. This could be expressed as:

   ```python
   _T = TypeVar("_T", bound=Sized)

   class MyClass(Generic[*_T]): ...
   ```

4. Python developers are still getting used to typing and generics, and I've found that many of them forget to (or don't understand that they need to) supply type arguments when using a generic type within a type expression. For that reason, Pyright emits a warning (or an error in "strict" mode) if type arguments are not provided. So `Tuple` would emit a warning, whereas `Tuple[Any, ...]` would not. This warning has proven very useful to many users, and I don't want to eliminate it, but I'd probably need to eliminate it (at least in some contexts) if we advocated the use of `T = TypeVar("T", bound=Tuple)` in this PEP.
On Thu, Jan 21, 2021 at 10:00 AM Eric Traut <eric@traut.com> wrote:
I recommend against using `bound=Tuple` for the following reasons:
1. If we drop support for packed (non-unpacked?) usage, it's entirely unnecessary and is just extra verbosity that provides no value.
Right.
2. I think it will confuse many users because it means that the `bound` has a different semantic meaning if the TypeVar is used in both a variadic and non-variadic context.
But it doesn't really have a different meaning, does it? Apart from #4 below, `bound=Tuple` just means `bound=Tuple[Any, ...]`, and that's exactly the right upper bound for a type variable used with `*T`, since the *actual* type is always *some* tuple but we can't say more about it. And if someone wants to have a variadic argument whose items are always instances of some type A, they can use `bound=Tuple[A, ...]`. (Similar to how in TypeScript you could use `<T extends A[]>`.)
3. It eliminates the possibility of using `bound` for variadic TypeVars in the future. I think there are real use cases that we will want to enable here. For example, a class that accepts an arbitrary number of generic types where all of those types needs to be sized. This could be expressed as:
```python
_T = TypeVar("_T", bound=Sized)

class MyClass(Generic[*_T]): ...
```
Using my current proposal you can express that with `bound=Tuple[Sized, ...]`. (I wonder if you interpreted my examples too literally and assumed that I wanted to give a special meaning to exactly `bound=Tuple`? I merely intended to indicate that we might as well use `bound=Tuple` because that upperbound is implied by the `*T` notation. Or perhaps I meant that `Ts = TypeVarTuple('Ts')` is for all intents and purposes implying `bound=Tuple`. :-)
4. Python developers are still getting used to typing and generics, and I've found that many of them forget to (or don't understand that they need to) supply type arguments when using a generic type within a type expression. For that reason, Pyright emits a warning (or an error in "strict" mode) if type arguments are not provided. So `Tuple` would emit a warning, whereas `Tuple[Any, ...]` would not. This warning has proven very useful to many users, and I don't want to eliminate it, but I'd probably need to eliminate it (at least in some contexts) if we advocated the use of `T = TypeVar("T", bound=Tuple)` in this PEP.
Yeah, mypy can emit a similar warning. I guess we don't need to specify the bound since it is implied by `*T`.

--
--Guido van Rossum (python.org/~guido)
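To illustrate what a bound such as `Tuple[Sized, ...]` would demand of the type arguments, here is a runtime analogue. It is a sketch under stated assumptions: the helper name `satisfies_sized_bound` is hypothetical, it uses `issubclass` against `collections.abc.Sized` in place of a static subtype check, and it says nothing about how a type checker would actually enforce the bound.

```python
from collections.abc import Sized


def satisfies_sized_bound(*type_args: type) -> bool:
    """Return True if every type argument is a (virtual) subclass of Sized."""
    # Sized recognizes any class that defines __len__.
    return all(issubclass(type_arg, Sized) for type_arg in type_args)


print(satisfies_sized_bound(list, str, dict))  # True
print(satisfies_sized_bound(list, int))  # False
```

The check applies to every element regardless of how many there are, which matches the reading that the bound constrains each item of the variadic rather than the tuple as a whole.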
I had a chance to review the latest draft. Here's my feedback...
In the latest draft, TypeTupleVar and TypeVarTuple seem to be used interchangeably. I assume this is just a typo and that one will be preferred over the other.
Why do we think it’s important to support both packed and unpacked versions of TypeVarTuple? Why not simplify this and support only unpacked versions? I find it very confusing that both are supported, and I see relatively little value in supporting the non-unpacked version. This simplification would also allow us to eliminate TypeVarTuple and stick with TypeVar. The expression `*T` would denote a variadic TypeVar, and `T` would be a non-variadic one.
The spec indicates that variance and bound are not supported. I presume that constrained (restricted) types are also not supported? If so, that should be specified.
I presume that a generic class can use at most one TypeVarTuple, and it’s an error if more than one is used. I also presume that the TypeVarTuple must be last in the type variable list, and it’s an error if not. These assumptions should be called out explicitly in the spec. [When I got further into the spec, this question was answered, but contrary to my assumptions here.]
The spec talks about an unpacked TypeVarTuple used with `*args` and says that a TypeVarTuple can’t be used with `**kwargs`. Am I correct in assuming that an unpacked TypeVarTuple cannot be used with a simple parameter (not `*args` or `**kwargs`)? If so, this should be specified.
The spec says that TypeVarTuples can be used with Callable. The sample shows a single use of a TypeVarTuple, and it’s unpacked. I presume that at most one unpacked TypeVarTuple can appear in a Callable parameter list and it must be the last parameter in the list?
The spec says that TypeVarTuples can be used with Union. The sample shows a single use of TypeVarTuple, and it’s unpacked. I presume that more than one TypeVarTuple can appear in a Union? And all must be unpacked?
I understand the motivation behind adding `Map` to this spec, but I don’t think it’s a very strong motivation. I would _strongly_ advocate for its removal. I think it unnecessarily complicates an already complex PEP. It effectively adds higher-kinded type support but in a limited and half-thought-out manner. To support `Map`, type checkers will need to do most of the work required to support higher-kinded types. That’s a lot of work with relatively little benefit. I’d rather save this functionality for a future PEP that introduces higher-kinded types in a proper and holistic manner.
Map appears to work only with generic types that have an arity of 1. Am I correct in that assumption? If so, it feels very constraining.
In the section on concatenation, the spec’s example shows that the unpacked TypeVarTuple (`*Shape`) appears as the last element in the subscript. Since it’s used for TypeVar matching, I presume that at most one TypeVarTuple can appear in a list like this and it needs to be unpacked. If my assumptions are correct, those should be documented. [When I got further into the spec, this question was answered, but contrary to my assumptions here.]
The concatenation section indicates that both prefixing and suffixing are supported. How important is the use case for suffixing? It adds some (not insignificant) complexity to the implementation to support this in a type checker, and I’m not convinced the use cases merit this added complexity. Can you provide some concrete use cases that could help make the case for suffixing?
The section “Concatenating Multiple Type Variable Tuples” clarifies some of my questions above but provides an answer I was hoping not to hear. Supporting multiple TypeVariableTuples adds _significant_ complexity, and in most use cases I can think of will result in ambiguities (and therefore errors).
The spec provides some examples of where these ambiguities can be resolved, but my preference is to disallow the use of multiple type variable tuples in all cases. This is another case where the added complexity doesn’t seem to be merited given the limited usage.
Summary: I think there are some great ideas in this PEP, but I also think it’s unnecessarily complex in some areas. My recommendations (in priority order):
1. Remove `Map`. Please!
2. Remove support for multiple TypeVarTuples in a type expression.
3. Remove support for packed usage. Don’t introduce a new TypeVarTuple; use the regular TypeVar, with unpack to indicate a variadic type variable.
4. Remove support for concatenation suffixing.
-- Eric Traut Contributor to pyright and pylance Microsoft Corp.
I started to implement parts of the draft PEP 646 within pyright this evening. I figured this exercise might be helpful in informing our discussion and teasing out additional questions and issues. Here's what I've uncovered so far.
*Grammar Changes*
The grammar will need to change to support star expressions (i.e. unpack operator) within type annotations and in subscripts. If unpack operators are permitted within a subscript, how will that be handled at runtime? For example, in the expression `x[A, *B]`, what value will be passed to the `__getitem__` method for instance `x`? Will the runtime effectively replace `*B` with `typing.Unpack[B]`?
Will star expressions be allowed in slice expressions? I presume no.
*Zero-length Variadics*
It's legal to pass no arguments to a `*args` parameter. For example:
```python
def foo(*args: Any): ...

foo()  # This is fine
```
*Unknown-length Variadics*
Am I correct in assuming that it's not OK to pass zero arguments to a `*args` parameter that has a variadic TypeVar annotation?
```python
def foo(*args: *T): ...

foo()  # I presume this is an error?
```
Also, it's generally OK to pass an arbitrary-length list of arguments to an `*args` parameter using an unpack operator.
```python
def foo(*args: Any): ...

def bar(x: Tuple[int, ...]):
    foo(*x)  # This is allowed
```
I presume that it should be an error to pass an arbitrary-length list of arguments to an `*args` parameter if it has a variadic TypeVar annotation?
```python
def foo(*args: *T): ...

def bar(x: Tuple[int, ...], y: Iterable[int], z: Tuple[int]):
    foo(*x)  # I presume this is an error?
    foo(*y)  # I presume this is an error?
    foo(*z)  # This is allowed
```
If my assumption is correct that this should be flagged as an error by a type checker, will it also be a runtime error? I'm guessing the answer is no, there's no way to distinguish such an error at runtime. If my assumption is incorrect and this is permitted, does the variadic type variable effectively become "open-ended" (i.e. the dimensionality of the variadic becomes unknown)? If so, how does it work with concatenation? I think it's better to make this an error.
*Other Observations*
This won't be an easy PEP to implement in type checkers, even with the simplifications I've recommended. It's going to be a heavy lift, which means it's going to be a long time before all type checkers support it. Compared to other recent type-related PEPs like 604, 612, 613, and 647, this PEP will require significantly more work to implement and get everything right. That could significantly delay the time before it can be used in typeshed and other public stubs. This bolsters my conviction that we should embrace simplifications where possible.
After this exercise, I'm even more convinced that we should support only unpacked usage ("*T") and not support packed ("T") for variadic type variables, and that we should use the existing TypeVar rather than introducing TypeVarTuple. In the rare cases where a packed version is desired, it can be expressed as `Tuple[*T]`. For example:
```python
T = TypeVar("T")

def func(*args: *T) -> Tuple[*T]: ...
```
By requiring `*T` everywhere for a variadic type variable, we will reduce confusion for users and simplify the spec and implementation.
-- Eric Traut Contributor to pyright and pylance Microsoft Corp.
Thanks for the review and thoughtful questions, Eric. We had been curious about opinions from Mypy and Pyright in terms of implementation complexity.
I understand the motivation behind adding Map to this spec, but I don't think it's a very strong motivation. I would _strongly_ advocate for its removal. I think it unnecessarily complicates an already complex PEP. It effectively adds higher-kinded type support but in a limited and half-thought-out manner. To support Map, type checkers will need to do most of the work required to support higher-kinded types. That's a lot of work with relatively little benefit. I'd rather save this functionality for a future PEP that introduces higher-kinded types in a proper and holistic manner.
You're right, the PEP special-cases `Map` instead of properly supporting higher-order higher-kinded types. The rest of the PEP's features are useful for Tensor functions, whereas `Map` is useful mainly for variadic functions like `map`, `zip`, and `asyncio.gather`. These use cases seem quite orthogonal.
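To make the `Map` use case concrete, here is a small runtime sketch of the kind of function it is meant to describe. The helper name `listify` is invented for illustration; in the draft's notation its signature would presumably be something like `(*args: *Ts) -> Map[List, Ts]`, which today's type system can only approximate with `Tuple[List[Any], ...]`.

```python
from typing import Any, List, Tuple


def listify(*args: Any) -> Tuple[List[Any], ...]:
    # Wrap each positional argument in its own one-element list.
    # The precise per-position result type (for example
    # Tuple[List[int], List[str]]) is what Map would let us express.
    return tuple([arg] for arg in args)


pair = listify(42, "abc")
print(pair)  # ([42], ['abc'])
```

Nothing here depends on the tensor-oriented features, which matches the observation that the two use cases are largely independent.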
The section "Concatenating Multiple Type Variable Tuples" clarifies some of my questions above but provides an answer I was hoping not to hear. Supporting multiple TypeVariableTuples adds _significant_ complexity, and in most use cases I can think of will result in ambiguities (and therefore errors). The spec provides some examples of where these ambiguities can be resolved, but my preference is to disallow the use of multiple type variable tuples in all cases. This is another case of where the added complexity doesn't seem to be merited given the limited usage.
I agree this does add significant complexity. The main usage is in Tensor functions, which require such prefix removal. Concatenation of multiple variadics will become important going forward with type arithmetic. For example, Alfonso had collected quite a few functions where, in the future, we would need to match, strip, or add a prefix. For example, `def mean(t: Tensor[*Ts, T, *Rs], axis: Length[Ts]) -> Tensor[*Ts, *Rs]`: https://docs.google.com/document/d/1IlByrIjZPPxTa_1ZWHLZDaQrxdGrAbL71plzmXi0... .
(I'll address your individual questions separately after we've settled the below.)
*****
Overall, I agree that the PEP is pretty complex to implement. However, features like concatenation of variadics are crucial for typing common tensor functions (which is the underlying motivation behind the PEP).
Would it be better to split this into smaller PEPs?
1. Variadic tuples with no `Map` and no concatenation of variadics
2. Concatenation of variadic tuples
3. `Map` and other higher-kinded types
This would let us incrementally evaluate the cost-benefit ratio. We wouldn't have to close the door on key features because they are complex to implement in one go. And the subsequent features are backward-compatible, so that shouldn't be a problem.
(1) will cover basic Tensor dimension-checking. (2) will be useful for more advanced Tensor functions that require removing a prefix or a particular dimension. (3) will be useful mainly for variadic functions that transform `*args`.
Guido, Matthew: thoughts?
On Thu, Jan 21, 2021 at 10:09 PM Eric Traut <eric@traut.com> wrote:
I started to implement parts of the draft PEP 646 within pyright this evening. I figured this exercise might be helpful in informing our discussion and teasing out additional questions and issues. Here's what I've uncovered so far.
*Grammar Changes* The grammar will need to change to support star expressions (i.e. unpack operator) within type annotations and in subscripts. If unpack operators are permitted within a subscript, how will that be handled at runtime? For example, in the expression `x[A, *B]`, what value will be passed to the `__getitem__` method for instance `x`? Will the runtime effectively replace `*B` with `typing.Unpack[B]`?
Will star expressions be allowed in slice expressions? I presume no.
*Zero-length Variadics*
It's legal to pass no arguments to a `*args` parameter. For example:
```python
def foo(*args: Any): ...

foo()  # This is fine
```
*Unknown-length Variadics*
Am I correct in assuming that it's not OK to pass zero arguments to a `*args` parameter that has a variadic TypeVar annotation?
```python
def foo(*args: *T): ...

foo()  # I presume this is an error?
```
Also, it's generally OK to pass an arbitrary-length list of arguments to an `*args` parameter using an unpack operator.
```python
def foo(*args: Any): ...

def bar(x: Tuple[int, ...]):
    foo(*x)  # This is allowed
```
I presume that it should be an error to pass an arbitrary-length list of arguments to an `*args` parameter if it has a variadic TypeVar annotation?
```python
def foo(*args: *T): ...

def bar(x: Tuple[int, ...], y: Iterable[int], z: Tuple[int]):
    foo(*x)  # I presume this is an error?
    foo(*y)  # I presume this is an error?
    foo(*z)  # This is allowed
```
If my assumption is correct that this should be flagged as an error by a type checker, will it also be a runtime error? I'm guessing the answer is no, there's no way to distinguish such an error at runtime. If my assumption is incorrect and this is permitted, does the variadic type variable effectively become "open-ended" (i.e. the dimensionality of the variadic becomes unknown)? If so, how does it work with concatenation? I think it's better to make this an error.
* Other Observations * This won't be an easy PEP to implement in type checkers, even with the simplifications I've recommended. It's going to be a heavy lift, which means it's going to be a long time before all type checkers support it. Compared to other recent type-related PEPs like 604, 612, 613, and 647, this PEP will require significantly more work to implement and get everything right. That could significantly delay the time before it can be used in typeshed and other public stubs. This bolsters my conviction that we should embrace simplifications where possible.
After this exercise, I'm even more convinced that we should support only unpacked usage ("*T") and not support packed ("T") for variadic type variables — and that we should use the existing TypeVar rather than introducing TypeVarTuple. In the rare cases where a packed version is desired, it can be expressed as `Tuple[*T]`. For example:
```python
T = TypeVar("T")

def func(*args: *T) -> Tuple[*T]: ...
```
By requiring `*T` everywhere for a variadic type variable, we will reduce confusion for users and simplify the spec and implementation.
-- Eric Traut Contributor to pyright and pylance Microsoft Corp. _______________________________________________ Typing-sig mailing list -- typing-sig@python.org To unsubscribe send an email to typing-sig-leave@python.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: gohanpra@gmail.com
-- S Pradeep Kumar
On Fri, Jan 22, 2021 at 10:55 AM S Pradeep Kumar <gohanpra@gmail.com> wrote:
[...] Overall, I agree that the PEP is pretty complex to implement. However, features like concatenation of variadics are crucial for typing common tensor functions (which is the underlying motivation behind the PEP).
Would it be better to split this into smaller PEPs?
1. Variadic tuples with no `Map` and no concatenation of variadics
2. Concatenation of variadic tuples
3. `Map` and other higher-kinded types
Can you clarify what "no concatenation of variadics" refers to? Does this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically about `Tuple[*Ts1, *Ts2]`? (And what about the same constructs inside `Callable[[<here>], R]`?) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Thu, Jan 21, 2021 at 10:09 PM Eric Traut <eric@traut.com> wrote:
I started to implement parts of the draft PEP 646 within pyright this evening. I figured this exercise might be helpful in informing our discussion and teasing out additional questions and issues. Here's what I've uncovered so far.
This is awesome work -- I always learn so much about a design by trying to implement it! Your experiences and observations are very useful. (I wonder if any of the PEP authors have worked on an implementation yet? Or anyone else?)
*Grammar Changes* The grammar will need to change to support star expressions (i.e. unpack operator) within type annotations and in subscripts. If unpack operators are permitted within a subscript, how will that be handled at runtime? For example, in the expression `x[A, *B]`, what value will be passed to the `__getitem__` method for instance `x`? Will the runtime effectively replace `*B` with `typing.Unpack[B]`?
We could give `TypeVar()` an `__iter__()` method like this:
```
def __iter__(self):
    yield f"*{self.__name__}"
```
That would make for a nice repr() of things like `tuple[*T]`. It wouldn't support runtime analysis -- we could support that by creating another helper object, but honestly I don't think that's going to be very useful. This should be spelled out in the PEP though in case there are people who *do* plan on doing runtime analysis on types involving `*T`.
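A minimal sketch of how that suggestion would play out at runtime, under some loudly stated assumptions: `FakeTypeVar` and `Recorder` are hypothetical stand-ins written for this example (the real `typing.TypeVar` is not modified), and the subscript is written as a parenthesized tuple so that it also parses on interpreters without star-in-subscript grammar.

```python
class FakeTypeVar:
    """Hypothetical stand-in for typing.TypeVar, for illustration only."""

    def __init__(self, name):
        self.__name__ = name

    def __iter__(self):
        # Unpacking with * calls __iter__, so *T contributes one marker item.
        yield f"*{self.__name__}"


class Recorder:
    """Hypothetical subscriptable object that returns what __getitem__ gets."""

    def __getitem__(self, item):
        return item


T = FakeTypeVar("T")
x = Recorder()

# Equivalent to x[int, *T] where that syntax is accepted.
item = x[(int, *T)]
print(item)  # (<class 'int'>, '*T')
```

With this approach `__getitem__` only sees the string `'*T'`, not the type variable object, which is why it helps the repr but not runtime analysis.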
Will star expressions be allowed in slice expressions? I presume no.
Nope.
*Zero-length Variadics*
It's legal to pass no arguments to a `*args` parameter. For example:
```python
def foo(*args: Any): ...

foo()  # This is fine
```
*Unknown-length Variadics*
Am I correct in assuming that it's not OK to pass zero arguments to a `*args` parameter that has a variadic TypeVar annotation?
```python
def foo(*args: *T): ...

foo()  # I presume this is an error?
```
I had assumed this would be allowed -- the type would be that of the empty tuple (spelled `Tuple[()]`). I haven't looked but I could imagine that some of the many functions in some of the popular array/tensor packages might be inconvenienced if this were disallowed. (PS: That's a separate piece of grammar that's not covered by PEP 637, we should call this out in PEP 646.)
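For reference, the zero-length case is unremarkable at runtime. The sketch below uses only existing typing constructs; `foo` mirrors the example under discussion, and the annotation `Tuple[()]` is the spelling of the empty-tuple type mentioned above.

```python
from typing import Any, Tuple


def foo(*args: Any) -> Tuple[Any, ...]:
    # With no positional arguments, args is simply the empty tuple.
    return args


empty: Tuple[()] = ()
result = foo()
print(result == empty)  # True
```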
Also, it's generally OK to pass an arbitrary-length list of arguments to an `*args` parameter using an unpack operator.
```python
def foo(*args: Any): ...

def bar(x: Tuple[int, ...]):
    foo(*x)  # This is allowed
```
I presume that it should be an error to pass an arbitrary-length list of arguments to an `*args` parameter if it has a variadic TypeVar annotation?
```python
def foo(*args: *T): ...

def bar(x: Tuple[int, ...], y: Iterable[int], z: Tuple[int]):
    foo(*x)  # I presume this is an error?
    foo(*y)  # I presume this is an error?
    foo(*z)  # This is allowed
```
Could we make `foo(*x)` work? At least this could work:
```
def foo(*args: *T) -> Tuple[*T]:
    return args

def bar(*x: *int):
    y = foo(*x)
```
The type of y would be `Tuple[int, ...]`.
If my assumption is correct that this should be flagged as an error by a type checker, will it also be a runtime error? I'm guessing the answer is no, there's no way to distinguish such an error at runtime.
Which makes it more questionable to disallow it in the checker. (Of course there's plenty of working code that's disallowed by checkers, but this seems an odd edge condition to differ about.)
If my assumption is incorrect and this is permitted, does the variadic type variable effectively become "open-ended" (i.e. the dimensionality of the variadic becomes unknown)? If so, how does it work with concatenation? I think it's better to make this an error.
It's easier for the checker implementation. :-) It doesn't strike me as theoretically unsound though. The length may be unknown at compile time, but it is not infinite, so things like `Tuple[int, *T, str]` are still well-defined at runtime.
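The runtime side of that argument can be sketched with plain tuples. The function name `wrap` and the marker values are invented for this example; the return annotation is deliberately loose because the precise form, `Tuple[int, *T, str]`, is exactly what the PEP is proposing.

```python
from typing import Any, Tuple


def wrap(*middle: Any) -> Tuple[Any, ...]:
    # Runtime analogue of Tuple[int, *T, str]: a fixed prefix and suffix
    # around a middle part whose length is only known when called.
    return (0, *middle, "end")


unknown_length = tuple(range(3))  # a checker would see Tuple[int, ...]
wrapped = wrap(*unknown_length)
print(wrapped)  # (0, 0, 1, 2, 'end')
```

Whatever the length of the middle part turns out to be, the first and last positions of the result stay well-defined.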
* Other Observations * This won't be an easy PEP to implement in type checkers, even with the simplifications I've recommended. It's going to be a heavy lift, which means it's going to be a long time before all type checkers support it. Compared to other recent type-related PEPs like 604, 612, 613, and 647, this PEP will require significantly more work to implement and get everything right. That could significantly delay the time before it can be used in typeshed and other public stubs. This bolsters my conviction that we should embrace simplifications where possible.
I am sensitive to this observation. We could either plan follow-up PEPs with more advanced features, or we could define multiple phases of support in PEP 646 itself.
After this exercise, I'm even more convinced that we should support only unpacked usage ("*T") and not support packed ("T") for variadic type variables — and that we should use the existing TypeVar rather than introducing TypeVarTuple. In the rare cases where a packed version is desired, it can be expressed as `Tuple[*T]`. For example:
```python
T = TypeVar("T")

def func(*args: *T) -> Tuple[*T]: ...
```
By requiring `*T` everywhere for a variadic type variable, we will reduce confusion for users and simplify the spec and implementation.
But why couldn't `-> T` work for that example? Well, maybe you're right, if a user were to get confused and write `-> Tuple[T]` where they meant `-> Tuple[*T]` that would be quite the mess to sort out. FWIW the thing that always makes my brain hurt is the `*args: *T` notation. I try to use the heuristic that for the case `*args: X` the type of `args` is `Tuple[X, ...]`. From that I can usually make the derivation that for `*args: *T` the type of `args` is `Tuple[*T, ...]`. And then I just have to realize that the `...` isn't needed here. (Which is easier if you pretend that `Tuple[str, int, ...]` means a tuple of a str followed by zero or more ints.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
After some additional thinking, and in consideration of your responses, I have a few more thoughts to share.
I thought of a way that we could avoid all changes to the grammar and eliminate the need for `Unpack`. Assuming we remove support for packed usages of variadic type vars as I previously suggested, that would leave only unpacked usages. And if all usages are unpacked, there's no reason to require the expression `*T` or `Unpack[T]` as long as "T" is designated a variadic type var when it is defined. We could augment the existing TypeVar constructor to support a parameter called `variadic`. The only downside I see is that the use of the star makes it clear to the reader of the code that the type var is variadic, but I think we could accomplish this through a naming convention like `Ts` or `T_va`.
Here's how that would look:
```python
T_va = TypeVar("T_va", variadic=True)

class Array(Generic[T_va]):
    def __init__(self, *args: T_va) -> None:
        pass

    def linearize(self, value: Array[T_va]) -> Sequence[Union[T_va]]:
        pass
```
I like the simplicity of this. It's easy to read, easy to implement, provides backward compatibility with earlier parsers, and avoids the need to introduce an alternative form like `Unpack`.
Guido clarified that all classes that support variadic type parameters should support zero-length forms. That sounds reasonable to me.
Guido also said that unspecified-length (open-ended) variadics should be supported. I'm less convinced here. This creates complications for the type constraint solver. It also creates a bunch of tricky ambiguities for concatenation. I'll continue to play around with this in the implementation and let you know what I learn.
I like the idea of breaking this PEP into the three pieces as suggested. I could see us including some limited forms of concatenation in phase 1. In particular, we could support type argument lists that contain at most one variadic type var where the variadic type var is the last element. For example, `Tuple[int, T_va]` would be allowed but `Tuple[T_va, int]` and `Tuple[S_va, T_va]` would be errors. This would allow for removal of prefixes and basic concatenation use cases, and I think it would be relatively straightforward to implement. Support for suffixes is much tougher. And I'm not yet convinced that support for multiple variadic type vars is even theoretically possible in the general case.
-- Eric Traut Contributor to pyright and pylance Microsoft Corp.
Damn, sorry it's only now I'm weighing in on all this - it's been a busy week.
Thanks for this excellent feedback, Eric - and also for teasing out some of the implications through an initial implementation! This is super helpful :)
**Breaking the PEP into three pieces**: this is a great idea, Pradeep. I'll try to get some drafts done in the next few days.
**Whether to support both packed and unpacked versions**: funnily enough, supporting only the unpacked version (in the form `Tuple[Ts]`) is what we initially started with. I had to do some digging in the doc history to figure out why we decided to switch. I think what happened was:
* Someone suggested renaming `Map` to something different to avoid confusion with `typing.Mapping`
* At some point during the discussion, without realising it, I got confused about the difference between the concept of `Map` (`Map[List, Ts] -> Tuple[List[T1], List[T2], ...]`) and the concept of connecting generic classes to variadic type variables (this isn't what we were calling it even back then, but for illustration: `UnpackInto[Ts, Tuple] -> Tuple[T1, T2, ...]`)
* As a result, we renamed `Map` to `Apply` 😱
* But then after I'd read about variadics in Typed Scheme and what `apply` means in Typed Scheme (and also what `apply` did in old versions of Python), I realised that calling `Map` `Apply` made no sense, and renamed it back to `Map`, and introduced `Expand` for the other concept: `Tuple[Expand[Ts]]`.
* Then Lucio pointed out that `Expand` was super verbose and that using a star would be cleaner, which made a lot of sense.
Thus, complexity was born.
To be fair, I also liked the star thing for two reasons:
* It helped to visually differentiate variadic type variables from regular type variables (though Eric, I agree that suffixing with e.g. `_va` or `s` is an easier way to do this)
* It strongly reinforces the idea that a variadic type variable on its own behaves like a `Tuple` - which I liked because then it was intuitive what a variadic type variable on its own 'was' (that is, if `Ts` on its own meant the unpacked version, then a variadic type variable was sort of a free-floating list of types that I didn't have a pre-existing mental concept for, and that seemed confusing to me)
The latter still seems compelling to me - but the counter-arguments are admittedly also significant:
* The complications of i) a grammar change and ii) the introduction of `Unpack`
* The potential for confusion to users about when to use `Ts` and when to use `*Ts`
As much as it pains me to kill my darling, I find myself leaning towards agreeing that maybe it would be better if we dropped the star idea. Thanks for being willing to disagree, Eric, and potentially break us out of a rut here!
I still reserve the right to change my mind once I've tried making the corresponding changes to the PEP, though. Two potential blockers that come to mind straight away are:
1. What should the type of `*args` be? We could just go back to `*args: Ts`, with a special rule that if `*args` is annotated as being a variadic type variable then the usual rule doesn't apply. Guido, how would you feel about that?
2. What if a function wants to return just the types in the variadic type variable? Actually, I'm happy to just do `-> Tuple[Ts]`. Does anyone else have a problem with this?
**Implementation**: no, we haven't started on an implementation yet; we were waiting to hear back from the PEP 637 folks/I've been busy with other things.
Will respond to the other issues soon :)
On Sat, 23 Jan 2021 at 06:23, Eric Traut <eric@traut.com> wrote:
After some additional thinking — and in consideration of your responses, I have a few more thoughts to share.
I thought of a way that we could avoid all changes to the grammar and eliminate the need for `Unpack`. Assuming we remove support for packed usages of variadic type vars as I previously suggested, that would leave only unpacked usages. And if all usages are unpacked, there's no reason to require the expression `*T` or `Unpack[T]` as long as "T" is designated a variadic type var when it is defined. We could augment the existing TypeVar constructor to support a parameter called `variadic`. The only downside I see is that the use of the star makes it clear to the reader of the code that the type var is variadic, but I think we could accomplish this through a naming convention like `Ts` or `T_va`.
Here's how that would look:
```python
T_va = TypeVar("T_va", variadic=True)

class Array(Generic[T_va]):
    def __init__(self, *args: T_va) -> None:
        pass

    def linearize(self, value: Array[T_va]) -> Sequence[Union[T_va]]:
        pass
```
I like the simplicity of this. It's easy to read, easy to implement, provides backward compatibility with earlier parsers, and avoids the need to introduce an alternative form like `Unpack`.
Guido clarified that all classes that support variadic type parameters should support zero-length forms. That sounds reasonable to me.
Guido also said that unspecified-length (open-ended) variadics should be supported. I'm less convinced here. This creates complications for the type constraint solver. It also creates a bunch of tricky ambiguities for concatenation. I'll continue to play around with this in the implementation and let you know what I learn.
I like the idea of breaking this PEP into the three pieces as suggested. I could see us including some limited forms of concatenation in phase 1. In particular, we could support type argument lists that contain at most one variadic type var where the variadic type var is the last element. For example, `Tuple[int, T_va]` would be allowed but `Tuple[T_va, int]` and `Tuple[S_va, T_va]` would be errors. This would allow for removal of prefixes and basic concatenation use cases, and I think it would be relatively straightforward to implement. Support for suffixes is much tougher. And I'm not yet convinced that support for multiple variadic type vars is even theoretically possible in the general case.
-- Eric Traut Contributor to pyright and pylance Microsoft Corp.
On Sat, Jan 23, 2021 at 5:52 AM Matthew Rahtz <mrahtz@google.com> wrote:
[...] **Whether to support both packed and unpacked versions**: funnily enough, supporting only the unpacked version (in the form `Tuple[Ts]`) is what we initially started with. I had to do some digging in the doc history to figure out why we decided to switch. I think what happened was:
Looks like we have come full circle here. :-)
* Someone suggested renaming `Map` to something different to avoid confusion with `typing.Mapping`
* At some point during the discussion, without realising it, I got confused about the difference between the concept of `Map` (`Map[List, Ts] -> Tuple[List[T1], List[T2], ...]`) and the concept of connecting generic classes to variadic type variables (this isn't what we were calling it even back then, but for illustration: `UnpackInto[Ts, Tuple] -> Tuple[T1, T2, ...]`)
* As a result, we renamed `Map` to `Apply` 😱
* But then after I'd read about variadics in Typed Scheme and what `apply` means in Typed Scheme (and also what `apply` did in old versions of Python), I realised that calling `Map` `Apply` made no sense, and renamed it back to `Map`, and introduced `Expand` for the other concept: `Tuple[Expand[Ts]]`.
* Then Lucio pointed out that `Expand` was super verbose and that using a star would be cleaner, which made a lot of sense.
I never followed the Typed Scheme thing. But I've been learning about the corresponding functionality in TypeScript, and it uses `...X` here, which is JavaScript's and TypeScript's spelling of `*X` in various places. I also find the `*` a useful hint for the user: when I see `Tuple[*X]` I immediately know that the length of the tuple is variadic. If I were to see `Tuple[Ts]` I'd have to remember that the naming convention tells me that the tuple is variadic. Readability Counts.
Thus, complexity was born.
To be fair, I also liked the star thing for two reasons:
* It helped to visually differentiate variadic type variables from regular type variables (though Eric, I agree that suffixing with e.g. `_va` or `s` is an easier way to do this)
* It strongly reinforces the idea that a variadic type variable on its own behaves like a `Tuple` - which I liked because then it was intuitive what a variadic type variable on its own 'was' (that is, if `Ts` on its own meant the unpacked version, then a variadic type variable was sort of a free-floating list of types that I didn't have a pre-existing mental concept for, and that seemed confusing to me)
The latter still seems compelling to me - but the counter-arguments are admittedly also significant:
* The complications of i) a grammar change and ii) the introduction of `Unpack`
I think the grammar change is for the best. It values explicitness over magic. `Unpack` is temporary, only needed for backwards compatibility.
* The potential for confusion to users about when to use `Ts` and when to use `*Ts`
If you think about Ts as a sequence it should be clear -- when doing computations with values, users know when to use f(*args) vs. f(args).
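The value-level analogy can be made concrete with a small sketch. The function name `describe` is hypothetical and exists only for illustration. It shows how `f(*args)` spreads a sequence into separate arguments while `f(args)` passes the sequence as one argument, which is the distinction that `*Ts` versus `Ts` would mirror at the type level.

```python
def describe(*args):
    """Return how many positional arguments arrived and the name of each one's type."""
    return len(args), tuple(type(a).__name__ for a in args)


values = (42, "abc")

# Unpacked: the two elements arrive as two separate arguments.
unpacked = describe(*values)

# Packed: the whole tuple arrives as a single argument.
packed = describe(values)

print(unpacked)  # (2, ('int', 'str'))
print(packed)    # (1, ('tuple',))
```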
As much as it pains me to kill my darling, I find myself leaning towards agreeing that maybe it would be better if we dropped the star idea. Thanks for being willing to disagree, Eric, and potentially break us out of a rut here!
I still reserve the right to change my mind once I've tried making the corresponding changes to the PEP, though. Two potential blockers that come to mind straight away are:
1. What should the type of `*args` be?
We could just go back to `*args: Ts`, with a special rule that if `*args` is annotated as being a variadic type variable then the usual rule doesn't apply. Guido, how would you feel about that?
No matter what we do, `*args` needs to be some kind of special case (because PEP 484 screwed up here).
2. What if a function wants to return just the types in the variadic type variable?
Actually, I'm happy to just do `-> Tuple[Ts]`. Does anyone else have a problem with this?
And with the star, we could require writing `-> Tuple[*Ts]`. (Again, EIBTI.) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On 1/22/21 10:22 PM, Eric Traut wrote:
I thought of a way that we could avoid all changes to the grammar and eliminate the need for `Unpack`. [...] We could augment the existing TypeVar constructor to support a parameter called `variadic`. The only downside I see is that the use of the star makes it clear to the reader of the code that the type var is variadic, but I think we could accomplish this through a naming convention like `Ts` or `T_va`.
Here's how that would look:
```python
T_va = TypeVar("T_va", variadic=True)
```
1. I'm a bit uncomfortable introducing a naming convention that makes significant changes to typing interpretation. That feels rather subtle.
2. RE: variadic=True, I'll echo an older comment of mine that "variadic" is a bit of a mouthful and hard to spell. Instead we might consider something less technical like `tuple=True` or `many=True`.

-- David Foster | Seattle, WA, USA Contributor to TypedDict support for mypy
I know I've been a bit of a lurker since this discussion got resurrected but:
2. RE: variadic=True, I'll echo an older comment of mine that "variadic" is a bit of mouthful and hard to spell. Instead we might consider something less technical like `tuple=True` or `many=True`.
The thing I like about a mouthful like "variadic" is that if you encounter it and don't know what it means you know you need to go look it up. If you see "variadic type" a google search is going to lead you to answers; if you try that for "many type" or "tuple type" it's not.
*Jukka, Ivan, Rebecca:* You can skip to your names below.

***Constructor argument naming***
The thing I like about a mouthful like "variadic" is that if you encounter it and don't know what it means you know you need to go look it up.
Hmm, good point. Let's keep this argument in mind if we do decide to use `TypeVar` as the constructor (which I'm still not sure we should).

***Packed vs unpacked***
I'm a bit uncomfortable introducing a naming convention that makes significant changes to typing interpretation. That feels rather subtle.
If I were to see `Tuple[Ts]` I'd have to remember that the naming convention tells me that the tuple is variadic. Readability Counts. I think the grammar change is for the best. It values explicitness over magic.
Hearing this opinion from Guido in particular updates me significantly. Still, I find myself wondering whether the small improvement in readability (of something that's only likely to be used in library code and therefore not terribly user-facing) is worth the cost of updating the parsers of at least CPython, Mypy, pytype and Pyright. The main crux for me here is the exact degree of difficulty. Eric, when you said it would be a 'heavy lift', were you thinking mainly because of the complications of concatenation and Map? Based on your experiments in Pyright so far, how difficult would introducing the new grammar be?

(***Jukka, Ivan, Rebecca***, it would be super useful to hear your thoughts here too. In case a catch-up would be helpful: the question is, if we wanted to make it so that our new 'variadic type variable' was used as `Ts = TypeVarTuple('Ts'); class Foo(Generic[*Ts]): ...; foo: Foo[int, str] = Foo()`, how hard would that be, considering the new use of the star?)

***PEP splitting***

The latest draft of the PEP is: https://github.com/python/peps/pull/1781

I've split `Map` and fancier concatenation off into separate documents ( https://docs.google.com/document/d/1szTVcFyLznoDT7phtT-6Fpvp27XaBw9DmbTLHrB6... and https://docs.google.com/document/d/1sUBlow40J7UwdTSyRYAj34ozkGOlMEjPaVEWeOmM..., respectively, though I haven't cleaned them up yet). I've also tried tentatively rewriting it so that a `TypeVarTuple` behaves as if unpacked by default, eliminating the star. Everything does still seem to work, and admittedly the PEP seems much simpler.

***Concatenation***
Can you clarify what "no concatenation of variadics" refers to? Does this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically about `Tuple[*Ts1, *Ts2]`? (And what about the same constructs inside `Callable[[<here>], R]`?)
I like Eric's proposal of a) only prefixing is allowed, and b) allowing only a single variadic type variable. For `Callable`, I think we shouldn't allow concatenation at all, at least not in this PEP - a) because it's simpler, b) because if we did `Callable[[int, Ts], R]` then the first argument would have to be positional-only, and that feels like it's going to have complications I haven't thought through yet, and c) because I expect most use-cases would be covered by PEP 612. I've updated the draft correspondingly. On Sun, 24 Jan 2021 at 23:33, Naomi Seyfer <naomi@seyfer.org> wrote:
I know I've been a bit of a lurker since this discussion got resurrected but:
2. RE: variadic=True, I'll echo an older comment of mine that "variadic" is a bit of mouthful and hard to spell. Instead we might consider something less technical like `tuple=True` or `many=True`.
The thing I like about a mouthful like "variadic" is that if you encounter it and don't know what it means you know you need to go look it up. If you see "variadic type" a google search is going to lead you to answers; if you try that for "many type" or "tuple type" it's not.
On Mon, Jan 25, 2021 at 9:44 AM Eric Traut <eric@traut.com> wrote:
I agree that "variadic" is a term that casual Python coders may be unfamiliar with, but it's a pretty standard term used in other languages, as opposed to "covariant" and "contravariant", which I had never encountered prior to Python. I also don't think variadic type variables will be used by a typical Python coder. It's a pretty advanced feature. Most Python coders don't use simple (non-variadic) type variables today.
Based on your experiments in Pyright so far, how difficult would introducing the new grammar be?
Introducing the grammar change to allow the star operator within a subscript is easy, just a few dozen lines of new code. The difficult part is with all the error cases this introduces. The star operator is allowed only in the case of variadic type variables. All other uses of a star operator within a subscript are not allowed. A few of these cases can be detected and reported by the parser (e.g. when used in conjunction with slice expressions), but most require semantic information to detect, so the checks will need to be added in many places within the type checker — and presumably the runtime as well. When a new construct introduces many ways to produce new error conditions, my natural instinct is to look for a way to eliminate the possibility of those errors rather than trying to enumerate and plug each of them individually.
The star operator will also require changes beyond the parser and type checker. It will also require updates to completion suggestion and signature help logic.
This is all doable, but it adds significant work across many code bases and will result in many more bugs as we work out all of the kinks and edge cases. I'm not convinced that the readability benefits justify the added complexity. I think naming conventions could work fine here. After all, we've adopted naming conventions to designate covariant and contravariant type variables, and that seems to work fine.
I'm continuing to work on the implementation in pyright (currently on a private branch). Of course, none of this is set in stone — I'm just trying to inform the discussion. Once I get a critical mass of functionality working, I'll merge the changes and give you a chance to play with them in Pyright. I find that it helps to be able to write real code with real tooling when playing with new language constructs.
Here's what I have implemented so far:

* Support for "variadic=True" in TypeVar constructor.
* Support for a variadic TypeVar used at the end of a generic class declaration
* Support for subscripts within type expressions that contain an arbitrary number of type arguments and matching of those type arguments to type parameters when the last type parameter is a variadic
* Support for "()" (empty tuple) notation when used with variadic TypeVar
* Support for "*args: Ts" matching
* Support for zero-length matching
What I haven't done yet:

* Reporting error for bound, variance, or constraints used in conjunction with variadic TypeVar
* Reporting errors for situations where a variadic TypeVar is used in cases where it shouldn't be
* Reporting errors for situations where a variadic TypeVar is not used in cases where it is needed
* Detecting and reporting errors for variadic TypeVar when it's not at the end of a list of TypeVars in a generic class declaration
* Detecting and reporting errors for multiple variadic TypeVars appearing in a generic class declaration
* Support for Union[Ts]
* Support for Tuple[Ts]
* Support for Concatenate[x, y, Ts]
* Variadics in generic type aliases
* Support for open-ended (arbitrary-length) variadics
* Tests for all of the above
I've run across a few additional questions:
1. PEP 484 indicates that if a type argument is omitted from a generic type, that type argument is assumed to be `Any`. What is the assumption with a variadic TypeVar? Should it default to `()` (empty tuple)? If we support open-ended tuples, then we could also opt for `(Any, ...)`.
2. What is the type of `def foo(*args: Ts) -> Union[Ts]` if foo is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
3. When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now consider the expression `foo((3, "hi"), ("hi", 5.6))`. Would this be an error? Or would you expect the constraint solver to produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
4. Along the lines of the previous question, consider the expression `foo((3, "hi"), ("hi", ))`. In this case, the lengths of the tuples don't match. If we don't support open-ended variadics, this needs to be an error. If we support open-ended variadics, we have the option of solving this as `Tuple[int | str, ...]` (or `Tuple[object, ...]`). Once again, it's easiest if we don't allow this and treat it as an error.
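The two policies raised in questions 3 and 4 can be compared with a toy model. This is only a sketch under stated assumptions: runtime `type` objects stand in for static types, a `frozenset` stands in for a union, and the names `solve_strict` and `solve_elementwise` are hypothetical. It is not how Pyright's constraint solver works.

```python
from typing import Optional, Tuple

TypeTuple = Tuple[type, ...]


def solve_strict(a: TypeTuple, b: TypeTuple) -> Optional[TypeTuple]:
    """Return the shared solution, or None (an error) unless both tuples match exactly."""
    return a if a == b else None


def solve_elementwise(a: TypeTuple, b: TypeTuple) -> Optional[Tuple[frozenset, ...]]:
    """Join each position into a set of types (a stand-in for a union).

    Returns None when the lengths differ, i.e. open-ended variadics are not supported.
    """
    if len(a) != len(b):
        return None
    return tuple(frozenset({x, y}) for x, y in zip(a, b))


first = (int, str)      # element types of (3, "hi")
second = (str, float)   # element types of ("hi", 5.6)
short = (str,)          # element types of ("hi",)

strict_result = solve_strict(first, second)        # error under the strict policy
joined_result = solve_elementwise(first, second)   # per-position unions
mismatch_result = solve_elementwise(first, short)  # arity mismatch is an error
```

Under the strict policy both calls from the questions are rejected; under the element-wise policy only the arity mismatch is.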
-- Eric Traut Contributor to Pyright and Pylance Microsoft Corp.
-- S Pradeep Kumar
Replying to Matthew's question:
(**Jukka, Ivan, Rebecca**, it would be super useful to hear your thoughts here too. In case a catch-up would be helpful: the question is, if we wanted to make it so that our new 'variadic type variable' was used as `Ts = TypeVarTuple('Ts'); class Foo(Generic[*Ts]): ...; foo: Foo[int, str] = Foo()`, how hard would that be, considering the new use of the star?)
For pytype in particular, this wouldn't be too bad. We use a Python interpreter in the target Python version to compile source code to bytecode and then analyze the bytecode, so grammar changes aren't any harder to accommodate than other types of changes. (pytype does have a separate stub parser that uses typed_ast at the moment, and we need to instead use the ast module in Python 3.8+ to parse newer syntax, but that's a change we need to make anyway and not specific to this PEP.) On Mon, Jan 25, 2021 at 10:58 AM S Pradeep Kumar <gohanpra@gmail.com> wrote:
Can you clarify what "no concatenation of variadics" refers to? Does this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically about `Tuple[*Ts1, *Ts2]`? (And what about the same constructs inside `Callable[[<here>], R]`?)
Guido: I mean concatenation of multiple variadics (`Tuple[*Ts, *Ts2]`, same within Callable).
The first PEP will support concatenation with a non-variadic prefix and suffix (`Tuple[int, *Ts, str]`, same within Callable).
But I've been learning about the corresponding functionality in TypeScript, and it uses `...X` here, which is JavaScript's and TypeScript's spelling of `*X` in various places. I also find the `*` a useful hint for the user: When I see `Tuple[*X]` I immediately know that the length of the tuple is variadic. If I were to see `Tuple[Ts]` I'd have to remember that the naming convention tells me that the tuple is variadic. Readability Counts.
I strongly agree. `*Ts` is a clear visual analogy to tuple unpacking.
# Reference implementation and examples
I wonder if any of the PEP authors have worked on an implementation yet
Yes, I have been working on a reference implementation from scratch in Pyre.
I added support over the weekend for concatenation of multiple variadics since that's one of the main points of uncertainty in our discussion. Here are some examples that typecheck on my branch:
(1) Basic usage of non-variadic prefix and suffix
```
from typing import Tuple, TypeVar

Ts = TypeVar("Ts", bound=tuple)
Ts2 = TypeVar("Ts2", bound=tuple)

def strip_both_sides(x: Tuple[int, *Ts, str]) -> Ts: ...

def add_bool_int(x: Tuple[*Ts]) -> Tuple[bool, *Ts, int]: ...

def foo(xs: Tuple[int, bool, str]) -> None:
    z = add_bool_int(strip_both_sides(xs))

    # => Tuple[bool, bool, int]
    reveal_type(z)
```
(2) `partial` (using concatenation of multiple variadics):
```
from typing import Callable, Tuple, TypeVar

Ts = TypeVar("Ts", bound=tuple)
Ts2 = TypeVar("Ts2", bound=tuple)

def partial(f: Callable[[*Ts, *Ts2], bool], *args: *Ts) -> Callable[[*Ts2], bool]: ...

def foo(x: int, y: str, z: bool) -> bool: ...

def baz() -> None:
    expects_bool = partial(foo, 1, "hello")

    # Valid.
    expects_bool(True)

    # Error.
    expects_bool()

# Note that this `partial` doesn't support keyword arguments.
```
Other notable features:
+ Variadic classes like Tensor.
+ `*args: *Ts`
+ Concatenation within parameters of Tuples, Callables, and variadic classes.
+ Allowing Tuples to be arbitrarily unpacked: `*Tuple[int, *Ts]`
Other test cases: https://github.com/pradeep90/pyre-check/blob/master/source/analysis/test/int...
Once we settle the debate about TypeVar vs TypeVarTuple and other syntax, I can work on merging this branch into Pyre master and implementing some other details from the PEP (such as `Union[*Ts]`, generic aliases, Map with generic aliases, etc.).
# Key findings
(1) Prefix and suffix work as expected
Handling a suffix of non-variadic parameters `Tuple[*Ts, int]` is essentially the same work as handling a prefix `Tuple[int, *Ts]`.
@Eric Could you share an example of a non-variadic suffix that you felt would be hard to tackle?
I'm basically following the TypeScript algorithm for inferring one variadic tuple type against another ( https://github.com/microsoft/TypeScript/pull/39094).
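To illustrate why a suffix is no harder than a prefix, here is a minimal sketch of matching `Tuple[<prefix>, *Ts, <suffix>]` against a concrete tuple of types. It is neither the TypeScript algorithm nor the Pyre implementation: the helper name `bind_variadic` is hypothetical, and exact type equality stands in for real subtype checks.

```python
from typing import Optional, Sequence, Tuple


def bind_variadic(
    prefix: Sequence[type],
    suffix: Sequence[type],
    actual: Sequence[type],
) -> Optional[Tuple[type, ...]]:
    """Match Tuple[*prefix, *Ts, *suffix] against `actual` and return the binding for Ts.

    Returns None when `actual` is too short or a fixed element does not match.
    """
    n_pre, n_suf = len(prefix), len(suffix)
    if len(actual) < n_pre + n_suf:
        return None
    if tuple(actual[:n_pre]) != tuple(prefix):
        return None
    if tuple(actual[len(actual) - n_suf:]) != tuple(suffix):
        return None
    # Whatever is left between the fixed prefix and suffix is bound to Ts.
    return tuple(actual[n_pre:len(actual) - n_suf])


# Tuple[int, *Ts, str] against Tuple[int, bool, str] binds Ts to (bool,).
middle = bind_variadic([int], [str], [int, bool, str])

# A suffix-only pattern, Tuple[*Ts, int], goes through the same code path.
suffix_only = bind_variadic([], [int], [str, float, int])

# Too few elements to cover both the prefix and the suffix.
too_short = bind_variadic([int], [str], [int])
```

The prefix and suffix are stripped symmetrically, so neither end needs special handling in this model.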
(2) Concatenation of multiple variadics
Concatenation of multiple variadics works as expected when the length of one is unambiguously known.
However, concatenation of multiple variadics is indeed significantly complex and should definitely *not* be part of the initial variadics PEP.
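The "length of one is unambiguously known" case can be sketched as follows, using the `partial` example above: the number of `*args` supplied fixes the length of `Ts`, so `Ts2` is whatever remains. The helper `split_two_variadics` is a hypothetical name, and parameter lists are modelled as plain sequences of runtime types.

```python
from typing import Optional, Sequence, Tuple


def split_two_variadics(
    params: Sequence[type], known_len: int
) -> Optional[Tuple[Tuple[type, ...], Tuple[type, ...]]]:
    """Split the parameters of Callable[[*Ts, *Ts2], R] once len(Ts) is known.

    Returns None when more arguments were supplied than the callable accepts.
    """
    if known_len < 0 or known_len > len(params):
        return None
    return tuple(params[:known_len]), tuple(params[known_len:])


# foo(x: int, y: str, z: bool) with partial(foo, 1, "hello"): two arguments bind Ts.
foo_params = [int, str, bool]
ts, ts2 = split_two_variadics(foo_params, known_len=2)

# Supplying four arguments to a three-parameter callable cannot be solved.
too_many = split_two_variadics(foo_params, known_len=4)
```

Without a known length for one side, every split point would be a candidate solution, which is one way to see where the extra complexity comes from.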
(3) Unpack[Ts] / *Ts didn't pose a big problem
I preprocessed `*Ts` to be `Unpack[Ts]` so that I could uniformly deal with `Unpack`. Pyre uses its own parser, so that's one thing to note.
Also, the `*` syntax is going to be implemented as part of PEP 637, so I don't think the implementation should be a concern for us. Until it is implemented, we do have `Unpack`.
(4) Surprising behavior: variance of Tuple vs Tensor
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...

tuple_int_str: Tuple[int, str]
tuple_int_bool: Tuple[int, bool]
foo(tuple_int_str, tuple_int_bool)
```
We might expect a type error here because `Tuple[int, str] != Tuple[int, bool]`. However, the typechecker actually infers `Ts = Tuple[int, Union[str, bool]]`, which is a perfectly valid solution. This is sound but unintuitive.
This is analogous to what we do for the non-variadic case:
```
def foo(xs: T, ys: T) -> T: ...

some_int: int
some_str: str
foo(some_int, some_str)
```
This is not a type error because `T = Union[int, str]` is a valid solution for T. (Mypy infers `T = object`, but it's the same principle.)
Note, however, that we *do* raise an error with Tensor:
```
def bar(x: Tensor[*Ts], y: Tensor[*Ts]) -> Tensor[*Ts]: ...

xs: Tensor[int, str]
ys: Tensor[bool, bool]

# Error
bar(xs, ys)
```
This is because Tensor is invariant whereas Tuple is covariant.
(FWIW, TypeScript doesn't do the above for either the variadic or non-variadic case. It raises an error if T is passed different types of arguments.)
The Tuple covariance means we don't error in cases where we would expect to. Perhaps we can define variadic tuples as invariant by default.
(5) What if there is an arity mismatch?
Consider the following case (the same case as Eric pointed out :) ).
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...

tuple_int_str: Tuple[int, str]
tuple_bool: Tuple[bool]
foo(tuple_int_str, tuple_bool)
```
We might expect this to error because the argument types have different lengths.
However, Ts = Union[Tuple[int, str], Tuple[bool]] is a valid solution, since `Tuple` is covariant.
`foo` gets treated as:
```
def foo(
    xs: Union[Tuple[int, str], Tuple[bool]],
    ys: Union[Tuple[int, str], Tuple[bool]],
) -> Union[Tuple[int, str], Tuple[bool]]: ...
```
Users might be expecting this to error and might be taken aback, as I was when I tried it out.
I experimented with disallowing a variadic `Ts` from being inferred as having two different lengths and that seemed somewhat more intuitive than the above. Opinions appreciated.
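The "one length per `Ts`" rule from that experiment could be sketched roughly as follows. `ArityMismatch` and `common_arity` are hypothetical names, and the function works on plain tuples of types rather than on a real checker's internal representation.

```python
class ArityMismatch(TypeError):
    """Hypothetical error: one variadic would need two different lengths."""


def common_arity(*element_types):
    """Return the single length shared by all candidate bindings for Ts."""
    lengths = sorted({len(types) for types in element_types})
    if len(lengths) > 1:
        raise ArityMismatch(f"Ts would need lengths {lengths} at once")
    return lengths[0] if lengths else 0


# Same length: accepted, whatever the element types are.
arity = common_arity((int, str), (int, bool))
```

Under this rule the `Tuple[int, str]` / `Tuple[bool]` call above is rejected before any union of tuples is considered.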
# Open questions
(1) What to do about `*Tuple[Any, ...]`?
During the last tensor meeting, we discussed allowing `Tensor[Any, ...]` (and the equivalent `Tensor`) in order to aid gradual typing.
Existing code annotated as `t: Tensor` would treat `Tensor` without parameters as `Tensor[Any, ...]`. That would be a Tensor with arbitrary rank and `Any` as the dimension type. This way, changing `class Tensor` to be a variadic wouldn't immediately break existing code.
I'm yet to implement this, so I'll look into how this affects type inference.
The same goes for `*Iterable[int]`, if indeed that is feasible.
(2) What to do about `*Union[...]`?
If `Ts` is a type variable bound by `tuple`, then `Ts = Union[Tuple[int, str], Tuple[bool]]` is a valid assignment. We then have to consider what unpacking that means.
TypeScript allows this:
When the type argument for T is a union type, the union is spread over the tuple type. For example, [A, ...T, B] instantiated with X | Y | Z as the type argument for T yields a union of instantiations of [A, ...T, B] with X, Y and Z as the type argument for T respectively.
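As a rough illustration of that distributive behaviour in Python terms, the sketch below builds one tuple type per union member. `spread_union` is a hypothetical helper, and the union's members are passed in explicitly as tuples of element types; nothing here is proposed syntax.

```python
from typing import Tuple, Union


def spread_union(prefix, alternatives, suffix):
    """Build Union[Tuple[*prefix, *alt, *suffix], ...] for each alternative."""
    instantiations = tuple(
        Tuple[(*prefix, *alt, *suffix)] for alt in alternatives
    )
    # Subscripting Union with a tuple is the same as listing its members.
    return Union[instantiations]


# Ts = Union[Tuple[int, str], Tuple[bool]] spread into Tuple[float, *Ts, bytes]
spread = spread_union((float,), [(int, str), (bool,)], (bytes,))
```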
******
I'll think about these questions a bit more over the next couple of weeks and update the PEP. We can discuss these in detail during the next tensor typing meeting.
Best,
On Mon, Jan 25, 2021 at 9:44 AM Eric Traut <eric@traut.com> wrote:
I agree that "variadic" is a term that casual Python coders may be unfamiliar with, but it's a pretty standard term used in other languages, as opposed to "covariant" and "contravariant", which I had never encountered prior to Python. I also don't think variadic type variables will be used by a typical Python coder. It's a pretty advanced feature. Most Python coders don't use simple (non-variadic) type variables today.
Based on your experiments in Pyright so far, how difficult would introducing the new grammar be?
Introducing the grammar change to allow the star operator within a subscript is easy, just a few dozen lines of new code. The difficult part is with all the error cases this introduces. The star operator is allowed only in the case of variadic type variables. All other uses of a star operator within a subscript are not allowed. A few of these cases can be detected and reported by the parser (e.g. when used in conjunction with slice expressions), but most require semantic information to detect, so the checks will need to be added in many places within the type checker — and presumably the runtime as well. When a new construct introduces many ways to produce new error conditions, my natural instinct is to look for a way to eliminate the possibility of those errors rather than trying to enumerate and plug each of them individually.
The star operator will also require changes beyond the parser and type checker. It will also require updates to completion suggestion and signature help logic.
This is all doable, but it adds significant work across many code bases and will result in many more bugs as we work out all of the kinks and edge cases. I'm not convinced that the readability benefits justify the added complexity. I think naming conventions could work fine here. After all, we've adopted naming conventions to designate covariant and contravariant type variables, and that seems to work fine.
I'm continuing to work on the implementation in pyright (currently on a private branch). Of course, none of this is set in stone — I'm just trying to inform the discussion. Once I get a critical mass of functionality working, I'll merge the changes and give you a chance to play with them in Pyright. I find that it helps to be able to write real code with real tooling when playing with new language constructs.
Here's what I have implemented so far:
* Support for "variadic=True" in TypeVar constructor.
* Support for a variadic TypeVar used at the end of a generic class declaration
* Support for subscripts within type expressions that contain an arbitrary number of type arguments and matching of those type arguments to type parameters when the last type parameter is a variadic
* Support for "()" (empty tuple) notation when used with variadic TypeVar
* Support for "*args: Ts" matching
* Support for zero-length matching

What I haven't done yet:
* Reporting error for bound, variance, or constraints used in conjunction with variadic TypeVar
* Reporting errors for situations where a variadic TypeVar is used in cases where it shouldn't be
* Reporting errors for situations where a variadic TypeVar is not used in cases where it is needed
* Detecting and reporting errors for variadic TypeVar when it's not at the end of a list of TypeVars in a generic class declaration
* Detecting and reporting errors for multiple variadic TypeVars appearing in a generic class declaration
* Support for Union[Ts]
* Support for Tuple[Ts]
* Support for Concatenate[x, y, Ts]
* Variadics in generic type aliases
* Support for open-ended (arbitrary-length) variadics
* Tests for all of the above
I've run across a few additional questions:
1. PEP 484 indicates that if a type argument is omitted from a generic type, that type argument is assumed to be `Any`. What is the assumption with a variadic TypeVar? Should it default to `()` (empty tuple)? If we support open-ended tuples, then we could also opt for `(Any, ...)`.
2. What is the type of `def foo(*args: Ts) -> Union[Ts]` if foo is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
3. When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
4. Along the lines of the previous question, consider the expression `foo((3, "hi"), ("hi", ))`. In this case, the lengths of the tuples don't match. If we don't support open-ended variadics, this needs to be an error. If we support open-ended variadics, we have the option of solving this as `Tuple[int | str, ...]` (or `Tuple[object, ...]`). Once again, it's easiest if we don't allow this and treat it as an error.
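The "treat it as an error" option in questions 3 and 4 can be pictured as binding `Ts` from the first argument and rejecting any later argument that disagrees. This is only a sketch under that assumption: `element_types` and `bind_strict` are hypothetical helpers that look at runtime values, whereas a real checker would compare static types.

```python
def element_types(value):
    """Approximate the static tuple type by the runtime type of each item."""
    return tuple(type(item) for item in value)


def bind_strict(a, b):
    """Bind Ts from `a`, then require `b` to match it exactly."""
    bound = element_types(a)
    seen = element_types(b)
    if seen != bound:
        raise TypeError(f"Ts already bound to {bound}, got {seen}")
    return bound


ok = bind_strict((3, "hi"), (4, "yo"))
```

Both `foo((3, "hi"), ("hi", 5.6))` and `foo((3, "hi"), ("hi", ))` fail under this rule, the first on element types and the second on length.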
-- Eric Traut
Contributor to Pyright and Pylance
Microsoft Corp.
-- S Pradeep Kumar
(Last response for the night.) On Mon, Jan 25, 2021 at 10:58 AM S Pradeep Kumar <gohanpra@gmail.com> wrote:
[...] (5) What if there is an arity mismatch?
Consider the following case (the same case as Eric pointed out :) ).
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...

tuple_int_str: Tuple[int, str]
tuple_bool: Tuple[bool]
foo(tuple_int_str, tuple_bool)
```
We might expect this to error because the argument types have different lengths.
However, Ts = Union[Tuple[int, str], Tuple[bool]] is a valid solution, since `Tuple` is covariant.
`foo` gets treated as:
```
def foo(
    xs: Union[Tuple[int, str], Tuple[bool]],
    ys: Union[Tuple[int, str], Tuple[bool]],
) -> Union[Tuple[int, str], Tuple[bool]]: ...
```
Users might be expecting this to error and might be taken aback, as I was when I tried it out.
I experimented with disallowing a variadic `Ts` from being inferred as having two different lengths and that seemed somewhat more intuitive than the above. Opinions appreciated.
The issue here is, in general, whether you want to solve to Union or not. In mypy we generally don't, but then we end up solving to object. However, here we can't solve to object (it must at least be a Tuple), so I like the error.
# Open questions
(1) What to do about `*Tuple[Any, ...]`?
During the last tensor meeting, we discussed allowing `Tensor[Any, ...]` (and the equivalent `Tensor`) in order to aid gradual typing.
Existing code annotated as `t: Tensor` would treat `Tensor` without parameters as `Tensor[Any, ...]`. That would be a Tensor with arbitrary rank and `Any` as the dimension type. This way, changing `class Tensor` to be a variadic wouldn't immediately break existing code.
I'm yet to implement this, so I'll look into how this affects type inference.
The same goes for `*Iterable[int]`, if indeed that is feasible.
As I wrote, the default is actually as many copies of Any as are needed to make the type valid. But the `...` notation is *only* valid for Tuple, not for any other generic classes, so that syntax is not literally valid. However, I agree that this is what omitted parameters should be taken to mean.
(2) What to do about `*Union[...]`?
If `Ts` is a type variable bound by `tuple`, then `Ts = Union[Tuple[int, str], Tuple[bool]]` is a valid assignment. We then have to consider what unpacking that means.
TypeScript allows this:
When the type argument for T is a union type, the union is spread over the tuple type. For example, [A, ...T, B] instantiated with X | Y | Z as the type argument for T yields a union of instantiations of [A, ...T, B] with X, Y and Z as the type argument for T respectively.
Okay, so it means `[A, ...X, B] | [A, ...Y, B] | [A, ...Z, B]`. And I guess your (slightly cryptic, or condensed) question was about the instantiation of Ts from a union of tuples. It makes sense that a distributive law applies here.
--
--Guido van Rossum (python.org/~guido)
Too many messages indeed :) To the extent that I get to break stalemates by being The Main Author Of The PEP:

***Packed vs unpacked***
OK, let's go with a variadic type variable referring to the packed version by default (e.g. `Ts == Tuple[T1, T2, ...]`) and using a star to unpack. Readability Counts and Explicit Is Better Than Implicit.
I **think** it should also be fine if we only allow unpacked usages (that is, all usages of variadic type variable instances must be starred), at least as of this PEP. Will update once I've tried this in the PEP.

***Constructor***
Let's go with `TypeVarTuple`. The crux for me is still that we might want to implement special behaviour for bounds/constraints/variance later on, and using a different constructor gives us more flexibility.
(I'm happy to re-open discussion on these two things if we find important new arguments, but I also don't want us to bikeshed **too** much - so by default I'd like to consider these closed.)

***Implementation***
Sorry, Pradeep, for forgetting to mention you'd been working on an implementation in Pyre. Absolutely cracking work - especially going above and beyond to work on it over a weekend! And you too, Eric - this is super helpful stuff, thank you! And thanks for clarifying, Rebecca :)

***Open-endedness***
That is, should it be possible to bind `Ts` to e.g. `Tuple[int, ...]`?
Guido, you're right: Eric raised this question in the thread at https://mail.python.org/archives/list/typing-sig@python.org/thread/SQVTQYWIO....
I originally said no, because it seemed too complicated:
a) In general, what would the result of unpacking `Ts` be?
b) It introduces extra complications to the behaviour of `Union[*Ts]` - see the thread for more details.
But now I'm wondering. It seems pretty crucial that `Ts` should be bound to `Tuple[Any, ...]` when no type parameters are specified (i.e. `class Tensor(Generic[*Shape]): ...; Tensor == Tensor[Any, ...]`).
There's also Guido's argument that
```
def foo(*args: *T) -> Tuple[*T]:
    return args

def bar(*x: *int):
    y = foo(*x)
```
should be perfectly fine at runtime, so it would be weird if the type checker disallowed it.
Then again, in Eric's example:
```
def foo(a: Tuple[*Ts], b: Tuple[*Ts]): ...

foo((3, "hi"), ("hi", ))
```
My intuition is strongly that this should be an error. Maybe I'm saying this just because I care most about the numerical computing use-case?
```
def both_arguments_must_have_same_rank(x1: Tensor[*Shape], x2: Tensor[*Shape]) -> Tensor[*Shape]: ...

x1: Tensor[Batch]
x2: Tensor[Time, Batch]
both_arguments_must_have_same_rank(x1, x2)  # NOOOOOO
```
So overall I lean towards saying, no, we shouldn't allow it - `Tuple[Any, ...]` is the single exception. (Anyway, if we do find use-cases for open-endedness which are important, we can add it in a later PEP, right?)

***Specific questions***
I'm also collecting everyone's responses to these at https://docs.google.com/document/d/1MhVBFRtqVSnFqpeEu9JmZSohZE1mELOw04wcObwT... so we have a central point of reference for all the arguments relevant to each question. I'll also clarify these in the PEP.
In the expression `x[A, *B]`, what value will be passed to the `__getitem__` method for instance `x`? Will the runtime effectively replace `*B` with `typing.Unpack[B]`?
Given that we're intending `Unpack` to be a stopgap, I'd feel uncomfortable relying on it. My first instinct would be to pass the `TypeVarTuple` instance with an attribute `B._unpacked = True`. That would preserve the most information, giving `x` access to everything inside the `TypeVarTuple`.
Will star expressions be allowed in slice expressions?
I concur: nope.
Am I correct in assuming that it's not OK to pass zero arguments to a `*args` parameter that has a variadic TypeVar annotation (`*args: *Ts`)?
I agree with Guido: it **should** be valid to pass zero arguments to `*args: *Ts`. Generally, it **should** be valid for a `TypeVarTuple` to be empty: we should be able to represent rank-0 tensors (that is, a scalar, which in TensorFlow and NumPy can still be an array object), and the natural candidate is `Tensor[()]`.
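For a feel of what `Tensor[()]` means at runtime, here is a toy sketch. The `Tensor` and `TensorAlias` classes below are stand-ins written for this illustration, not the PEP's runtime design; the only fact relied on is that `X[()]` passes an empty tuple to `__class_getitem__`.

```python
class TensorAlias:
    """Toy record of the type arguments a Tensor was subscripted with."""

    def __init__(self, params):
        self.params = params

    @property
    def rank(self):
        return len(self.params)


class Tensor:
    """Stand-in for a variadic generic; not a real array type."""

    def __class_getitem__(cls, params):
        # A single argument arrives bare; several (or none) arrive as a tuple.
        if not isinstance(params, tuple):
            params = (params,)
        return TensorAlias(params)


scalar = Tensor[()]        # rank-0: the empty tuple of dimensions
matrix = Tensor[int, str]  # rank-2, with placeholder dimension types
```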
I presume that it should be an error to pass an arbitrary-length list of arguments to an `*args` parameter if it has a variadic `TypeVar` annotation?
In line with my tentative view on open-endedness above, I think that yes, this should be an error.
PEP 484 indicates that if a type argument is omitted from a generic type, that type argument is assumed to be `Any`. What is the assumption with a variadic `TypeVar`? Should it default to `()` (empty tuple)? If we support open-ended tuples, then we could also opt for `(Any, ...)`.
As Guido and Pradeep say, I also think `(Any, ...)` is the right choice.
What is the type of `def foo(*args: Ts) -> Union[Ts]` if `foo` is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
This should be an error, following the behaviour of `Union` at runtime (try doing `Union[()]`).
When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
I think this should be an error, so that my `both_arguments_must_have_same_rank` example works.
Will think about the rest tomorrow :)
On Tue, 26 Jan 2021 at 06:05, Guido van Rossum <guido@python.org> wrote:
[...]
On Tue, Jan 26, 2021 at 1:29 PM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
[...] ***Open-endedness***
That is, should it be possible to bind `Ts` to e.g. `Tuple[int, ...]`?
Guido, you're right: Eric raised this question in the thread at https://mail.python.org/archives/list/typing-sig@python.org/thread/SQVTQYWIO... .
That links to a whole thread. I trust that Eric asked about this somewhere down the thread. :-)
I originally said no, because it seemed too complicated: a) In general, what would the result of unpacking `Ts` be? b) It introduces extra complications to the behaviour of `Union[*Ts]` - see the thread for more details.
Why do you care about the result of unpacking? That seems to be an artifact of how you implement type checking. But isn't type checking all about manipulating abstractions? ISTM that you can define all important operations on this just fine (Union[*Ts] would be Union[int, int, int, . . .] which is clearly int).
But now I'm wondering. It seems pretty crucial that `Ts` should be bound to `Tuple[Any, ...]` when no type parameters are specified (i.e. `class Tensor(Generic[*Shape]): ...; Tensor == Tensor[Any, ...]`). There's also Guido's argument that
```
def foo(*args: *T) -> Tuple[*T]:
    return args

def bar(*x: *int):
    y = foo(*x)
```
should be perfectly fine at runtime, so it would be weird if the type checker disallowed it.
Then again, in Eric's example:
```
def foo(a: Tuple[*Ts], b: Tuple[*Ts]): ...

foo((3, "hi"), ("hi", ))
```
My intuition is strongly that this should be an error. Maybe I'm saying this just because I care most about the numerical computing use-case?
Did you just change the subject? That (Eric's) example doesn't seem to have anything to do with Tuple[int, ...]. And it is no stranger than this:
```
def foo(a: T, b: T): ...

foo(3, "hi")  # T becomes object
```
```
def both_arguments_must_have_same_rank(x1: Tensor[*Shape], x2: Tensor[*Shape]) -> Tensor[*Shape]: ...

x1: Tensor[Batch]
x2: Tensor[Time, Batch]
both_arguments_must_have_same_rank(x1, x2)  # NOOOOOO
```
I thought I explained that in one of my messages last night:
I'd prefer it if I could think of e.g. `def foo(a: Tuple[Ts], b: Tuple[Ts])` as a series of overloads including `def foo(a: Tuple[T1, T2], b: Tuple[T1, T2])`. That should answer the question, right? Ts stands for `(T1, T2, …, Tn)` for some n (we seem to have an issue about whether n can be zero). If different checkers produce different answers for the latter, e.g. due to different attitudes about unions, that's okay, but checkers should be consistent with themselves.
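The "series of overloads" reading can be written out with today's `typing.overload` for the first two arities. This is a sketch of the mental model only: the return annotations follow Eric's original example, and the runtime body that returns `a` is an arbitrary placeholder.

```python
from typing import Tuple, TypeVar, overload

T1 = TypeVar("T1")
T2 = TypeVar("T2")


# n = 1: both tuples must have exactly one element.
@overload
def foo(a: Tuple[T1], b: Tuple[T1]) -> Tuple[T1]: ...


# n = 2: both tuples must have exactly two elements.
@overload
def foo(a: Tuple[T1, T2], b: Tuple[T1, T2]) -> Tuple[T1, T2]: ...


def foo(a, b):
    # Placeholder implementation; only the signatures matter here.
    return a


result = foo((3, "hi"), (4, "yo"))  # matches the n = 2 overload
```

On this reading, a call such as `foo((3, "hi"), ("hi", ))` should match no overload, because every overload fixes one length for both arguments. How each element is solved within a given overload (union, `object`, or an error) is left to the checker.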
Okay, reading that back it's less clear than I remembered it, but I'm basically arguing that the tuples need to have the same length. (And that should be provable, so e.g. with concatenation we agree that `[int, *Ts] == [int, *Ts]` but we treat `[int, *Ts1]` and `[int, *Ts2]` as different.)
So overall I lean towards saying, no, we shouldn't allow it - `Tuple[Any, ...]` is the single exception. (Anyway, if we do find use-cases for open-endedness which are important, we can add it in a later PEP, right?)
Sure, so I'm okay if you add words to the PEP explicitly stating that we're ruling out such cases because they appear too complicated to implement.
***Specific questions***
I'm also collecting everyone's responses to these at https://docs.google.com/document/d/1MhVBFRtqVSnFqpeEu9JmZSohZE1mELOw04wcObwT... so we have a central point of reference for all the arguments relevant to each question. I'll also clarify these in the PEP.
In the expression `x[A, *B]`, what value will be passed to the `__getitem__` method for instance `x`? Will the runtime effectively replace `*B` with `typing.Unpack[B]`?
Given that we're intending `Unpack` to be a stopgap, I'd feel uncomfortable relying on it. My first instinct would be to pass the `TypeVarTuple` instance with an attribute `B._unpacked = True`. That would preserve the most information, giving `x` access to everything inside the `TypeVarTuple`.
I would recommend creating a helper class (at runtime) that wraps the original TypeVar. Here's a prototype (with a dummy class TypeVar):
```
class TypeVar:
    def __init__(self, name):
        self.name = name

    def __iter__(self):
        # Star-unpacking a TypeVar yields a single wrapper object.
        yield TypeVarIter(self)

class TypeVarIter:
    def __init__(self, tv):
        self.tv = tv

    def __repr__(self):
        return f"*{self.tv.name}"

Ts = TypeVar("Ts")
a = tuple[(int, *Ts)]
print(a)  # tuple[int, *Ts]
```
[...]
What is the type of `def foo(*args: Ts) -> Union[Ts]` if `foo` is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
This should be an error, following the behaviour of `Union` at runtime (try doing `Union[()]`).
Or (as I said last night) we could return NoReturn. (Honestly I think Union[()] could return that too -- I believe that the typing.py module is overzealous in the amount of error checking it is doing.)
When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
I think this should be an error, so that my `both_arguments_must_have_same_rank` example works.
But see my remark above -- I think it could use Tuple[object, object] or perhaps a tuple of two unions.
--
--Guido van Rossum (python.org/~guido)
***Reply to Guido***
Why do you care about the result of unpacking?
Thinking this through... I guess my main concern would be the issue that I think Eric brought up, concatenation. I'm reasonably convinced it could be done in a way that was sound, but it would be another detail to add to the PEP, which I'm trying to keep minimal. Also, now that I think about it, if we did allow concatenation, I'm not sure the ellipsis notation would remain intuitive: if we had e.g. `Tuple[int, ..., str]`, my personal intuition is that it would mean "An `int`, zero or more arbitrary types, then a `str`" rather than "An arbitrary number of `int`s, then a `str`".
ISTM that you can define all important operations on this just fine
Eric pointed out at https://mail.python.org/archives/list/typing-sig@python.org/message/25INUZ5K... that we'd need 2 rules: if `Ts == Tuple[T, ...]`, then `Union[*Ts]` collapses to `T`, whereas if `Ts == Tuple[T1, T2, T3]` then `Union[*Ts]` collapses to a union of the individual subtypes with literals stripped. Here too, although this seems sound, it's more just that I don't want to add too many of these kinds of subtle details to at least this initial PEP.
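The collapsing behaviour behind both rules can be observed with `typing.Union` at runtime. In the sketch below, `union_of_elements` is a hypothetical helper, an open-ended `Tuple[T, ...]` is modelled crudely as a few repeated copies of `T`, and the literal-stripping part of Eric's second rule is left out.

```python
from typing import Union


def union_of_elements(element_types):
    """Model Union[*Ts] for a Ts with the given element types."""
    # Subscripting Union with a tuple is the same as listing its members;
    # duplicate members are removed automatically.
    return Union[tuple(element_types)]


# Ts == Tuple[int, ...], modelled as some repeated number of ints.
homogeneous = union_of_elements([int, int, int])

# Ts == Tuple[int, str, float]
heterogeneous = union_of_elements([int, str, float])
```

In the homogeneous case the union collapses to the single element type, matching Guido's remark that `Union[int, int, int, ...]` is clearly `int`.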
Did you just change the subject?
Oh, sorry, I see how that was unclear. The connection was that, if we did allow 'open-ended' type variables, then it's possible for Eric's example
```
def foo(a: Tuple[*Ts], b: Tuple[*Ts]): ...

foo((3, "hi"), ("hi", ))
```
to type-check fine with `Ts == Tuple[int | str, ...]`.
but I'm basically arguing that the tuples need to have the same length
Cool, I agree :)
Or (as I said last night) we could return NoReturn.
*shrug* Sounds fine.
But see my remark above -- I think it could use `Tuple[object, object]` or perhaps a tuple of two unions.
Oh, I see what you mean. I'm not sure how to phrase this - I'm a bit out of my depth when it comes to constraint-solving - but what I'd intuitively expect to happen is for the type checker to see the first argument, bind `Ts` to `Tuple[int, str]`, then see the second argument, realise its type is different than what `Ts` has already been bound to, and throw an error. Is it possible to set things up like this? Or are there reasons that in general we should try and solve for the most general type possible?
Actually, gosh, this is really worth clarifying. In your example:
```
def foo(a: T, b: T): ...

foo(3, "hi")  # T becomes object
```
I see what you mean - that `T` being `object` is a valid solution to this - but at the same time it horrifies me - it definitely doesn't feel like that should be a solution. If I had written that code, I'd be trying to enforce that `a` and `b` are the same type. It looks like pytype complains about this example, but Mypy doesn't (`reveal_type` says `a` and `b` are both 'T`-1'; not sure what that means).
***Eric/Pradeep***, what's the right way to phrase this in the PEP? Is the intuitive behaviour I'm gesturing to what "solving for the individual element of the tuple independently" means?
On Wed, 27 Jan 2021 at 00:56, Guido van Rossum <guido@python.org> wrote:
On Tue, Jan 26, 2021 at 1:29 PM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
[...] ***Open-endedness***
That is, should it be possible to bind `Ts` to e.g. `Tuple[int, ...]`?
Guido, you're right: Eric raised this question in the thread at https://mail.python.org/archives/list/typing-sig@python.org/thread/SQVTQYWIO... .
That links to a whole thread. I trust that Eric asked about this somewhere down the thread. :-)
I originally said no, because it seemed too complicated: a) In general, what would the result of unpacking `Ts` be? b) It introduces extra complications to the behaviour of `Union[*Ts]` - see the thread for more details.
Why do you care about the result of unpacking? That seems to be an artifact of how you implement type checking. But isn't type checking all about manipulating abstractions? ISTM that you can define all important operations on this just fine (Union[*Ts] would be Union[int, int, int, . . .] which is clearly int).
But now I'm wondering. It seems pretty crucial that `Ts` should be bound to `Tuple[Any, ...]` when no type parameters are specified (i.e. `class Tensor(Generic[*Shape]): ...; Tensor == Tensor[Any, ...]`). There's also Guido's argument that
``` def foo(*args: *T) -> Tuple[*T]: return args def bar(*x: *int): y = foo(*x) ```
should be perfectly fine at runtime, so it would be weird if the type checker disallowed it.
Then again, in Eric's example:
``` def foo(a: Tuple[*Ts], b: Tuple[*Ts]): ...
foo((3, "hi"), ("hi", )) ```
My intuition is strongly that this should be an error. Maybe I'm saying this just because I care most about the numerical computing use-case?
Did you just change the subject? That (Eric's) example doesn't seem to have anything to do with Tuple[int, ...]. And it is no stranger than this:
``` def foo(a: T, b: T): ...
foo(3, "hi") # T becomes object ```
```
def both_arguments_must_have_same_rank(x1: Tensor[*Shape], x2: Tensor[*Shape]) -> Tensor[*Shape]: ...
x1: Tensor[Batch] x2: Tensor[Time, Batch] both_arguments_must_have_same_rank(x1, x2) # NOOOOOO ```
I thought I explained that in one of my messages last night:
I'd prefer it if I could think of e.g. `def foo(a: Tuple[Ts], b:
Tuple[Ts])` as a series of overloads including `def foo(a: Tuple[T1, T2], b: Tuple[T1, T2])`. That should answer the question, right? Ts stands for `(T1, T2, …, Tn)` for some n (we seem to have an issue about whether n can be zero). If different checkers produce different answers for the latter, e.g. due to different attitudes about unions, that's okay, but checkers should be consistent with themselves.
Okay, reading that back it's less clear than I remembered it, but I'm basically arguing that the tuples need to have the same length. (And that should be provable, so e.g. with concatenation we agree that `[int, *Ts] == [int, *Ts]` but we treat `[int, *Ts1]` and `[int, *Ts2]` as different.)
So overall I lean towards saying, no, we shouldn't allow it - `Tuple[Any,
...]` is the single exception. (Anyway, if we do find use-cases for open-endedness which are important, we can add it in a later PEP, right?)
Sure, so I'm okay if you add words to the PEP explicitly stating that we're ruling out such cases because they appear too complicated to implement.
***Specific questions***
I'm also collecting everyone's responses to these at https://docs.google.com/document/d/1MhVBFRtqVSnFqpeEu9JmZSohZE1mELOw04wcObwT... so we have a central point of reference for all the arguments relevant to each question. I'll also clarify these in the PEP.
In the expression `x[A, *B]`, what value will be passed to the `__getitem__` method for instance `x`? Will the runtime effectively replace `*B` with `typing.Unpack[B]`?
Given that we're intending `Unpack` to be a stopgap, I'd feel uncomfortable relying on it. My first instinct would be to pass the `TypeVarTuple` instance with an attribute `B._unpacked = True`. That would preserve the most information, giving `x` access to everything inside the `TypeVarTuple`.
I would recommend creating a helper class (at runtime) that wraps the original TypeVar. Here's a prototype (with a dummy class TypeVar):
```
class TypeVar:
    def __init__(self, name):
        self.name = name

    def __iter__(self):
        yield TypeVarIter(self)

class TypeVarIter:
    def __init__(self, tv):
        self.tv = tv

    def __repr__(self):
        return f"*{self.tv.name}"

Ts = TypeVar("Ts")
a = tuple[(int, *Ts)]
print(a)  # tuple[int, *Ts]
```
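To make the `__getitem__` question above concrete, here is a minimal runtime sketch that follows the wrapper idea. The names `FakeTypeVarTuple`, `UnpackedMarker` and `Recorder` are hypothetical stand-ins invented for illustration, not anything from `typing`. The subscript uses a parenthesized tuple so it also runs on Python versions without star-in-subscript syntax.

```python
class UnpackedMarker:
    """Hypothetical wrapper standing in for an unpacked variadic type variable."""

    def __init__(self, tv):
        self.tv = tv

    def __repr__(self):
        return f"*{self.tv.name}"


class FakeTypeVarTuple:
    """Dummy stand-in for illustration; not the real typing.TypeVarTuple."""

    def __init__(self, name):
        self.name = name

    def __iter__(self):
        # Star-unpacking calls __iter__, so `*Ts` yields exactly one marker.
        yield UnpackedMarker(self)


class Recorder:
    """Returns whatever key the subscript expression passes to __getitem__."""

    def __getitem__(self, key):
        return key


Ts = FakeTypeVarTuple("Ts")
x = Recorder()
key = x[(int, *Ts)]  # key is a plain tuple: (int, <marker wrapping Ts>)
```

Under this assumption `x` receives an ordinary tuple whose second element still points back at the original type variable through `key[1].tv`, so no information is lost.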
[...]
What is the type of `def foo(*args: Ts) -> Union[Ts]` if `foo` is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
This should be an error, following the behaviour of `Union` at runtime (try doing `Union[()]`).
Or (as I said last night) we could return NoReturn. (Honestly I think Union[()] could return that too -- I believe that the typing.py module is overzealous in the amount of error checking it is doing.)
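The runtime behaviour referred to above can be probed with a small sketch. The helper name `empty_union_is_rejected` is made up for illustration. The expectation that `Union[()]` raises `TypeError` reflects CPython's `typing` module as discussed in this thread and may differ on other implementations.

```python
from typing import Union


def empty_union_is_rejected() -> bool:
    """Return True if subscripting Union with an empty tuple raises TypeError."""
    try:
        Union[()]
    except TypeError:
        return True
    return False


rejected = empty_union_is_rejected()
```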
When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
I think this should be an error, so that my `both_arguments_must_have_same_rank` example works.
But see my remark above -- I think it could use Tuple[object, object] or perhaps a tuple of two unions.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Wed, Jan 27, 2021 at 4:43 AM Matthew Rahtz <mrahtz@google.com> wrote:
***Reply to Guido***
Why do you care about the result of unpacking?
Thinking this through...I guess my main concern would be the issue that I think Eric brought up, concatenation. I'm reasonably convinced it could be done in a way that was sound, but it would be another detail to add to the PEP, which I'm trying to keep minimal.
Sure. But you do have to state explicitly that this is not expected to work.
Also, now that I think about it, if we did allow concatenation, I'm not sure the ellipsis notation would remain intuitive: if we had e.g. `Tuple[int, ..., str]`, my personal intuition is that it would mean "An `int`, zero or more arbitrary types, then a `str`" rather than "An arbitrary number of `int`s, then a `str`".
But then your intuition would be wrong for the meaning of `Tuple[int, ...]` as well. (In two ways actually -- the empty tuple is a member of that type, and the `...` stands for 0 or more times `int`.)
ISTM that you can define all important operations on this just fine
Eric pointed out at https://mail.python.org/archives/list/typing-sig@python.org/message/25INUZ5K... that we'd need 2 rules: if `Ts == Tuple[T, ...]`, then `Union[*Ts]` collapses to `T`, whereas if `Ts == Tuple[T1, T2, T3]` then `Union[*Ts]` collapses to a union of the individual subtypes with literals stripped. Here too, although this seems sound, it's more just that I don't want to add too many of these kinds of subtle details to at least this initial PEP.
Subtle details is why we have PEPs. :-) The weirdness is really that literals are stripped (which I presume means widened to their base type, i.e. `Literal[1]` -> `int` and `Literal[1, ""]` -> `Union[int, str]`).
Did you just change the subject?
Oh, sorry, I see how that was unclear. The connection was that, if we did allow 'open-ended' type variables, then it's possible for Eric's example
```
def foo(a: Tuple[*Ts], b: Tuple[*Ts]): ...

foo((3, "hi"), ("hi", ))
```
to type-check fine with `Ts == Tuple[int | str, ...]`.
but I'm basically arguing that the tuples need to have the same length
Cool, I agree :)
Or (as I said last night) we could return NoReturn.
*shrug* Sounds fine.
But see my remark above -- I think it could use `Tuple[object, object]` or perhaps a tuple of two unions.
Oh, I see what you mean.
I'm not sure how to phrase this - I'm a bit out of my depth when it comes to constraint-solving - but what I'd intuitively expect to happen is for the type checker to see the first argument, bind `Ts` to `Tuple[int, str]`, then see the second argument, realise its type is different from what `Ts` has already been bound to, and throw an error. Is it possible to set things up like this? Or are there reasons that in general we should try and solve for the most general type possible?
No, that's not how type variables work at all! When a type checker sees multiple arguments using the same type variable it does a "solve" operation which tries to find a common type. That common type may well be object, and in some type checkers (not mypy) it may be a union. The solve operation is important for class hierarchies, e.g.
```
class A: ...
class B(A): ...
class C(A): ...

T = TypeVar("T")

def f(a: T, b: T) -> T: ...

x = f(B(), C())  # inferred type is A
```
To avoid this you have to use `TypeVar(bound=...)`. This is another reason why I'm not too happy with leaving out all function bodies in examples -- if you have a function like f() in this example, there's not much you can do with the arguments other than print them. E.g. if you wanted to return a+b, you would have to specify as the bound a protocol or concrete type that defines an `__add__` operation, and then a call like `f(1, "")` would be rejected.
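The "solve" step described above can be illustrated with a toy join over runtime classes. This is only a sketch under simplifying assumptions: it walks the MRO of the first class, and the helper name `join` is hypothetical. Real type checkers work on static types, handle protocols and unions, and may pick different answers.

```python
def join(*classes: type) -> type:
    """Return the most specific class in the first argument's MRO that is a base of all arguments."""
    first, *rest = classes
    for candidate in first.__mro__:
        if all(issubclass(other, candidate) for other in rest):
            return candidate
    return object  # unreachable for new-style classes; every MRO ends in object


class A: ...
class B(A): ...
class C(A): ...


joined_siblings = join(B, C)       # the common base of B and C
joined_unrelated = join(int, str)  # nothing in common except object
```

This mirrors the two cases in the discussion: siblings in a class hierarchy solve to their shared base, while unrelated types such as `int` and `str` fall back to `object`.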
Actually, gosh, this is really worth clarifying. In your example:
```
def foo(a: T, b: T): ...

foo(3, "hi")  # T becomes object
```
I see what you mean - that `T` being `object` is a valid solution to this - but at the same time it horrifies me - it definitely doesn't feel like that should be a solution. If I had written that code, I'd be trying to enforce that `a` and `b` are the same type. It looks like pytype complains about this example, but Mypy doesn't (`reveal_type` says `a` and `b` are both "T`-1"; not sure what that means).
Mypy's reveal_type() acts on the function before type variables are expanded, so it just says that they have the same type. The index is to avoid ambiguities in case there are somehow distinct type variables with the same name. (I believe type variables used for class definitions have positive indexes and those defined in function definitions have negative indexes.)
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
***Reply to Pradeep***
This is not a type error because `T = Union[int, str]` is a valid solution for `T`. (Mypy infers `T = object`, but it's the same principle.)
Woah - yeah, I find this super unintuitive. It sounds like Mypy and Pyre "solve for the most general possible type"? No, that's not right - the most general possible type would be `Any`. "They're willing to find solutions that are more general than any of the specific types involved"? Whereas pytype...gosh, I don't even know what the right jargon is. ***Rebecca***, what *is* pytype doing in an example like this, such that it just doesn't infer `T = Union[int, str]`? (Reproduced below for less scrolling)
```
def foo(xs: T, ys: T) -> T: ...

some_int: int
some_str: str
foo(some_int, some_str)
```
This is because Tensor is invariant whereas Tuple is covariant.
What's the connection between variance and what kind of type is inferred? In any case, I think defining variadic type variables as invariant is a reasonable default for this PEP.
I experimented with disallowing a variadic `Ts` from being inferred as having two different lengths and that seemed somewhat more intuitive than the above. Opinions appreciated.
Yeah, iiuc, Guido and I are both in agreement that all occurrences of `Ts` should have the same length within one signature.
If `Ts` is a type variable bound by `tuple`, then `Ts = Union[Tuple[int, str], Tuple[bool]]` is a valid assignment.
Oh, interesting. Well, even though this is implicitly settled by the decision to go for `TypeVarTuple`, to be explicit: I think this should be disallowed (at least in this current PEP).
On Mon, 25 Jan 2021 at 18:58, S Pradeep Kumar <gohanpra@gmail.com> wrote:
Can you clarify what "no concatenation of variadics" refers to? Does this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically about `Tuple[*Ts1, *Ts2]`? (And what about the same constructs inside `Callable[[<here>], R]`?)
Guido: I mean concatenation of multiple variadics (`Tuple[*Ts, *Ts2]`, same within Callable).
The first PEP will support concatenation with a non-variadic prefix and suffix (`Tuple[int, *Ts, str]`, same within Callable).
But I've been learning about the corresponding functionality in TypeScript, and it uses `...X` here, which is JavaScript's and TypeScript's spelling of `*X` in various places. I also find the `*` a useful hint for the user: When I see `Tuple[*X]` I immediately know that the length of the tuple is variadic. If I were to see `Tuple[Ts]` I'd have to remember that the naming convention tells me that the tuple is variadic. Readability Counts.
I strongly agree. `*Ts` is a clear visual analogy to tuple unpacking.
# Reference implementation and examples
I wonder if any of the PEP authors have worked on an implementation yet
Yes, I have been working on a reference implementation from scratch in Pyre.
I added support over the weekend for concatenation of multiple variadics since that's one of the main points of uncertainty in our discussion. Here are some examples that typecheck on my branch:
(1) Basic usage of non-variadic prefix and suffix
```
from typing import List, Tuple, TypeVar

Ts = TypeVar("Ts", bound=tuple)
Ts2 = TypeVar("Ts2", bound=tuple)

def strip_both_sides(x: Tuple[int, *Ts, str]) -> Ts: ...

def add_bool_int(x: Tuple[*Ts]) -> Tuple[bool, *Ts, int]: ...

def foo(xs: Tuple[int, bool, str]) -> None:
    z = add_bool_int(strip_both_sides(xs))

    # => Tuple[bool, bool, int]
    reveal_type(z)
```
(2) `partial` (using concatenation of multiple variadics):
```
from typing import Callable, Tuple, TypeVar

Ts = TypeVar("Ts", bound=tuple)
Ts2 = TypeVar("Ts2", bound=tuple)

def partial(f: Callable[[*Ts, *Ts2], bool], *args: *Ts) -> Callable[[*Ts2], bool]: ...

def foo(x: int, y: str, z: bool) -> bool: ...

def baz() -> None:
    expects_bool = partial(foo, 1, "hello")

    # Valid.
    expects_bool(True)

    # Error.
    expects_bool()

# Note that this `partial` doesn't support keyword arguments.
```
Other notable features:
+ Variadic classes like Tensor.
+ `*args: *Ts`
+ Concatenation within parameters of Tuples, Callables, and variadic classes.
+ Allowing Tuples to be arbitrarily unpacked: `*Tuple[int, *Ts]`
Other test cases: https://github.com/pradeep90/pyre-check/blob/master/source/analysis/test/int...
Once we settle the debate about TypeVar vs TypeVarTuple and other syntax, I can work on merging this branch into Pyre master and implementing some other details from the PEP (such as `Union[*Ts]`, generic aliases, Map with generic aliases, etc.).
# Key findings
(1) Prefix and suffix work as expected
Handling a suffix of non-variadic parameters `Tuple[*Ts, int]` is essentially the same work as handling a prefix `Tuple[int, *Ts]`.
@Eric Could you share an example of a non-variadic suffix that you felt would be hard to tackle?
I'm basically following the TypeScript algorithm for inferring one variadic tuple type against another ( https://github.com/microsoft/TypeScript/pull/39094).
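The symmetry between prefix and suffix handling can be seen in a toy matcher. This is a sketch, not Pyre's or TypeScript's algorithm: the helper name `match_variadic` is hypothetical, types are compared by plain equality with no subtyping, and only a single variadic in the middle is supported.

```python
from typing import Optional, Sequence, Tuple


def match_variadic(
    prefix: Sequence[type], suffix: Sequence[type], actual: Sequence[type]
) -> Optional[Tuple[type, ...]]:
    """Solve Ts in [*prefix, *Ts, *suffix] against `actual`; return None on mismatch."""
    n_pre, n_suf = len(prefix), len(suffix)
    if len(actual) < n_pre + n_suf:
        return None  # not enough elements to cover prefix and suffix
    split = len(actual) - n_suf
    head, tail = tuple(actual[:n_pre]), tuple(actual[split:])
    if head != tuple(prefix) or tail != tuple(suffix):
        return None
    # Whatever remains between the fixed ends is the binding for Ts.
    return tuple(actual[n_pre:split])


middle = match_variadic([int], [str], [int, bool, str])  # Ts binds to (bool,)
```

The suffix is stripped from the right in the same way the prefix is stripped from the left, which is why supporting one is about as much work as supporting the other in this simplified setting.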
(2) Concatenation of multiple variadics
Concatenation of multiple variadics works as expected when the length of one is unambiguously known.
However, concatenation of multiple variadics is indeed significantly complex and should definitely *not* be part of the initial variadics PEP.
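For the `partial` example, the unambiguous case can be sketched as follows: once `Ts` is fixed by the supplied positional arguments, `Ts2` is simply whatever is left of the parameter list. The helper name `remaining_params` is hypothetical, and the comparison is by plain equality, which is a simplification of what a real checker would do.

```python
from typing import Optional, Sequence, Tuple


def remaining_params(
    params: Sequence[type], supplied: Sequence[type]
) -> Optional[Tuple[type, ...]]:
    """Given Callable[[*Ts, *Ts2], R] and *args: *Ts, bind Ts to `supplied` and return Ts2."""
    n = len(supplied)
    if n > len(params) or tuple(params[:n]) != tuple(supplied):
        return None  # the supplied arguments do not match the leading parameters
    return tuple(params[n:])


# foo(x: int, y: str, z: bool) with partial(foo, 1, "hello")
ts2 = remaining_params((int, str, bool), (int, str))
```

Without a known length for one of the two variadics there is no single split point, which is one way to see why the general case is so much harder.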
(3) Unpack[Ts] / *Ts didn't pose a big problem
I preprocessed `*Ts` to be `Unpack[Ts]` so that I could uniformly deal with `Unpack`. Pyre uses its own parser, so that's one thing to note.
Also, the `*` syntax is going to be implemented as part of PEP 637, so I don't think the implementation should be a concern for us. Until it is implemented, we do have `Unpack`.
(4) Surprising behavior: variance of Tuple vs Tensor
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...

tuple_int_str: Tuple[int, str]
tuple_int_bool: Tuple[int, bool]
foo(tuple_int_str, tuple_int_bool)
```
We might expect a type error here because `Tuple[int, str] != Tuple[int, bool]`. However, the typechecker actually infers `Ts = Tuple[int, Union[str, bool]]`, which is a perfectly valid solution. This is sound but unintuitive.
This is analogous to what we do for the non-variadic case:
```
def foo(xs: T, ys: T) -> T: ...

some_int: int
some_str: str
foo(some_int, some_str)
```
This is not a type error because `T = Union[int, str]` is a valid solution for T. (Mypy infers `T = object`, but it's the same principle.)
Note, however, that we *do* raise an error with Tensor:
```
def bar(x: Tensor[*Ts], y: Tensor[*Ts]) -> Tensor[*Ts]: ...

xs: Tensor[int, str]
ys: Tensor[bool, bool]
# Error
bar(xs, ys)
```
This is because Tensor is invariant whereas Tuple is covariant.
(FWIW, TypeScript doesn't do the above for either the variadic or non-variadic case. It raises an error if T is passed different types of arguments.)
The Tuple covariance means we don't error in cases where we would expect to. Perhaps we can define variadic tuples as invariant by default.
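The covariant solution for the `Tuple` case can be reproduced at runtime with the `typing` constructors. This sketch only builds the widened type element by element; the helper name `widen_elementwise` is hypothetical and no claim is made that any checker computes it this way.

```python
from typing import Sequence, Tuple, Union


def widen_elementwise(first: Sequence[type], second: Sequence[type]):
    """Build Tuple[...] whose i-th element is the union of the i-th elements of both inputs."""
    if len(first) != len(second):
        raise TypeError("arity mismatch")
    # Union[X, X] collapses to X, so positions that agree stay unchanged.
    return Tuple[tuple(Union[a, b] for a, b in zip(first, second))]


widened = widen_elementwise((int, str), (int, bool))
```

For the example above this yields the same shape as the inferred `Ts = Tuple[int, Union[str, bool]]`. An invariant container such as `Tensor` has no such widened solution, which matches the error reported for `bar`.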
(5) What if there is an arity mismatch?
Consider the following case (the same case as Eric pointed out :) ).
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...

tuple_int_str: Tuple[int, str]
tuple_bool: Tuple[bool]
foo(tuple_int_str, tuple_bool)
```
We might expect this to error because the argument types have different lengths.
However, Ts = Union[Tuple[int, str], Tuple[bool]] is a valid solution, since `Tuple` is covariant.
`foo` gets treated as:
``` def foo(xs: Union[Tuple[int, str], Tuple[bool]], ys: Union[Tuple[int, str], Tuple[bool]]) -> Union[Tuple[int, str], Tuple[bool]]: ... ```
Users might be expecting this to error and might be taken aback, as I was when I tried it out.
I experimented with disallowing a variadic `Ts` from being inferred as having two different lengths and that seemed somewhat more intuitive than the above. Opinions appreciated.
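The stricter rule being experimented with can be written down as a toy binding function: every occurrence of `Ts` in a signature must bind to the same tuple of types, with the same length. The names `BindingError` and `bind_exact` are hypothetical, and equality of runtime classes stands in for whatever notion of type equality a checker would use.

```python
class BindingError(TypeError):
    """Raised when two occurrences of a variadic type variable disagree."""


def bind_exact(*occurrences):
    """Bind a variadic type variable; all occurrences must match in length and element types."""
    bound = tuple(occurrences[0])
    for occurrence in occurrences[1:]:
        occurrence = tuple(occurrence)
        if len(occurrence) != len(bound):
            raise BindingError(f"arity mismatch: {len(bound)} vs {len(occurrence)}")
        if occurrence != bound:
            raise BindingError("element types differ")
    return bound


ts = bind_exact((int, str), (int, str))  # both occurrences agree
```

Under this rule, `foo(tuple_int_str, tuple_bool)` from the example above would be rejected for its arity mismatch instead of being solved as a union of two tuple types.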
# Open questions
(1) What to do about `*Tuple[Any, ...]`?
During the last tensor meeting, we discussed allowing `Tensor[Any, ...]` (and the equivalent `Tensor`) in order to aid gradual typing.
Existing code annotated as `t: Tensor` would treat `Tensor` without parameters as `Tensor[Any, ...]`. That would be a Tensor with arbitrary rank and `Any` as the dimension type. This way, changing `class Tensor` to be a variadic wouldn't immediately break existing code.
I'm yet to implement this, so I'll look into how this affects type inference.
The same goes for `*Iterable[int]`, if indeed that is feasible.
(2) What to do about `*Union[...]`?
If `Ts` is a type variable bound by `tuple`, then `Ts = Union[Tuple[int, str], Tuple[bool]]` is a valid assignment. We then have to consider what unpacking that means.
TypeScript allows this:
When the type argument for T is a union type, the union is spread over the tuple type. For example, [A, ...T, B] instantiated with X | Y | Z as the type argument for T yields a union of instantiations of [A, ...T, B] with X, Y and Z as the type argument for T respectively.
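The TypeScript behaviour quoted above can be mimicked with Python's runtime `typing` objects. This is a sketch of the spreading rule only: the helper name `spread_union` is hypothetical, it assumes each union member is a non-empty fixed-length `Tuple[...]`, and it says nothing about how a Python type checker should treat `*Union[...]`.

```python
from typing import Tuple, Union, get_args, get_origin


def spread_union(prefix, ts, suffix):
    """Instantiate [*prefix, *Ts, *suffix] once per member of the union bound to Ts."""
    members = get_args(ts) if get_origin(ts) is Union else (ts,)
    return Union[
        tuple(Tuple[(*prefix, *get_args(member), *suffix)] for member in members)
    ]


# Ts = Union[Tuple[int, str], Tuple[bool]] placed between an int prefix and a float suffix
spread = spread_union((int,), Union[Tuple[int, str], Tuple[bool]], (float,))
```

The result is a union of two fixed-length tuple types, one per member of the original union, which is the Python analogue of the "union of instantiations" in the quoted text.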
******
I'll think about these questions a bit more over the next couple of weeks and update the PEP. We can discuss these in detail during the next tensor typing meeting.
Best,
On Mon, Jan 25, 2021 at 9:44 AM Eric Traut <eric@traut.com> wrote:
I agree that "variadic" is a term that casual Python coders may be unfamiliar with, but it's a pretty standard term used in other languages, as opposed to "covariant" and "contravariant", which I had never encountered prior to Python. I also don't think variadic type variables will be used by a typical Python coder. It's a pretty advanced feature. Most Python coders don't use simple (non-variadic) type variables today.
Based on your experiments in Pyright so far, how difficult would introducing the new grammar be?
Introducing the grammar change to allow the star operator within a subscript is easy, just a few dozen lines of new code. The difficult part is with all the error cases this introduces. The star operator is allowed only in the case of variadic type variables. All other uses of a star operator within a subscript are not allowed. A few of these cases can be detected and reported by the parser (e.g. when used in conjunction with slice expressions), but most require semantic information to detect, so the checks will need to be added in many places within the type checker — and presumably the runtime as well. When a new construct introduces many ways to produce new error conditions, my natural instinct is to look for a way to eliminate the possibility of those errors rather than trying to enumerate and plug each of them individually.
The star operator will also require changes beyond the parser and type checker. It will also require updates to completion suggestion and signature help logic.
This is all doable, but it adds significant work across many code bases and will result in many more bugs as we work out all of the kinks and edge cases. I'm not convinced that the readability benefits justify the added complexity. I think naming conventions could work fine here. After all, we've adopted naming conventions to designate covariant and contravariant type variables, and that seems to work fine.
I'm continuing to work on the implementation in pyright (currently on a private branch). Of course, none of this is set in stone — I'm just trying to inform the discussion. Once I get a critical mass of functionality working, I'll merge the changes and give you a chance to play with them in Pyright. I find that it helps to be able to write real code with real tooling when playing with new language constructs.
Here's what I have implemented so far:
* Support for "variadic=True" in TypeVar constructor.
* Support for a variadic TypeVar used at the end of a generic class declaration
* Support for subscripts within type expressions that contain an arbitrary number of type arguments and matching of those type arguments to type parameters when the last type parameter is a variadic
* Support for "()" (empty tuple) notation when used with variadic TypeVar
* Support for "*args: Ts" matching
* Support for zero-length matching
What I haven't done yet:
* Reporting error for bound, variance, or constraints used in conjunction with variadic TypeVar
* Reporting errors for situations where a variadic TypeVar is used in cases where it shouldn't be
* Reporting errors for situations where a variadic TypeVar is not used in cases where it is needed
* Detecting and reporting errors for variadic TypeVar when it's not at the end of a list of TypeVars in a generic class declaration
* Detecting and reporting errors for multiple variadic TypeVars appearing in a generic class declaration
* Support for Union[Ts]
* Support for Tuple[Ts]
* Support for Concatenate[x, y, Ts]
* Variadics in generic type aliases
* Support for open-ended (arbitrary-length) variadics
* Tests for all of the above
I've run across a few additional questions:
1. PEP 484 indicates that if a type argument is omitted from a generic type, that type argument is assumed to be `Any`. What is the assumption with a variadic TypeVar? Should it default to `()` (empty tuple)? If we support open-ended tuples, then we could also opt for `(Any, ...)`.
2. What is the type of `def foo(*args: Ts) -> Union[Ts]` if foo is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
3. When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
4. Along the lines of the previous question, consider the expression `foo((3, "hi"), ("hi", ))`. In this case, the lengths of the tuples don't match. If we don't support open-ended variadics, this needs to be an error. If we support open-ended variadics, we have the option of solving this as `Tuple[int | str, ...]` (or `Tuple[object, ...]`). Once again, it's easiest if we don't allow this and treat it as an error.
-- Eric Traut Contributor to Pyright and Pylance Microsoft Corp.
-- S Pradeep Kumar
On Mon, Jan 25, 2021 at 9:44 AM Eric Traut <eric@traut.com> wrote:
I've run across a few additional questions:
1. PEP 484 indicates that if a type argument is omitted from a generic type, that type argument is assumed to be `Any`. What is the assumption with a variadic TypeVar? Should it default to `()` (empty tuple)? If we support open-ended tuples, then we could also opt for `(Any, ...)`.
The default is actually as many copies of Any as are needed to make the type valid. So we should use (Any, …).
2. What is the type of `def foo(*args: Ts) -> Union[Ts]` if foo is called with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
The type of foo() would be NoReturn, since there is no valid value in an empty union, and NoReturn is how we spell the type with no values. It could also be an error. Any seems just wrong.
3. When the constraint solver is solving for a variadic type variable, does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
I'd prefer it if I could think of e.g. `def foo(a: Tuple[Ts], b: Tuple[Ts])` as a series of overloads including `def foo(a: Tuple[T1, T2], b: Tuple[T1, T2])`. That should answer the question, right? Ts stands for `(T1, T2, …, Tn)` for some n (we seem to have an issue about whether n can be zero). If different checkers produce different answers for the latter, e.g. due to different attitudes about unions, that's okay, but checkers should be consistent with themselves.
4. Along the lines of the previous question, consider the expression `foo((3, "hi"), ("hi", ))`. In this case, the lengths of the tuples don't match. If we don't support open-ended variadics, this needs to be an error. If we support open-ended variadics, we have the option of solving this as `Tuple[int | str, ...]` (or `Tuple[object, ...]`). Once again, it's easiest if we don't allow this and treat it as an error.
Interesting example. This makes me wonder if open-ended-ness should be an option to the TypeVar/TypeVarTuple definition? (Didn't we discuss that before?)
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Aargh, too many messages!
- Constructor argument naming: For experimental use (before the bikeshed color is settled) I'd actually prefer TypeVarTuple, because it's a special import, and if it can't be imported that's a cleaner signal than a complaint about a parameter name. Agreed that a unique name like variadic is helpful.
- Packed vs. Unpacked: Agreed that it's easier to experiment without the `*` syntax; and requiring `Unpack[Ts]` might be tedious for a prototype/experiment. But the parser work is not the reason to reject this. We should consider usability. Remember that PEP 637 will add support for `*` in all the right places anyway. (And certainly not in slices.) So the other work will (likely) have to happen anyway.
- PEP splitting: Yes. But eventually I want `Map[]` so that we can finally type zip() and map() properly -- this was the original impetus for variadics.
- Concatenation: Okay, supporting only `Tuple[t1, t2, Ts]` and not more complicated things like `Tuple[Ts1, Ts2]` or `Tuple[t1, Ts, t2]` is fine for the initial evaluation. (But thanks Pradeep for implementing it!)
- Concatenation in Callable: Okay, I can't think of any practical examples that aren't solved by ParamSpec either.
On Mon, Jan 25, 2021 at 8:46 AM Matthew Rahtz via Typing-sig <typing-sig@python.org> wrote:
*Jukka, Ivan, Rebecca: *You can skip to your names below.
***Constructor argument naming***
The thing I like about a mouthful like "variadic" is that if you encounter it and don't know what it means you know you need to go look it up.
Hmm, good point. Let's keep this argument in mind if we do decide to use `TypeVar` as the constructor (which I'm still not sure we should).
***Packed vs unpacked***
I'm a bit uncomfortable introducing a naming convention that makes significant changes to typing interpretation. That feels rather subtle.
If I were to see `Tuple[Ts]` I'd have to remember that the naming convention tells me that the tuple is variadic. Readability Counts. I think the grammar change is for the best. It values explicitness over magic.
Hearing this opinion from Guido in particular updates me significantly. Still, I find myself wondering whether the small improvement in readability (of something that's only likely to be used in library code and therefore not terribly user-facing) is worth the cost to updating the parsers of at least CPython, Mypy, pytype and Pyright.
The main crux for me here is the exact degree of difficulty. Eric, when you said it would be a 'heavy lift', were you thinking mainly because of the complications of concatenation and Map? Based on your experiments in Pyright so far, how difficult would introducing the new grammar be? (***Jukka, Ivan, Rebecca***, it would be super useful to hear your thoughts here too. In case a catch-up would be helpful: the question is, if we wanted to make it so that our new 'variadic type variable' was used as `Ts = TypeVarTuple('Ts'); class Foo(Generic[*Ts]): ...; foo: Foo[int, str] = Foo()`, how hard would that be, considering the new use of the star?)
***PEP splitting***
The latest draft of the PEP is: https://github.com/python/peps/pull/1781
I've split `Map` and fancier concatenation off into separate documents ( https://docs.google.com/document/d/1szTVcFyLznoDT7phtT-6Fpvp27XaBw9DmbTLHrB6... and https://docs.google.com/document/d/1sUBlow40J7UwdTSyRYAj34ozkGOlMEjPaVEWeOmM..., respectively, though haven't cleaned them up yet).
I've also tried tentatively rewriting it so that a `TypeVarTuple` behaves as if unpacked by default, eliminating the star. Everything does still seem to work, and admittedly the PEP seems much simpler.
***Concatenation***
Can you clarify what "no concatenation of variadics" refers to? Does this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically about `Tuple[*Ts1, *Ts2]`? (And what about the same constructs inside `Callable[[<here>], R]`?)
I like Eric's proposal of a) only prefixing is allowed, and b) allowing only a single variadic type variable. For `Callable`, I think we shouldn't allow concatenation at all, at least not in this PEP - a) because it's simpler, b) because if we did `Callable[[int, Ts], R]` then the first argument would have to be positional-only, and that feels like it's going to have complications I haven't thought through yet, and c) because I expect most use-cases would be covered by PEP 612. I've updated the draft correspondingly.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
I've done a first-cut implementation of PEP 646 in pyright 1.1.107. I just published this version, so you can try it by installing the Pyright extension in VS Code.
Specific notes about my current implementation:
* It supports `TypeVarTuple`, which is exported by the typing_extensions.pyi that ships with pyright.
* It supports `Unpack`, also exported by typing_extensions.pyi.
* It does not currently support `*` syntax, since that will be introduced with PEP 637 functionality.
* It does not allow packed usage of a TypeVarTuple. All uses of a TypeVarTuple must be contained within an `Unpack`, and errors are generated if they are not.
* If a TypeVarTuple appears within a subscript for a type annotation, it must be the last entry (i.e. no suffixes). The one exception is `Union`, which allows it to appear anywhere. I figured this was justified because the order of type arguments within a `Union` is not relevant.
* If a TypeVarTuple appears within a class declaration, only one is allowed, and it must be after all other type variables. The order can be forced by including an explicit `Generic` that defines the type parameter ordering.
* At most one TypeVarTuple can appear within a subscript when specializing a class (e.g. `Tuple[Ts1, Ts2]` is an error). The one exception is `Union`, which allows for multiple TypeVarTuples to appear. This creates an ambiguity for the constraint solver, but this ambiguity already exists for traditional TypeVars.
* Type aliases may contain at most one TypeVarTuple, and it must be after all other TypeVars that parameterize the type alias (e.g. `Alias1 = Union[List[T], Tuple[Unpack[Ts]]]` is allowed, but `Alias2 = Union[Tuple[Unpack[Ts]], List[T]]` is an error). Unlike with class declarations, there's no way to force the ordering of type parameters within a type alias, which is somewhat constraining.
* An attempt to assign an open-ended tuple to a TypeVarTuple during constraint solving results in an error.
* A Callable may include a TypeVarTuple within its parameter type list, but only one is allowed, and it must be in the last entry. Other cases are flagged as errors.
* If a TypeVarTuple appears more than once in a function signature, the tuples that are assigned to it must match in both length and in type. There is no attempt to widen the type to accommodate differences.
As I anticipated, this was a very large and complex feature to implement. If you're curious, here's the [commit](https://github.com/microsoft/pyright/commit/1d06018908819e17daa08328a64e6e1d...). I've implemented a bunch of test cases. Perhaps these will be of use for other type checker maintainers as they add support for this PEP. The test samples can be found [here](https://github.com/microsoft/pyright/blob/master/packages/pyright-internal/s...). (There are 8 test files and many dozens of test cases for this feature currently.) Feedback and bug reports are welcome.
-- Eric Traut Contributor to Pyright and Pylance Microsoft Corp.
Wow, Eric, that was fast! Thanks for your great work! :)

***PEP draft***: I've updated the current draft of the PEP at https://github.com/python/peps/pull/1781 to reflect the decisions we've made. I think it now more or less reflects the behaviour in Pyright's implementation (minus aliases, which I've yet to rewrite the PEP section for).

One small thing that's different from our discussion is the behaviour of a `Union` of an empty `TypeVarTuple`: I realised that assigning a type of `NoReturn` to such `Union`s would only make sense if the `Union` in question was in a return annotation, so I've stuck with saying that the type-checker should produce an error so that the behaviour is consistent between `Union` in returns and `Union` elsewhere.

***Reply to Guido***
No, that's not how type variables work at all!
Ahh, thanks for clarifying. This was pretty eye-opening. I've tried to make the expected behaviour for `TypeVarTuple` explicit in the current draft of the PEP by saying that we disallow `Tuple[Union[A, B]]`, and that types must match exactly. (I haven't mentioned the 'class hierarchies' case in the draft because we've defined `TypeVarTuple` as invariant for the time being.)
# Unbounded Tuples

I've added support for accepting `Ts` as an unbounded tuple `Tuple[Any, ...]` or `Tuple[int, ...]`. The type inference rules were similar to those in TypeScript (https://github.com/microsoft/TypeScript/pull/39094). This meant we could do:

```
def foo(xs: Tuple[*Ts]) -> Tuple[*Ts]: ...

def baz() -> None:
    unbounded_tuple: Tuple[int, ...]
    z = foo(unbounded_tuple)
    # => Tuple[int, ...]
    reveal_type(z)

def foo2(xs: Tuple[T, *Tuple[str, ...]]) -> T: ...

def baz2() -> None:
    some_tuple: Tuple[int, str, str]
    z = foo2(some_tuple)
    # => int
    reveal_type(z)
```

I'm ambivalent about allowing such explicit unpacking of `Tuple[int, ...]`. Given that we need it for arbitrary-rank `Tensor` anyway, it seems cleaner to allow it, but it may be confusing for users.

Another thing we may want to consider in a future PEP: setting a bound for `Ts`. For example, we may want Tensor parameters to be bound by int: i.e., allow `Tensor[Literal[480], Literal[360]]` but not `Tensor[str, str]`. This could be done by essentially setting `TypeVarTuple("Ts", bound=Tuple[int, ...])`. This might be useful for future type arithmetic, since we'd need to be able to safely say something like `def flatten(x: Tensor[*Ts]) -> Tensor[Product[Ts]]:`.

# Arbitrary-rank Tensors

For gradual typing, we'd need to allow `x: Tensor`. Some library functions may be using `Tensor` without parameters until they are migrated to variadics. Calling them should not raise errors.

So, I treated `Tensor` without parameters as `Tensor[*Tuple[Any, ...]]`. (As Guido pointed out, `Tensor[Any, ...]` is not valid syntax.)

Gradual typing has two main requirements:

(a) `Tensor[int, str]` should be compatible with `Tensor`

```
def expects_arbitrary_tensor(x: Tensor) -> Tensor: ...

def bar() -> None:
    tensor: Tensor[int, str]
    y = expects_arbitrary_tensor(tensor)
    reveal_type(y)
```

(b) `Tensor` should be compatible with a concrete `Tensor[int, str]`

```python
def expects_concrete_tensor(x: Tensor[int, str]) -> Tensor[int, str]: ...

def bar() -> None:
    tensor: Tensor
    expects_concrete_tensor(tensor)
```

(This is analogous to `List[Any]` being compatible with `List[int]` and vice versa.)

By default, both raised an error because Tensor is invariant. That is, we had to check that its parameters were compatible in both directions: (a) `[int, str]` is compatible with `[*Tuple[Any, ...]]` and (b) `[*Tuple[Any, ...]]` is compatible with `[int, str]`.

To be explicit, (b) is equivalent to checking that `Tuple[Any, ...]` is compatible with `Tuple[int, str]`. That is a problem because we don't generally consider `Tuple[Any, ...]` to be compatible with `Tuple[int, str]`. For example, Mypy raises an error:

```python
from typing import Any, Tuple

def expects_concrete_tuple(x: Tuple[int, str]) -> None: ...

def bar() -> None:
    unbounded_tuple: Tuple[Any, ...]
    # main.py:9: error: Argument 1 to "expects_concrete_tuple" has incompatible type "Tuple[Any, ...]"; expected "Tuple[int, str]"
    y = expects_concrete_tuple(unbounded_tuple)
    reveal_type(y)
```

To work around this, we could either

(i) allow `Tuple[Any, ...]` in general to be compatible with `Tuple[int, str]`, or
(ii) special-case variadic classes like Tensor so that `Tensor` is compatible with `Tensor[int, str]` and vice versa.

Both are unsound. The tuple or tensor we pass in may have zero elements and may thus cause a runtime error. Or an element may be of a type that can't be used as an `int` or `str`, which is again a runtime error.

However, option (ii) is less invasive, so I went with it. The Tensor examples typechecked fine. Let me know if anyone has strong opinions about option (i).

(Test cases: https://github.com/pradeep90/pyre-check/blob/master/source/analysis/test/int...
)

I'll add these points to the PEP. I'll work on merging my changes into Pyre master, but this might take a few weeks because I'll have to replace the existing ListVariadic implementation.
-- S Pradeep Kumar
I just published a new version of Pyright that implements full support for [PEP 637](https://www.python.org/dev/peps/pep-0637/) (indexing with keywords and unpack operators). This, in combination with the PEP 646 support, allows you to use the nicer syntax for variadic type variable unpacking (`*Ts` versus `Unpack[Ts]`). Enjoy! -- Eric Traut Contributor to Pyright and Pylance Microsoft Corp.
***Reply to Eric***

Basically, **big grin** :)

***Unbounded tuples***

Pradeep, do you think we should include support for unbounded tuples in the PEP? I'd prefer to hold off - partly because I'd prefer not to clutter the section on `Union` with the extra rules that unbounded tuples imply - but mainly because - well, actually, this is something I haven't expounded on properly yet: I think it could be really intuitive if an ellipsis could eventually be used to represent the unknown parts of shapes, as in this doc https://docs.google.com/document/d/16r14MCVtd46whXwS4SdeiycIgIs0Dxvj8ePqH-Sp... which I wrote as a summary of the discussion in one of the tensor typing meetings. This is incompatible with what the ellipsis means in `Tuple` currently, and with what we would be committing ourselves to if we did support unbounded tuples in this PEP.

Basically, I want to postpone discussion of that bag of worms until this PEP is done. Is that alright, or do you think there are strong reasons to include it?

***Tensor[Any, ...]***

Ah, I'd missed Guido's comment that this is not valid syntax. Damn.

`Tensor[*Tuple[Any, ...]]` is certainly one option, but for the reasons in the previous section I don't like that it forces us into allowing unboundedness.

Could we perhaps just say that an unparameterized `Tensor` **behaves** as if it were `Tensor[Any, ...]` despite this not being valid syntax? How ugly a special-case would it be?

***Arbitrary-rank tensors***

Oh, man, super well-caught! You're right, committing to invariance by default does put us in a tricky situation.

But then - trying this with a regular `TypeVar`, mypy seems to be happy with the following:

```
from typing import Generic, TypeVar

T = TypeVar('T')

class Tensor(Generic[T]): pass

def expects_arbitrary_tensor(x: Tensor): pass

def expects_concrete_tensor(x: Tensor[int]): pass

x: Tensor = Tensor()
expects_concrete_tensor(x)

y: Tensor[int] = Tensor()
expects_arbitrary_tensor(y)
```

Any idea why that works?
Both are unsound.
I'd actually be much more strongly in favour of the option where `Tuple[Any, ...]` is compatible with `Tuple[int, str]`. `TypeVar` is invariant by default too, but in order to support gradual typing doesn't it **have** to behave such that `Tuple[int]` is compatible with `Tuple[Any]`?

Could you expand on why that first option is unsound?
On Tue, Feb 2, 2021 at 1:08 PM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
***Tensor[Any, ...]***
Ah, I'd missed Guido's comment that this is not valid syntax. Damn.
There's no need for profanity. :-) It's valid syntax to *Python* (which parses it no different than `Tuple[int, ...]`), but it's not given a valid meaning in PEP 484 or any of the follow-up PEPs, so mypy and pyright (AFAIK) don't allow it.
`Tensor[*Tuple[Any, ...]]` is certainly one option, but for the reasons in the previous section I don't like that it forces us into allowing unboundedness.
Could we perhaps just say that an unparameterized `Tensor` **behaves** as if it were `Tensor[Any, ...]` despite this not being valid syntax? How ugly a special-case would it be?
I'm fine with that.
***Arbitrary-rank tensors***
Oh, man, super well-caught! You're right, committing to invariance by default does put us in a tricky situation.
But then - trying this with a regular `TypeVar`, mypy seems to be happy with the following:
```
from typing import Generic, TypeVar

T = TypeVar('T')

class Tensor(Generic[T]): pass

def expects_arbitrary_tensor(x: Tensor): pass

def expects_concrete_tensor(x: Tensor[int]): pass

x: Tensor = Tensor()
expects_concrete_tensor(x)

y: Tensor[int] = Tensor()
expects_arbitrary_tensor(y)
```
Any idea why that works?
Because Any is special. It acts as if it is both a superclass and a subclass of every type, so whenever types are compared, Any is *always* allowed. It is defined this way in PEP 484. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
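The point about `Any` can be illustrated with an ordinary invariant generic. The following is a minimal sketch with hypothetical function names (`total`, `count`), not code from the thread: `List` is invariant, yet both calls are expected to pass a PEP 484 checker because `Any` is compatible with every type in both directions.

```python
from typing import Any, List


def total(xs: List[int]) -> int:
    # Declared to need a list of ints.
    return sum(xs)


def count(xs: List[Any]) -> int:
    # Declared to accept a list of anything.
    return len(xs)


loose: List[Any] = [1, 2, 3]
strict: List[int] = [4, 5]

# List[Any] is passed where List[int] is expected, and vice versa.
a = total(loose)
b = count(strict)
```

The same reasoning is why a bare `Tensor` (read as `Tensor[Any]`) and `Tensor[int]` are interchangeable in the single-`TypeVar` example quoted above.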
On Tue, Feb 2, 2021 at 1:08 PM Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
***Unbounded tuples***
Pradeep, do you think we should include support for unbounded tuples in the PEP? I'd prefer to hold off
I'm ok with just allowing `Tensor` to be the only unbounded variadic allowed. It would be implicitly treated as `Tensor[*Tuple[Any, ...]]`. So, we wouldn't allow explicitly using `Tensor[*Tuple[int, ...]]`.
***Arbitrary-rank tensors***
Oh, man, super well-caught! You're right, committing to invariance by default does put us in a tricky situation.
But then - trying this with a regular `TypeVar`, mypy seems to be happy with the following:
```
from typing import Generic, TypeVar

T = TypeVar('T')

class Tensor(Generic[T]): pass

def expects_arbitrary_tensor(x: Tensor): pass

def expects_concrete_tensor(x: Tensor[int]): pass

x: Tensor = Tensor()
expects_concrete_tensor(x)

y: Tensor[int] = Tensor()
expects_arbitrary_tensor(y)
```
Any idea why that works?
A non-variadic generic class `Foo` without parameters resolves to `Foo[Any]`. As I'd mentioned, we consider `List[Any]` to be compatible with `List[int]` and vice versa, despite invariance.

To work around this, we could either (i) allow `Tuple[Any, ...]` in general to be compatible with `Tuple[int, str]`, or (ii) special-case variadic classes like Tensor so that `Tensor` is compatible with `Tensor[int, str]` and vice versa.
Both are unsound.
I'd actually be much more strongly in favour of the option where `Tuple[Any, ...]` is compatible with `Tuple[int, str]`. `TypeVar` is invariant by default too, but in order to support gradual typing doesn't it **have** to behave such that `Tuple[int]` is compatible with `Tuple[Any]`?
Could you expand on why that first option is unsound?
Both are unsound. The tuple or tensor we pass in may have zero elements and may thus cause a runtime error. Or its element may be a type that can't be used as an `int` or `str`, which is again a runtime error.
Both are unsound for the same reasons. As I'd mentioned, we might pass an empty tuple to something that expects `Tuple[int, str]`, which would be a runtime error. For example, `x: Tuple[Any, ...] = (); foo(x)` where `def foo(x: Tuple[int, str]) -> None: x[0] + 1`. Or it might be a tuple with a non-int class as the dimension, which again would be a runtime error if used as an `int` or a `str`. For example, `x: Tuple[Any, ...] = ("hello",); foo(x)`. The first option is not backward-compatible because we would have to change existing errors about `Tuple[Any, ...]` not being compatible with `Tuple[int, str]`. But, yeah, I'd appreciate opinions on the above choice. -- S Pradeep Kumar
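The two failure modes described above can be reproduced at runtime. This is a minimal sketch built from the inline snippets in the message; the helper names `first_plus_one` and `call_unchecked` are hypothetical, and the `# type: ignore` stands in for a checker that accepts the call under option (i).

```python
from typing import Any, Tuple


def first_plus_one(x: Tuple[int, str]) -> int:
    # Relies on the annotation: a first element that supports "+ 1".
    return x[0] + 1


def call_unchecked(x: Tuple[Any, ...]) -> str:
    # A checker that treats Tuple[Any, ...] as compatible with
    # Tuple[int, str] would let this call through without complaint.
    try:
        first_plus_one(x)  # type: ignore
    except (IndexError, TypeError) as exc:
        return type(exc).__name__
    return "ok"


empty_result = call_unchecked(())                # "IndexError": no element 0
wrong_type_result = call_unchecked(("hello",))   # "TypeError": str + int
fine_result = call_unchecked((1, "a"))           # "ok"
```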
Ah, sorry, still getting my head around what 'soundness' means.

Isn't the first option unsound in the same way that normal type variables are unsound, though? E.g. one could do the following:

```
from typing import Generic, TypeVar

T = TypeVar('T')

class Tensor(Generic[T]):
    def __getitem__(self, value):
        return 0  # Dummy value

def foo(x: Tensor):
    return x[0]

x: Tensor[()] = Tensor()
foo(x)
```

Mypy is fine with this, even though it wouldn't work at runtime if `Tensor` were given a complete implementation (since the argument `x` is zero-rank, so we shouldn't be able to index it).

Option one would be unsound, but at least it would be unsound in a consistent way.
I consider it a bug in mypy that it accepts `Tensor[()]`. Pyright does generate an error for this case. As you said, this expression causes a runtime exception. Interestingly, mypy does emit an error if you try to pass other illegal values as a type argument (like `Tensor[0]`). I don't think we should promote the use of "bare" generic types — those with no type arguments. That's typically indicative of an error on the programmer's part. Pyright accepts them, but it flags it as an error when "strict" mode is enabled. This check has been really useful in helping developers to fix problems in their code, and I don't want to water it down. So I'm not in favor of saying that `Tensor` is the only legitimate way to specify "a `Tensor` whose type arguments can be anything". If we think that concept is needed, then we should support `Tensor[Any, ...]`. As I said, adding support for open-ended tuples will add yet more complexity to the specification and implementation of this PEP, but maybe it's required to meet all of the intended use cases. -- Eric Traut Contributor to Pyright and Pylance Microsoft Corp.
On Wed, Feb 3, 2021 at 11:15 AM Eric Traut <eric@traut.com> wrote:
I consider it a bug in mypy that it accepts `Tensor[()]`. Pyright does generate an error for this case. As you said, this expression causes a runtime exception. Interestingly, mypy does emit an error if you try to pass other illegal values as a type argument (like `Tensor[0]`).
I think it's because mypy internally represents "parameters absent" as an empty tuple of parameters.
I don't think we should promote the use of "bare" generic types — those with no type arguments. That's typically indicative of an error on the programmer's part. Pyright accepts them, but it flags it as an error when "strict" mode is enabled. This check has been really useful in helping developers to fix problems in their code, and I don't want to water it down. So I'm not in favor of saying that `Tensor` is the only legitimate way to specify "a `Tensor` whose type arguments can be anything". If we think that concept is needed, then we should support `Tensor[Any, ...]`. As I said, adding support for open-ended tuples will add yet more complexity to the specification and implementation of this PEP, but maybe it's required to meet all of the intended use cases.
Agreed. I think they are mostly a legacy feature. -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
On Wed, Feb 3, 2021 at 12:41 PM Guido van Rossum <guido@python.org> wrote:
Agreed. I think they are mostly a legacy feature.
I just realized there's a situation where bare generic types are important: when evolving a codebase, making certain types that weren't generic before generic. A real-world example is the stdlib Queue type -- this started its life as a non-generic class (in typeshed) and at some point it was made generic. During the time it was non-generic, user code was annotated with things like `(q: Queue)`, and it would be a problem to fault all that code immediately as being non-compliant. So the interpretation of this as `Queue[Any]` makes sense. (Of course a flag exists to make this an error, for users who want to update all their code to declare the item type of their queues.)

So it's not that bare generics themselves are a legacy feature -- but the feature is primarily important for situations where legacy code is being checked.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
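A small sketch of the `Queue` situation described above. The function names are hypothetical, and the parameterized annotations are written as strings so that the snippet does not depend on `queue.Queue` being subscriptable at runtime. Code written against the bare type keeps working once the class is generic, because the bare annotation is read as `Queue[Any]`.

```python
from queue import Queue
from typing import List


def drain_legacy(q: Queue) -> list:
    # Written before Queue was generic; the bare annotation is
    # interpreted as Queue[Any], so this signature stays valid.
    items = []
    while not q.empty():
        items.append(q.get())
    return items


def drain_ints(q: "Queue[int]") -> List[int]:
    # Updated code that declares the item type.
    items: List[int] = []
    while not q.empty():
        items.append(q.get())
    return items


typed_q: "Queue[int]" = Queue()
for n in (1, 2, 3):
    typed_q.put(n)

# A Queue[int] is accepted where the legacy bare Queue is expected.
legacy_items = drain_legacy(typed_q)
```

A strict-mode flag, as mentioned in the thread, would report `drain_legacy`'s bare `Queue` so that authors can migrate it at their own pace.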
I just realized there's a situation where bare generic types are important: when evolving a codebase, making certain types that weren't generic before generic.
Yeah, this is exactly the situation that I imagine we'll be in with array types: *hopefully* (at least in my personal ideal world) we'd be able to persuade library authors to make the existing types like `tf.Tensor`, `np.ndarray`, etc. generic in shape.
So I'm not in favor of saying that `Tensor` is the only legitimate way to specify "a `Tensor` whose type arguments can be anything". If we think that concept is needed, then we should support `Tensor[Any, ...]`.
The idea I had in mind worked the opposite way around: it's less that `Tensor` should be the only legitimate way of specifying arbitrary parameters, and more that arbitrary type parameters should be what `Tensor` means, in order to support gradual typing if `Tensor` becomes generic.

For cases where the user wants to deliberately specify "a Tensor of any shape", I'm imagining they'd do something like:

```
Shape = TypeVarTuple('Shape')

def pointwise_multiply(x: Tensor[*Shape], y: Tensor[*Shape]) -> Tensor[*Shape]: ...
```

This would specify that `x` and `y` can be an arbitrary shape, but they should be the same shape, and that shape would also be the shape of the returned `Tensor`.

(If the user instead wanted to say that `x` and `y` could be arbitrary, different shapes, they would use different `TypeVarTuple` instances. This would, of course, potentially rely on the type-checker being ok with only a single usage of a `TypeVarTuple` in a function signature. From a quick skim of the analogous discussion about `TypeVar` in typing-sig a few months ago, I get the impression the jury is still out on whether that should be an error. But luckily, I think this use case is likely to be rare enough that we can ignore that issue for now. I can't think of any specific functions off the top of my head which would take arbitrary arrays of different shapes.)
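For readers who want to run the `pointwise_multiply` idea, here is a minimal sketch under stated assumptions: `Tensor` is a hypothetical stand-in class holding a runtime shape and flat data, not any real library type; the `Unpack[Shape]` spelling is used instead of `*Shape` so the snippet parses on older interpreters; and the `typing_extensions` fallback assumes that package is installed before Python 3.11. The runtime shape check mirrors what the shared `Shape` variable expresses statically.

```python
from typing import Generic, List, Tuple

try:  # Python 3.11+ ships these names in typing
    from typing import TypeVarTuple, Unpack
except ImportError:  # assumption: typing_extensions is available on older versions
    from typing_extensions import TypeVarTuple, Unpack

Shape = TypeVarTuple("Shape")


class Tensor(Generic[Unpack[Shape]]):
    """Hypothetical stand-in: a runtime shape plus flat data."""

    def __init__(self, shape: Tuple[int, ...], data: List[float]) -> None:
        self.shape = shape
        self.data = data


def pointwise_multiply(
    x: Tensor[Unpack[Shape]], y: Tensor[Unpack[Shape]]
) -> Tensor[Unpack[Shape]]:
    # The same Shape appears in both parameters and the result, so a
    # checker would require matching shapes; here it is also checked
    # at runtime.
    if x.shape != y.shape:
        raise ValueError("shapes must match")
    return Tensor(x.shape, [a * b for a, b in zip(x.data, y.data)])


product = pointwise_multiply(Tensor((2,), [1.0, 2.0]), Tensor((2,), [3.0, 4.0]))
```

Using two distinct `TypeVarTuple` instances for `x` and `y` would drop the same-shape requirement, as the parenthetical above notes.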
I consider it a bug in mypy that it accepts `Tensor[()]`.
Interesting - do you say this mainly on the basis that since it causes a runtime error, it should also be a type error? To me, it feels more intuitive that this shouldn't be an error, based on the argument that `Tensor` behaves like `Tensor[Any]`, and (without thinking about it too hard - i.e. not thinking about how the variance should work) `Tensor[Any]` seems like it should be compatible with `Tensor[()]`.
Interestingly, mypy does emit an error if you try to pass other illegal values as a type argument (like `Tensor[0]`).
I don't think we should promote the use of "bare" generic types — those with no type arguments. That's typically indicative of an error on the
I was surprised by this, so I did a bit more experimenting. Mypy does indeed error for me too on `Tensor[0]`, but that was only because of the lack of `Literal`. The following two examples *do* type-check fine for me with Mypy: ``` x: Tensor[Literal[0]] = Tensor() foo(x) x: Tensor[int] = Tensor() foo(x) ``` Having said all that - programmer's part. Pyright accepts them, but it flags it as an error when "strict" mode is enabled. This makes a lot of sense to me and I definitely think that such a flag should continue to exist and be prominently advertised. With the case of ` Tensor`, too, it would be super helpful to have the type-checker error with "Hey, you haven't specified what shape this `Tensor` should be!" On Thu, 4 Feb 2021 at 18:15, Guido van Rossum <guido@python.org> wrote:
On Wed, Feb 3, 2021 at 12:41 PM Guido van Rossum <guido@python.org> wrote:
On Wed, Feb 3, 2021 at 11:15 AM Eric Traut <eric@traut.com> wrote:
I don't think we should promote the use of "bare" generic types — those with no type arguments. That's typically indicative of an error on the programmer's part. Pyright accepts them, but it flags it as an error when "strict" mode is enabled. This check has been really useful in helping developers to fix problems in their code, and I don't want to water it down. So I'm not in favor of saying that `Tensor` is the only legitimate way to specify "a `Tensor` whose type arguments can be anything". If we think that concept is needed, then we should support `Tensor[Any, ...]`. As I said, adding support for open-ended tuples will add yet more complexity to the specification and implementation of this PEP, but maybe it's required to meet all of the intended use cases.
Agreed. I think they are mostly a legacy feature.
I just realized there's a situation where bare generic types are important: when evolving a codebase, making certain types that weren't generic before generic. A real-world example is the stdlib Queue type -- this started its life as a non-generic class (in typeshed) and at some point it was made generic. During the time it was non-generic, user code was annotated with things like `(q: Queue)`, and it would be a problem to fault all that code immediately as being non-compliant. So the interpretation of this as `Queue[Any]` makes sense. (Of course a flag exists to make this an error, for users who want to update all their code to declare the item type of their queues.)
So it's not that bare generics themselves are a legacy feature -- but the feature is primarily important for situations where legacy code is being checked.
--Guido van Rossum (python.org/~guido)
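As a runnable illustration of the two ideas in the message above (a `TypeVarTuple` that ties argument shapes together, and a bare `Tensor` annotation that keeps accepting parameterised tensors), here is a minimal sketch. It assumes Python 3.11+, where PEP 646 shipped in `typing`, and uses `Unpack[Shape]`, the backwards-compatible spelling of `*Shape`. The `Tensor` class, the `Height`/`Width` `NewType`s and `legacy_sum` are hypothetical stand-ins, not any real library's API.

```python
from typing import Generic, NewType

try:
    from typing import TypeVarTuple, Unpack  # available from Python 3.11
except ImportError:  # assumption: typing_extensions is installed on older versions
    from typing_extensions import TypeVarTuple, Unpack

Shape = TypeVarTuple("Shape")

# Hypothetical dimension names; the thread does not say how Height/Width are defined.
Height = NewType("Height", int)
Width = NewType("Width", int)


class Tensor(Generic[Unpack[Shape]]):
    """Toy shape-generic container (illustrative only)."""

    def __init__(self, data: list) -> None:
        self.data = data


def pointwise_multiply(
    x: Tensor[Unpack[Shape]], y: Tensor[Unpack[Shape]]
) -> Tensor[Unpack[Shape]]:
    # The same Shape appears three times, so a checker can require that
    # x, y and the result all share one shape.
    return Tensor([a * b for a, b in zip(x.data, y.data)])


def legacy_sum(t: Tensor) -> int:
    # Bare `Tensor`: under the gradual-typing reading discussed above,
    # this means "a Tensor with arbitrary type parameters".
    return sum(t.data)


a: Tensor[Height, Width] = Tensor([1, 2, 3])
b: Tensor[Height, Width] = Tensor([4, 5, 6])

product = pointwise_multiply(a, b)
total = legacy_sum(product)  # parameterised tensor passed to a bare-annotated function
```

None of the shape information is enforced at runtime here; the annotations only matter to a static checker that implements PEP 646.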
I did some more thinking about the `Tensor[Any, ...]` issue.
To recap, we said in the tensor typing meeting that although the following should clearly work fine...
```
def foo(x: Tensor): ...

x: Tensor[Height, Width]
foo(x)
```
...it's unclear whether the following should work...
```
def bar(y: Tensor[Height, Width]): ...

y: Tensor
bar(y)
```
...because of the fact that the following example does *not* work in Mypy, Pyre or Pyright:
```
def baz(z: Tuple[int, str]): ...

z: Tuple
baz(z)
```
The actual test code I'm using is:
```
from typing import Tuple

def get_tuple() -> Tuple:
    return ('dummy_value',)

def baz(z: Tuple[int, str]): ...

z = get_tuple()
reveal_type(z)
baz(z)
```
Mypy says:
```
foo.py:9: note: Revealed type is 'builtins.tuple[Any]'
foo.py:10: error: Argument 1 to "baz" has incompatible type "Tuple[Any, ...]"; expected "Tuple[int, str]"
```
Pyre says:
```
foo.py:9:0 Revealed type [-1]: Revealed type for `z` is `typing.Tuple[typing.Any, ...]`.
foo.py:10:4 Incompatible parameter type [6]: Expected `Tuple[int, str]` for 1st positional only parameter to call `baz` but got `typing.Tuple[typing.Any, ...]`.
```
Pyright says:
```
9:13 - info: Type of "z" is "Tuple[Unknown, ...]"
10:5 - error: Argument of type "Tuple[Unknown, ...]" cannot be assigned to parameter "z" of type "Tuple[int, str]" in function "baz"
  Tuple size mismatch; expected 2 but received indeterminate number (reportGeneralTypeIssues)
```
(Pytype is fine with it, but not sure whether that's intentional.)
That seemed pretty conclusive, so based on Pyright's very helpful error message, I tried writing this up in the PEP, and my understanding of the rule is:
* If an *arbitrary* number of type parameters are expected, then it *is* valid to pass a *specific* number of type parameters.
* If a *specific* number of type parameters are expected, then it is *not* valid to pass an *arbitrary* number of type parameters.
But that seems like kind of an ad-hoc rule to me, and left me feeling kind of dissatisfied. As far as I can tell, they're both unsound, so why should it work one way, but not the other? To see if there might be any wiggle room, I wondered what would happen in this analogous situation:
```
from typing import Tuple

def get_tuple() -> Tuple:
    return ('dummy_value',)

def baz(arg1, arg2): ...

z = get_tuple()
reveal_type(z)
baz(*z)
```
Mypy, Pyre and Pyright are all happy with this example! Hah!
So basically, I want to argue that, given the inconsistency between this example and the type parameter example, we should fix the inconsistency and make both cases behave in the same way, and I think it makes more sense to standardise on it working 'both ways' (`Tuple` is valid for `Tuple[int, str]`, and `Tuple[int, str]` is valid for `Tuple`).
Pradeep, Eric, what do you think?
On Sun, 7 Feb 2021 at 12:11, Matthew Rahtz <mrahtz@google.com> wrote:
I just realized there's a situation where bare generic types are important: when evolving a codebase, making certain types that weren't generic before generic.
Yeah, this is exactly the situation that I imagine we'll be in with array types: *hopefully* (at least in my personal ideal world) we'd be able to persuade library authors to make the existing types like `tf.Tensor`, ` np.ndarray` etc generic in shape.
So I'm not in favor of saying that `Tensor` is the only legitimate way to specify "a `Tensor` whose type arguments can be anything". If we think that concept is needed, then we should support `Tensor[Any, ...]`.
I think that in addition to theoretical purity we need to keep usability in mind.
The original “asymmetric” tuple rules are designed to catch the most errors with the fewest false positives. It just doesn’t happen very often that you have a tuple of unknown size (to the checker) where the programmer knows it’s got a specific size and they want to pass it as such. If I were in that position I’d expect the need for a cast.
OTOH a function that says it takes a tuple of any size should be happy with a tuple of a given size. (Assuming the element types match!)
Doesn’t the same reasoning apply to tensor shape?
For the `*args` example the usability equation is just a bit different.
IOW, the saying about consistency and hobgoblins applies here.
--Guido
On Sun, Feb 14, 2021 at 05:31 Matthew Rahtz via Typing-sig <typing-sig@python.org> wrote:
I did some more thinking about the `Tensor[Any, ...]` issue.
-- --Guido (mobile)
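To make the asymmetric rule and the suggested cast concrete, here is a small sketch. The function names `takes_any_size` and `takes_pair` are made up for illustration, and the comments about checker behaviour only restate the Mypy, Pyre and Pyright outputs quoted in this thread; they are not a claim about every checker or version.

```python
from typing import Tuple, cast


def get_tuple() -> Tuple:
    # Bare Tuple: the checker does not know how many elements there are.
    return (1, "a")


def takes_any_size(t: Tuple) -> int:
    # Arbitrary size expected: a tuple of a specific size is fine here.
    return len(t)


def takes_pair(z: Tuple[int, str]) -> str:
    # Specific size expected.
    number, text = z
    return f"{number}:{text}"


pair: Tuple[int, str] = (42, "abc")
size = takes_any_size(pair)  # specific -> arbitrary: accepted

unknown = get_tuple()
# takes_pair(unknown)  # arbitrary -> specific: reported as an error in the outputs above
result = takes_pair(cast(Tuple[int, str], unknown))  # the programmer vouches for the size
```

The cast has no runtime effect; it records at the call site that the programmer, not the checker, is responsible for the tuple having two elements of the right types.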
Hah, it just so happens I was skimming PEP 8 today and noticed the section on hobgoblins :)
Whether the same reasoning applies to tensor shapes I think depends on which parts of code we expect to get annotated with shapes first:
* Case 1: If we expect library code to get annotated with specific shapes first, it'll break any old user code that tries to pass unparameterised tensors.
* Case 2: On the other hand, if we expect user code to get annotated first, it'll be fine, because parameterised tensors can be passed to unparameterised library functions no problem.
Admittedly, I do think case 2 is much more likely, but would be keen to hear from others if I'm missing something here.
Also, though, if you're saying the rules for tuple were made that way deliberately, that updates me significantly towards thinking we should just respect those rules, advocating the use of a cast when otherwise necessary.
Also *@Eric* and *@Pradeep* - I've pushed an initial version of support for PEP 646 to typing.py at https://github.com/python/cpython/pull/24527. Let me know if this looks sane.
On Sun, 14 Feb 2021 at 17:00, Guido van Rossum <guido@python.org> wrote:
I think that in addition to theoretical purity we need to keep usability in mind.
On Sun, Feb 14, 2021 at 9:36 AM Matthew Rahtz <mrahtz@google.com> wrote:
Also *@Eric* and *@Pradeep* - I've pushed an initial version of support for PEP 646 to typing.py at https://github.com/python/cpython/pull/24527. Let me know if this looks sane.
Thanks for working on this! Added comments on the PR.
On Sun, 14 Feb 2021 at 17:00, Guido van Rossum <guido@python.org> wrote:
I think that in addition to theoretical purity we need to keep usability in mind.
The original “asymmetric” tuple rules are designed to catch the most errors with the fewest false positives. It just doesn’t happen very often that you have a tuple of unknown size (to the checker) where the programmer knows it’s got a specific size and they want to pass it as such. If I were in that position I’d expect the need for a cast.
OTOH a function that says it takes a tuple of any size should be happy with a tuple of a given size. (Assuming the element types match!)
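A minimal sketch of the cast described here, borrowing the `get_tuple` and `baz` names from Matthew's test code in this thread (the return value and the body of `baz` are invented for illustration):

```python
from typing import Tuple, cast


def get_tuple() -> Tuple:
    # Bare `Tuple` means `Tuple[Any, ...]`: the checker does not know the length.
    return (1, "one")


def baz(z: Tuple[int, str]) -> str:
    return f"{z[0]}: {z[1]}"


z = get_tuple()
# Passing `z` directly is what the checkers reject (unknown size passed as a fixed size).
# The programmer knows the size here, so they state it explicitly:
print(baz(cast(Tuple[int, str], z)))
```

`cast` does nothing at runtime; it only records the programmer's claim for the checker, which is why it fits the rare case where the size is known to the author but not to the tool.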
Doesn’t the same reasoning apply to tensor shape?
For the *args example the usability equation is just a bit different.
IOW, the saying about consistency and hobgoblins applies here.
—Guido
On Sun, Feb 14, 2021 at 05:31 Matthew Rahtz via Typing-sig < typing-sig@python.org> wrote:
I did some more thinking about the `Tensor[Any, ...]` issue.
To recap, we said in the tensor typing meeting that although the following should clearly work fine...
```
def foo(x: Tensor): ...

x: Tensor[Height, Width]
foo(x)
```

...it's unclear whether the following should work...

```
def bar(y: Tensor[Height, Width]): ...

y: Tensor
bar(y)
```

...because of the fact that the following example does *not* work in Mypy, Pyre or Pyright:

```
def baz(z: Tuple[int, str]): ...

z: Tuple
baz(z)
```
The actual test code I'm using is:
```
from typing import Tuple

def get_tuple() -> Tuple:
    return ('dummy_value',)

def baz(z: Tuple[int, str]): ...

z = get_tuple()
reveal_type(z)
baz(z)
```

Mypy says:

```
foo.py:9: note: Revealed type is 'builtins.tuple[Any]'
foo.py:10: error: Argument 1 to "baz" has incompatible type "Tuple[Any, ...]"; expected "Tuple[int, str]"
```

Pyre says:

```
foo.py:9:0 Revealed type [-1]: Revealed type for `z` is `typing.Tuple[typing.Any, ...]`.
foo.py:10:4 Incompatible parameter type [6]: Expected `Tuple[int, str]` for 1st positional only parameter to call `baz` but got `typing.Tuple[typing.Any, ...]`.
```

Pyright says:

```
9:13 - info: Type of "z" is "Tuple[Unknown, ...]"
10:5 - error: Argument of type "Tuple[Unknown, ...]" cannot be assigned to parameter "z" of type "Tuple[int, str]" in function "baz"
  Tuple size mismatch; expected 2 but received indeterminate number (reportGeneralTypeIssues)
```
(Pytype is fine with it, but not sure whether that's intentional.)
That seemed pretty conclusive, so based on Pyright's very helpful error message, I tried writing this up in the PEP, and my understanding of the rule is:
* If an *arbitrary* number of type parameters are expected, then it *is* valid to pass a *specific* number of type parameters.
* If a *specific* number of type parameters are expected, then it is *not* valid to pass an *arbitrary* number of type parameters.
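The two bullets can be restated as a toy predicate. This is only an illustration of the rule as written above, not how any checker implements it; the function name `tuple_arity_assignable` and the use of `None` for "arbitrary length" are assumptions made for the sketch, and element types are ignored.

```python
from typing import Optional


def tuple_arity_assignable(source_len: Optional[int], target_len: Optional[int]) -> bool:
    """Toy model of the asymmetric rule; `None` stands for an arbitrary length."""
    if target_len is None:
        # An arbitrary number is expected: any specific number is accepted.
        return True
    if source_len is None:
        # A specific number is expected, but an arbitrary number is passed.
        return False
    return source_len == target_len


print(tuple_arity_assignable(2, None))  # Tuple[int, str] passed where Tuple is expected: True
print(tuple_arity_assignable(None, 2))  # Tuple passed where Tuple[int, str] is expected: False
```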
But that seems like kind of an ad-hoc rule to me, and left me feeling kind of dissatisfied. As far as I can tell, they're both unsound, so why should it work one way, but not the other? To see if there might be any wiggle room, I wondered what would happen in this analogous situation:
```
from typing import Tuple

def get_tuple() -> Tuple:
    return ('dummy_value',)

def baz(arg1, arg2): ...

z = get_tuple()
reveal_type(z)
baz(*z)
```
Mypy, Pyre and Pyright are all happy with this example! Hah!
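For reference, here is the same example with `reveal_type` removed so that it can actually be run. It shows what the checkers let through: the call fails at runtime because the tuple has one element and `baz` needs two.

```python
from typing import Tuple


def get_tuple() -> Tuple:
    return ('dummy_value',)


def baz(arg1, arg2): ...


z = get_tuple()
try:
    baz(*z)  # accepted by the checkers, but z has one element and baz needs two
except TypeError as exc:
    print(f"TypeError: {exc}")
```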
So basically, I want to argue that, given the inconsistency between this example and the type parameter example, we should fix the inconsistency and make both cases behave in the same way, and I think it makes more sense to standardise on it working 'both ways' (`Tuple` is valid for `Tuple[int, str]`, and `Tuple[int, str]` is valid for `Tuple`).
Pradeep, Eric, what do you think?
On Sun, 7 Feb 2021 at 12:11, Matthew Rahtz <mrahtz@google.com> wrote:
I just realized there's a situation where bare generic types are important: when evolving a codebase, making certain types that weren't generic before generic.
Yeah, this is exactly the situation that I imagine we'll be in with array types: *hopefully* (at least in my personal ideal world) we'd be able to persuade library authors to make the existing types like `tf.Tensor`, `np.ndarray` etc. generic in shape.
So I'm not in favor of saying that `Tensor` is the only legitimate way to specify "a `Tensor` whose type arguments can be anything". If we think that concept is needed, then we should support `Tensor[Any, ...]`.
The idea I had in mind worked the opposite way around: it's less that `Tensor` should be the only legitimate way of specifying arbitrary parameters, and more that arbitrary type parameters should be what `Tensor` means, in order to support gradual typing if `Tensor` becomes generic. For cases where the user wants to deliberately specify "a Tensor of any shape", I'm imagining they'd do something like:

```
Shape = TypeVarTuple('Shape')

def pointwise_multiply(x: Tensor[*Shape], y: Tensor[*Shape]) -> Tensor[*Shape]: ...
```
This would specify that `x` and `y` can be an arbitrary shape, but they should be the same shape, and that shape would also be the shape of the returned `Tensor`.
(If the user instead wanted to say that `x` and `y` could be arbitrary, different shapes, they would use different `TypeVarTuple` instances. This would, of course, potentially rely on the type-checker being ok with only a single usage of a `TypeVarTuple` in a function signature. From a quick skim of the analogous discussion about `TypeVar` in typing-sig a few months ago, I get the impression the jury is still out on whether that should be an error. But luckily, I think this use case is likely to be rare enough that we can ignore that issue for now. I can't think of any specific functions off the top of my head which would take arbitrary arrays of different shapes.)
I consider it a bug in mypy that it accepts `Tensor[()]`.
Interesting - do you say this mainly on the basis that since it causes a runtime error, it should also be a type error? To me, it feels more intuitive that this shouldn't be an error, based on the argument that `Tensor` behaves like `Tensor[Any]`, and (without thinking about it too hard - i.e. not thinking about how the variance should work) `Tensor[Any]` seems like it should be compatible with `Tensor[()]`.
Interestingly, mypy does emit an error if you try to pass other illegal values as a type argument (like `Tensor[0]`).
I was surprised by this, so I did a bit more experimenting. Mypy does indeed error for me too on `Tensor[0]`, but that was only because of the lack of `Literal`. The following two examples *do* type-check fine for me with Mypy:
```
x: Tensor[Literal[0]] = Tensor()
foo(x)

x: Tensor[int] = Tensor()
foo(x)
```
Having said all that -
I don't think we should promote the use of "bare" generic types — those with no type arguments. That's typically indicative of an error on the programmer's part. Pyright accepts them, but it flags it as an error when "strict" mode is enabled.
This makes a lot of sense to me and I definitely think that such a flag should continue to exist and be prominently advertised. With the case of `Tensor`, too, it would be super helpful to have the type-checker error with "Hey, you haven't specified what shape this `Tensor` should be!"
On Thu, 4 Feb 2021 at 18:15, Guido van Rossum <guido@python.org> wrote:
On Wed, Feb 3, 2021 at 12:41 PM Guido van Rossum <guido@python.org> wrote:
On Wed, Feb 3, 2021 at 11:15 AM Eric Traut <eric@traut.com> wrote:
> I don't think we should promote the use of "bare" generic types — > those with no type arguments. That's typically indicative of an error on > the programmer's part. Pyright accepts them, but it flags it as an error > when "strict" mode is enabled. This check has been really useful in helping > developers to fix problems in their code, and I don't want to water it > down. So I'm not in favor of saying that `Tensor` is the only legitimate > way to specify "a `Tensor` whose type arguments can be anything". If we > think that concept is needed, then we should support `Tensor[Any, ...]`. As > I said, adding support for open-ended tuples will add yet more complexity > to the specification and implementation of this PEP, but maybe it's > required to meet all of the intended use cases. >
Agreed. I think they are mostly a legacy feature.
I just realized there's a situation where bare generic types are important: when evolving a codebase, making certain types that weren't generic before generic. A real-world example is the stdlib Queue type -- this started its life as a non-generic class (in typeshed) and at some point it was made generic. During the time it was non-generic, user code was annotated with things like `(q: Queue)`, and it would be a problem to fault all that code immediately as being non-compliant. So the interpretation of this as `Queue[Any]` makes sense. (Of course a flag exists to make this an error, for users who want to update all their code to declare the item type of their queues.)
So it's not that bare generics themselves are a legacy feature -- but the feature is primarily important for situations where legacy code is being checked.
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
-- --Guido (mobile)
-- S Pradeep Kumar
On Sun, Feb 14, 2021 at 9:36 AM Matthew Rahtz <mrahtz@google.com> wrote:
Hah, it just so happens I was skimming PEP 8 today and noticed the section on hobgoblins :)
Whether the same reasoning applies to tensor shapes I think depends on which parts of code we expect to get annotated with shapes first:
* Case 1: If we expect library code to get annotated with specific shapes first, it'll break any old user code that tries to pass unparameterised tensors.
* Case 2: On the other hand, if we expect user code to get annotated first, it'll be fine, because parameterised tensors can be passed to unparameterised library functions no problem.
Admittedly, I do think case 2 is much more likely, but would be keen to hear from others if I'm missing something here.
I'm not sure I agree with that. Library authors want to get fine details right (the efforts of the numpy folks show this desire) while users, once they have written code, would prefer it to never break, even if they upgrade to a newer version of the library. And they're willing to tweak the meaning of "break" to reach this nirvana. :-)
Also, though, if you're saying the rules for tuple were made that way deliberately, that updates me significantly towards thinking we should just respect those rules, advocating the use of a cast when otherwise necessary.
Those rules were made deliberately for tuples. I think a case can be made that the shape of a tensor is a different kind of animal. For one thing, a tuple of indeterminate length doesn't always mean something's incompletely typed. It could very well be one of the many use cases (some forced by the language or the stdlib) where tuples are used as immutable sequences. But I suspect that a tensor of unknown type always just means that the code was written before shapes were supported. And for those cases where a library function really does take a tensor of (almost) arbitrary shape, e.g. transpose(), maybe we should strive for a notation that is explicit about this, e.g. `Tensor[..., X, Y] -> Tensor[..., Y, X]`.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
I don't think it's feasible to change rules about tuple type compatibility at this point. Existing type stubs (including typeshed stubs) assume the current behavior. It's especially important for function overload matching. If we were to change the rules and make open-ended tuples assignable to fixed-sized tuples, the selected overload would change in some cases.

I also agree with Guido's justification for the current behavior (in the case of tuples). So even if it were feasible to change the behavior at this point, I don't think it would make sense to do so.

I'm not opposed to adopting different rules for (non-tuple) variadic generic classes if we think that the target use cases justify it. Of course, all things being equal, I'd favor consistency. Hopefully that's not considered a "foolish consistency". :)

-- Eric Traut Contributor to Pyright & Pylance Microsoft Corp.
Ok, if we're saying that it wouldn't be too abhorrent to treat Tuple differently, I do think supporting both cases is important for Tensor:

```
def foo(x: Tensor): ...

x: Tensor[Height, Width]
foo(x)

def bar(y: Tensor[Height, Width]): ...

y: Tensor
bar(y)
```

So I'll go ahead and state in the PEP that it should work both ways (and add some notes on our discussion about this to the Rationale section).
I just realised that a related question is how unparameterized *aliases* should behave.

The easiest thing would be to have a general rule: "If you omit the type parameter list, you replace `*Ts` with an arbitrary number of `Any`". Then we could do:

```
DType = TypeVar('DType')
Shape = TypeVarTuple('Shape')

class Array(Generic[DType, *Shape]): ...

Float32Array = Array[np.float32, *Shape]

def takes_float_array_of_any_shape(x: Float32Array): ...

x: Array[np.float32, Height]
takes_float_array_of_any_shape(x)  # Valid

y: Array[np.float32, Height, Width]
takes_float_array_of_any_shape(y)  # Also valid
```

One complication, though, is that it implies:

```
IntTuple = Tuple[int, *Ts]

IntTuple  # Behaves like Tuple[int, Any, ...]!?
```

I'm guessing there was a good reason that heterogeneous tuples of arbitrary length like this weren't supported by PEP 484?

Other options I can think of are:

- Disallow unparameterized variadic generic aliases in general
  - This doesn't seem great because it would decrease backwards compatibility. Ideally, if a function in new library code takes a `Float32Array` (`== Array[Any, Any, ...]`), we want to let legacy code pass in a plain `Array` (`== Array[Any, ...]`).
  - Also, it would be inconsistent with non-variadic generic aliases, where I think the type variables are just replaced as `Any` when unparameterized?
- Pull the same trick we did with unparameterized classes: not making `Tuple[int, Any, ...]` valid, but just saying that an unparameterized alias *behaves* as if it had been written like that.
I'll have to defer responding to the general idea of your latest post until later, but here's one quick quip:

On Sat, Feb 20, 2021 at 2:14 PM Matthew Rahtz via Typing-sig <typing-sig@python.org> wrote:

```
IntTuple = Tuple[int, *Ts]

IntTuple  # Behaves like Tuple[int, Any, ...]!?
```

I'm guessing there was a good reason that heterogeneous tuples of arbitrary length like this weren't supported by PEP 484?

I think we considered it as a future extension -- IIRC it's been proposed and we just didn't think there would be enough applications to warrant the extra complexity in mypy. I don't think there are particularly good theoretical reasons to reject it.

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
The latest draft is looking really good.

I ran all of the samples through pyright and uncovered a few small errors. I pushed a PR that fixes these, and Guido already approved and merged the change.

I noticed that the updated PEP includes a proposal for unparameterized generic type aliases. It currently indicates that a missing type argument for a variadic type var should be interpreted as a zero-length tuple. This interpretation would be inconsistent with every other case where type arguments are omitted, so I don't think that's the right answer.

In my opinion, omitting a type argument for a variadic type parameter should imply `Tuple[Any, ...]`. That's consistent with how `Tuple` works.

If the type alias is concatenating other types with the variadic type var into a tuple, then the entire tuple should become a `Tuple[Any, ...]`. (Alternatively, we could extend PEP 484 to support open-ended tuples that specify one or more required entry types as Guido mentions above, but I think we should do that as a separate PEP if there is desire to do so.)

Here's how my proposal would work:

```python
SimpleTuple = Tuple[*Ts]
IntTuple = Tuple[int, *Ts]

# Missing type argument implies *Tuple[Any, ...]
x1: SimpleTuple
reveal_type(x1)  # Tuple[Any, ...]

# Missing type argument implies *Tuple[Any, ...]
x2: IntTuple
reveal_type(x2)  # Tuple[Any, ...]

# Explicit type argument with zero-length tuple
x3: IntTuple[()]
reveal_type(x3)  # Tuple[int]
```

This is consistent with how pyright handles unpacked open-ended tuples in other situations today:

```python
def func(*args):
    x1 = (*args, )
    reveal_type(x1)  # Tuple[Any, ...]

    x2 = (3, *args)
    reveal_type(x2)  # Tuple[Any, ...]

    x3 = (3, *())
    reveal_type(x3)  # Tuple[Literal[3]]
```

(Mypy reveals the types of `x1` and `x2` as `builtins.tuple[Any]`. That looks like a bug to me. It should be `builtins.tuple[Any, ...]`.)

There's one other thing that still troubles me with the latest draft. It shows up in this line:

```python
ShapeType = TypeVar('ShapeType', Ndim, Shape)
```

In its strictest mode, pyright complains about both `Ndim` and `Shape` missing type arguments. It's a common mistake for developers to forget type arguments on generic types, so this is an important error. You can fix the problem with `Ndim` by changing it to `Ndim[Any]`, but you can't do the same with `Shape` because `Shape[Any]` would imply a single dimension for the variadic.

The PEP specifies that `Shape` should be interpreted as `Shape[Any, ...]` when the type arguments are omitted, but there's no way to actually write this as a way to indicate "I know what I'm doing, I didn't simply forget to add the type arguments!". The proposal is effectively saying "the type system supports open-ended variadic generics, but the only way to specify them is to use a syntax that we want to discourage and will be considered an error in some type checkers." That strikes me as problematic.

I can see two ways to fix this:

1. We make it illegal to omit type arguments for generic types and type aliases that use variadic type vars.
2. We support the [T, ...] notation for generic types and type aliases that use variadic type vars.

We previously rejected option 1 because it's inconsistent with the precedent set in PEP 484, and we wanted to allow developers to incrementally add types.

We rejected option 2 because (if I remember correctly) Matthew had some ideas for other ways that the ellipsis could be used for variadic generic classes in the future.

Given our choices, I'd like to push for option 2. I think there are other ways to accommodate future extensions without relying on ellipsis syntax.

If we make this change, we could also bring back support for binding TypeVarTuples to open-ended tuples.

-- Eric Traut Contributor to Pyright & Pylance Microsoft Corp.
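As a runtime companion to the `func(*args)` comparison in Eric's message, the sketch below drops the `reveal_type` calls and returns the three tuples instead. It only illustrates why the first two are open-ended (their length depends on the caller) while the third always has exactly one element, which mirrors `IntTuple[()]` being `Tuple[int]`.

```python
def func(*args):
    x1 = (*args,)    # length depends on the caller
    x2 = (3, *args)  # at least one element, otherwise caller-dependent
    x3 = (3, *())    # unpacking an empty tuple adds nothing
    return x1, x2, x3


print(func())          # ((), (3,), (3,))
print(func("a", "b"))  # (('a', 'b'), (3, 'a', 'b'), (3,))
```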
> I just realised that a related question is how unparameterized aliases should behave.
> Float32Array = Array[np.float32, *Shape]

A variadic class without parameters will bind any `Ts` to `Tuple[Any, ...]`. So, in the above example, `x: Float32Array` will resolve to `Array[np.float32, *Tuple[Any, ...]]`. The same goes for `class Tensor(Generic[T, *Ts]):` where the user says `x: Tensor`. We will resolve to `x: Tensor[Any, *Tuple[Any, ...]]`.

> IntTuple = Tuple[int, *Ts]

Like above, this will be `Tuple[int, *Tuple[Any, ...]]`. This cleanly preserves the existing semantics and allows us to support gradual typing, without having to introduce new concepts like `Tuple[int, Any, ...]`. Given that `Tuple[int, *Tuple[Any, ...]]` expresses what we want, I don't think that we need to add syntax for `Tuple[int, Any, ...]`. Note that this is analogous to TypeScript's syntax:

```typescript
declare function accepts_T_followed_by_any<T>(x: [T, ...any[]]): T;

const x10: number = accepts_T_followed_by_any([1, "hello", 3]);
```

> I pushed a PR that fixes these, and Guido already approved and merged the change.

Ah, thanks for noticing those, Eric. We had fixed some of these issues in our running Google Doc (https://docs.google.com/document/d/1oXWyAtnv0-pbyJud8H5wkpIk8aajbkX-leJ8JXsE...) but not yet updated the PR. Will try to keep the PR more up-to-date.

> The PEP specifies that `Shape` should be interpreted as `Shape[Any, ...]` when the type arguments are omitted, but there's no way to actually write this as a way to indicate "I know what I'm doing, I didn't simply forget to add the type arguments!". The proposal is effectively saying "the type system supports open-ended variadic generics, but the only way to specify them is to use a syntax that we want to discourage and will be considered an error in some type checkers." That strikes me as problematic.

> 2. We support the [T, ...] notation for generic types and type aliases that use variadic type vars.

I've been ok with first-class support of unbounded tuples as valid Ts bindings. In any case, we need to implement it internally for supporting gradual typing. The only question had been whether we would surface this syntax to users. The main reason for not allowing users to write `Tensor[int, *Tuple[Any, ...]]` was that it might be confusing. If that is not a big concern, we can support this syntax explicitly. (Note that this is what TypeScript does.)

I favor `Tensor[int, *Tuple[Any, ...]]` over new syntax like `Tensor[int, ...]` because the former allows for more nuanced types like `Tensor[int, str, *Tuple[Any, ...], T]` whereas the latter doesn't. It's also a clear analogy that we are replacing `Ts` with `Tuple[Any, ...]`.

Thoughts?

-- S Pradeep Kumar
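Pradeep's substitution rule (an omitted parameter list binds `Ts` to `Tuple[Any, ...]`, while an explicit list is spliced in as written) can be sketched at the level of strings. Everything here is hypothetical and for illustration only: `specialize`, `render` and the `"*Ts"` placeholder are not part of `typing` or of any checker.

```python
from typing import List, Optional, Sequence

VARIADIC = "*Ts"                    # placeholder for the variadic type variable
UNBOUNDED_ANY = "*Tuple[Any, ...]"  # what an omitted argument list stands for


def specialize(params: Sequence[str], args: Optional[Sequence[str]] = None) -> List[str]:
    """Fill the `*Ts` slot with `args`; `None` means the argument list was omitted."""
    result: List[str] = []
    for param in params:
        if param != VARIADIC:
            result.append(param)
        elif args is None:
            result.append(UNBOUNDED_ANY)
        else:
            result.extend(args)
    return result


def render(name: str, params: Sequence[str]) -> str:
    return f"{name}[{', '.join(params)}]"


int_tuple = ["int", VARIADIC]  # models IntTuple = Tuple[int, *Ts]
print(render("Tuple", specialize(int_tuple)))      # Tuple[int, *Tuple[Any, ...]]
print(render("Tuple", specialize(int_tuple, [])))  # Tuple[int]
print(render("Array", specialize(["np.float32", VARIADIC], ["Height", "Width"])))
```

The second call corresponds to `IntTuple[()]`: an explicit empty list is different from an omitted one, which is the distinction the thread is arguing about.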
On Sat, Feb 20, 2021 at 5:28 PM S Pradeep Kumar <gohanpra@gmail.com> wrote:
I favor `Tensor[int, *Tuple[Any, ...]]` over new syntax like `Tensor[int, ...]` because the former allows for more nuanced types like `Tensor[int, str, *Tuple[Any, ...], T]` whereas the latter doesn't. It's also a clear analogy that we are replacing `Ts` with `Tuple[Any, ...]`.
+1 -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Hmm, I'm not a fan of `Tensor[*Tuple[Any, ...]]`. That's really verbose and inconsistent with existing standards. And it would require all type checkers to add support for open-ended non-homogeneous types, which would be a heavy lift. Let's save that for another future PEP.

I was proposing that we simply extend the PEP 484 tuple convention to all variadic generic types. So this would be `Tensor[Any, ...]`.

-- Eric Traut Contributor to Pyright & Pylance Microsoft Corp.
I'm not a fan of `Tensor[T, ...]` because it has confusing semantics. Tensors will usually be typed as: ```python class Tensor(Generic[DType, *Shape]): ... ``` where `Tensor[np.float64, Literal[10], Literal[20]]` represents a 10x20 Tensor where each element has datatype `np.float64`. Let's say we want to represent a Tensor that has arbitrary shape. + `Tensor[np.float64, ...]` looks like it represents a Tensor of datatype np.float64 and arbitrary shape. But by analogy with the similar `Tuple[float, ...]`, it would have to represent any Tensor where all the *dimensions* are `np.float64`: `Tensor[np.float64, np.float64, np.float64]`, etc. That's not what we would expect for a Tensor. + To counter that, we might change the semantics of `Tensor[T, ...]`, to mean datatype T with arbitrary shape (i.e., `Tensor[T, Any, Any, Any]`, etc.), but that will be ambiguous for a class that has multiple unary generics like: ``` class Transformer(Generic[T_in, T_out, *Shape]): ... # Is `T_out` int or Any? x: Transformer[int, ...] ``` New syntax like Transformer[int, str, ...] seems even more confusing. + The only remaining use afaik would be `Tensor[Any, ...]` to represent a Tensor where all parameters are Any. Given that `x: Tensor` already represents this, it doesn't feel worth adding fresh syntax to express the same. In contrast, `Tensor[np.float64, *Tuple[Any, ...]]` expresses the fact that the datatype is `np.float64` but the shape can be anything. `Transformer[A, B, *Tuple[Any, ...]]` works for the other case. We can also state that the dimension types have to be int: `Tensor[np.float64, *Tuple[int, ...]]` to represent `Tensor[np.float64, int, int, int]`, etc. I completely agree that the syntax is verbose and frankly not very pretty :) (TypeScript has a nicer `Tensor[float, ...any[]]` syntax.) That seems fine given that we don't really want to encourage this. 
Overall, I see two options:
+ No explicit syntax for unbounded Tensors: We just have `x: Tensor` to support gradual typing as of PEP 646 and we can add unbounded Tensors in follow-up PEPs. Typecheckers would be free to treat `Float32Array = Array[np.float32, *Shape]; x: Float32Array` as `x: Array`, thereby making all parameters `Any`. This seems like the simplest option right now.
+ Explicit syntax where we substitute `Tuple[Any, ...]` for `Ts`: This would complicate an already heavy PEP, but would allow us to precisely express the fact that the shape can be anything.
(Replying to Eric Traut's message of Sun, Feb 21, 2021 at 8:28 AM.)
-- S Pradeep Kumar
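To make the ambiguity around `Tensor[np.float64, ...]` concrete, here is a small sketch in plain Python. It uses no real typing machinery: `expand_like_tuple` and `expand_dtype_then_any` are hypothetical helpers that only spell out, for a fixed number of type parameters, what the annotation would stand for under each of the two readings discussed above (with `float` standing in for `np.float64`).

```python
from typing import Any, Tuple


def expand_like_tuple(elem: Any, n_params: int) -> Tuple[Any, ...]:
    """Reading 1: by analogy with Tuple[T, ...], every parameter is T."""
    return (elem,) * n_params


def expand_dtype_then_any(dtype: Any, n_params: int) -> Tuple[Any, ...]:
    """Reading 2: the first parameter is the datatype, the rest are Any."""
    if n_params < 1:
        raise ValueError("need at least the datatype parameter")
    return (dtype,) + (Any,) * (n_params - 1)


# Tensor[float, ...] with three type parameters in total:
print(expand_like_tuple(float, 3))       # all three parameters are float
print(expand_dtype_then_any(float, 3))   # float followed by two Any
```

The two functions disagree on everything after the first parameter, which is the core of the objection: the same spelling would need to pick one of these meanings.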
The first option (no explicit syntax) doesn't address my primary concern. It means that there's no way for a type checker to distinguish between "I forgot to add (or haven't gotten around to adding) type arguments" and "I explicitly and intentionally mean `Tensor[Any, ...]`".
The second option requires us to expand the syntax and increase the complexity of this PEP. I don't think we should do that either.
I understand your point that `[T, ...]` may not make sense for some variadic generic types (except the case where T is `Any`). I guess that doesn't concern me as much as it concerns you. It simply won't be used in ways that don't make sense. I also don't worry so much about users getting confused because they're already familiar with the semantics of "..." when used with tuples. But I understand your points here.
As a compromise, how about if we allow the syntax `Tensor[T, ...]` only if T is `Any`? In other words, `Tensor[np.float64, ...]` would be flagged as an error but `Tensor[Any, ...]` would be accepted.
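The compromise rule can be sketched with a toy class. This is only an illustration of the proposed restriction, not how any type checker or `typing.py` implements it: `Tensor` here is a hypothetical stand-in with a hand-written `__class_getitem__`, `accepts` is a hypothetical helper, and `float` stands in for `np.float64`.

```python
from typing import Any


class Tensor:
    """Toy stand-in for a variadic generic; not the real typing machinery."""

    def __class_getitem__(cls, params):
        if not isinstance(params, tuple):
            params = (params,)
        # Hypothetical rule: "..." is accepted only in the exact form [Any, ...].
        if any(p is Ellipsis for p in params) and params != (Any, Ellipsis):
            raise TypeError("only Tensor[Any, ...] may use '...'")
        return cls, params


def accepts(*params) -> bool:
    """Return True if Tensor[params] passes the toy rule."""
    try:
        Tensor[params]
    except TypeError:
        return False
    return True


print(accepts(Any, Ellipsis))    # Tensor[Any, ...]
print(accepts(float, Ellipsis))  # Tensor[float, ...]
print(accepts(float, int, int))  # Tensor[float, int, int]
```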
Eric, responding to your original email:
I noticed that the updated PEP includes a proposal for unparameterized generic type aliases. It currently indicates that a missing type argument for a variadic type var should be interpreted as a zero-length tuple. This interpretation would be inconsistent with every other case where type arguments are omitted, so I don't think that's the right answer.
PEP:

    IntTuple = Tuple[int, *Ts]

As this example shows, all type parameters passed to the alias are bound to the type variable tuple. If no type parameters are given, or if an explicitly empty list of type parameters are given, type variable tuple in the alias is simply ignored:

    # Both equivalent to Tuple[int]
    IntTuple
    IntTuple[()]

Sorry, I had missed this example in the latest PEP update. I agree that x: IntTuple should not be treated as IntTuple[()]. To be consistent with unparameterized Tensor we should treat `x: IntTuple` as an arbitrary-shaped Tuple, not as `IntTuple[()]`. A generic alias with a free `*Ts` should be treated just as a `class IntTupleClass(BaseClass[int, *Ts], Generic[*Ts])`. In the latter case, x: IntTupleClass would be treated as an arbitrary-shaped variadic. The unparameterized alias should behave the same way.
In my opinion, omitting a type argument for a variadic type parameter should imply `Tuple[Any, ...]`. That's consistent with how `Tuple` works. If the type alias is concatenating other types with the variadic type var into a tuple, then the entire tuple should become a `Tuple[Any, ...]`.
I think we agree about this. If we had the *Tuple[Any, ...] syntax we would replace Ts to get Tuple[int, *Tuple[Any, ...]], but since we're not adding this syntax in this PEP, we can keep it simple and just fall back to Tuple[Any, ...] as a whole.
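The fallback we are agreeing on can be written down as a small substitution function. This is a sketch under stated assumptions: `TS` is a hypothetical marker object standing in for the free `*Ts`, `substitute` is a hypothetical helper, and `args=None` models the bare alias with no subscript at all.

```python
from typing import Any, Optional, Tuple

TS = object()  # hypothetical marker standing in for the free *Ts in the alias


def substitute(
    alias_params: Tuple[Any, ...], args: Optional[Tuple[Any, ...]]
) -> Tuple[Any, ...]:
    """Return the parameters of the resulting Tuple type.

    args=None models the bare alias (no subscript at all).
    """
    if args is None:
        # Fallback discussed above: the whole tuple becomes Tuple[Any, ...].
        return (Any, Ellipsis)
    result = []
    for p in alias_params:
        if p is TS:
            result.extend(args)  # splice the given types in place of *Ts
        else:
            result.append(p)
    return tuple(result)


INT_TUPLE = (int, TS)  # models IntTuple = Tuple[int, *Ts]

print(substitute(INT_TUPLE, None))          # bare IntTuple
print(substitute(INT_TUPLE, ()))            # IntTuple[()]
print(substitute(INT_TUPLE, (str, float)))  # IntTuple[str, float]
```

Note that the bare alias and `IntTuple[()]` give different results here, which is the behaviour being argued for in this thread rather than what the quoted PEP text says.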
The proposal is effectively saying "the type system supports open-ended variadic generics, but the only way to specify them is to use a syntax that we want to discourage".
That's partially true. The type system supports arbitrary-shaped variadics to the very limited extent that it won't complain about passing Tensor[int, str] to Tensor or vice versa. Even this is only in the PEP to support pre-existing code. A typechecker could comply with the PEP by special-casing unparameterized variadic classes and aliases, without implementing open-ended variadic generics.
The first option (no explicit syntax) doesn't address my primary concern. It means that there's no way for a type checker to distinguish between "I forgot to add (or haven't gotten around to adding) type arguments" and "I explicitly and intentionally mean `Tensor[Any, ...]`".
In its strictest mode, pyright complains about both `Ndim` and `Shape` missing type arguments. It's a common mistake for developers to forget type arguments on generic types, so this is an important error. You can fix the problem with `Ndim` by changing it to `Ndim[Any]`, but you can't do the same with `Shape` because `Shape[Any]` would imply a single dimension for the variadic.
Ah, I see. In your strictest mode, you want to (a) warn users when they inadvertently use an unparameterized Tensor but (b) not warn them when they explicitly denote something as having arbitrary shape.
I can't think of a good alternative solution. The problem is that `Tensor[Any, ...]` doesn't seem to be on the road to precise future syntax like `Tensor[np.float64, *Tuple[Any, ...]]`. And we'd have to forbid `Tensor[int, ...]`, `Tensor[T, ...]`, etc. because of the problems mentioned earlier.
How about making `Tensor[Any, ...]` a separate PEP? We could bikeshed other possibilities like `Tensor[...]` that disallow using `int`. We could also judge whether the new syntax is worth the niche but important use case. (And we wouldn't have to add typing.py support for it as part of PEP 646.)
(Replying to Eric Traut's message of Mon, Feb 22, 2021 at 12:46 PM.)
-- S Pradeep Kumar
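For readers unfamiliar with the strict-mode check being discussed, here is a rough runtime analogue. It is not how pyright works internally; `Ndim` is a hypothetical one-parameter generic and `is_missing_type_args` is a hypothetical helper that flags a bare generic class used as an annotation.

```python
from typing import Any, Generic, TypeVar

T = TypeVar("T")


class Ndim(Generic[T]):
    """Hypothetical generic with one ordinary type parameter."""


def is_missing_type_args(annotation: Any) -> bool:
    """True for a bare generic class such as `Ndim` (no subscript)."""
    return isinstance(annotation, type) and bool(
        getattr(annotation, "__parameters__", ())
    )


print(is_missing_type_args(Ndim))       # bare generic: would be flagged
print(is_missing_type_args(Ndim[Any]))  # explicit Any: not flagged
print(is_missing_type_args(int))        # not generic: not flagged
```

For an ordinary generic, writing `Ndim[Any]` silences the check. The difficulty raised above is that a variadic class has no equivalent explicit spelling, since a single `Any` would be read as exactly one dimension.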
I think I'm basically with Pradeep here - the question of how to represent arrays with fully or partially arbitrary shapes seems like an issue with a lot of subtleties that we should leave for a future PEP. I feel very strongly about not committing to anything here just yet; for shape typing to be useful in some important examples I have in mind at DeepMind, we're going to need to get this just right.
It means that there's no way for a type checker to distinguish between "I forgot to add (or haven't gotten around to adding) type arguments" and "I explicitly and intentionally mean `Tensor[Any, ...]`".
Unfortunately, I reckon this has to be the case as of this PEP. Rectifying this will definitely be a priority for our future PEPs.
As a compromise, how about if we allow the syntax Tensor[T, ...] only if T is `Any`?
Hmm. This is interesting, but I still hesitate - if we *do* decide we want `...` to mean 'This part is arbitrary', then this would prevent us from being able to specify 'An array with an arbitrary data type *and* an arbitrary shape' as `Tensor[Any, ...]`. (To emphasise, I'm not saying that we necessarily *should* use '...' to mean 'This part is arbitrary', but only that it seems like a plausible enough option, and a) we shouldn't complicate this PEP by talking about that, and b) we should make this PEP flexible enough to accommodate that potential option.)
My opinions about `*Tuple[Any, ...]` have shifted in the past few days, though. It *is* very verbose. *Too* verbose for my liking. My 'emulating a newcomer' narrative is still like "Why do we put the tuple with known contents there and then unpack it? Why not just put the contents there directly? Rrrrrr"
So the solution I'm currently leaning towards is the "No explicit syntax" option. We would state that an unparameterized alias `Float32Array` should be considered compatible in both directions with an arbitrary number of type parameters, without stating explicitly how an unparameterized `Float32Array` should be represented, giving us some wiggle room for later.
(Replying to S Pradeep Kumar's message of Tue, 23 Feb 2021 at 02:18.)
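The "compatible in both directions" rule for the unparameterized form can be sketched as a toy check. The assumptions are loud ones: `is_compatible` is a hypothetical helper, `None` models the unparameterized form, and parameterized forms are compared by simple equality (with `Any` matching anything) rather than by a type checker's real subtyping rules.

```python
from typing import Any, Optional, Tuple

Params = Optional[Tuple[Any, ...]]  # None models the unparameterized form


def is_compatible(left: Params, right: Params) -> bool:
    """Toy assignability check between two uses of the same variadic generic."""
    if left is None or right is None:
        # Unparameterized: compatible in both directions with any arity.
        return True
    if len(left) != len(right):
        return False
    return all(a is Any or b is Any or a == b for a, b in zip(left, right))


print(is_compatible(None, (float, int, int)))          # bare vs. 3 parameters
print(is_compatible((float, int), None))               # and the other direction
print(is_compatible((float, int), (float, int, int)))  # arity mismatch
```

The sketch deliberately says nothing about what the unparameterized form expands to, which mirrors the wiggle room described above.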
Right, with the final couple of updates to the PEP (culminating in https://github.com/python/peps/pull/1859) I think we're done. (Eric, heads up: we made a couple of final semantic changes, detailed in https://github.com/python/peps/pull/1856).
Guido, I think we're ready for PEP 646 to be sent to the steering council! What are the next steps?
(Replying to Matthew Rahtz's message of Tue, 23 Feb 2021 at 20:36.)
I'll get back to you about that. I'd like to re-review the entire PEP once more before I recommend sending it to the SC. I'm pretty busy the rest of the week but I hope I'll get to it before Monday.
(Replying to Matthew Rahtz's message of Wed, Mar 3, 2021 at 12:52 PM.)
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him* *(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
Sure, thanks!
(Replying to Guido van Rossum's message of Wed, 3 Mar 2021 at 22:48.)
I'll get back to you about that. I'd like to re-review the entire PEP once more before I recommend sending it to the SC. I'm pretty busy the rest of the week but I hope I'll get to it before Monday.
On Wed, Mar 3, 2021 at 12:52 PM Matthew Rahtz <mrahtz@google.com> wrote:
Right, with the final couple of updates to the PEP (culminating in https://github.com/python/peps/pull/1859) I think we're done. (Eric, heads up: we made a couple of final semantic changes, detailed in https://github.com/python/peps/pull/1856).
Guido, I think we're ready for PEP 646 to be sent to the steering council! What are the next steps?
On Tue, 23 Feb 2021 at 20:36, Matthew Rahtz <mrahtz@google.com> wrote:
I think I'm basically with Pradeep here - the question of how to represent arrays with fully or partially arbitrary shapes seems like an issue with a lot of subtleties that we should leave for a future PEP. I feel very strongly about not committing to anything here just yet; for shape typing to be useful in some important examples I have in mind at DeepMind, we're going to need to get this just right.
> It means that there's no way for a type checker to distinguish between "I forgot to add (or haven't gotten around to adding) type arguments" and "I explicitly and intentionally mean `Tensor[Any, ...]`".
Unfortunately, I reckon this necessarily has to be the case as of this PEP. Rectifying this will definitely be a priority for our future PEPs.
> As a compromise, how about if we allow the syntax Tensor[T, ...] only if T is `Any`?
Hmm. This is interesting, but I still hesitate - if we *do* decide we want `...` to mean 'This part is arbitrary', then this would prevent us from being able to specify 'An array with an arbitrary data type *and* an arbitrary shape' as `Tensor[Any, ...]`. (To emphasise, I'm not saying that we necessarily *should* use '...' to mean 'This part is arbitrary', but only that it seems like a plausible enough option, and a) we shouldn't complicate this PEP by talking about that, and b) we should make this PEP flexible enough to accommodate that potential option.)
My opinions about `*Tuple[Any, ...]` have shifted in the past few days, though. It *is* very verbose. *Too* verbose for my liking. My 'emulating a newcomer' narrative is still like "Why do we put the tuple with known contents there and then unpack it? Why not just put the contents there directly? Rrrrrr"
So the solution I'm currently leaning towards is the "No explicit syntax" option. We would state that an unparameterized alias `Float32Array` should be considered compatible in both directions with an arbitrary number of type parameters, without stating explicitly how an unparameterized `Float32Array` should be represented, giving us some wiggle room for later.
On Tue, 23 Feb 2021 at 02:18, S Pradeep Kumar <gohanpra@gmail.com> wrote:
Eric, responding to your original email:
> I noticed that the updated PEP includes a proposal for unparameterized generic type aliases. It currently indicates that a missing type argument for a variadic type var should be interpreted as a zero-length tuple. This interpretation would be inconsistent with every other case where type arguments are omitted, so I don't think that's the right answer.
> PEP:
> IntTuple = Tuple[int, *Ts]
> As this example shows, all type parameters passed to the alias are bound to the type variable tuple. If no type parameters are given, or if an explicitly empty list of type parameters are given, type variable tuple in the alias is simply ignored:
>
> # Both equivalent to Tuple[int]
> IntTuple
> IntTuple[()]
Sorry, I had missed this example in the latest PEP update. I agree that x: IntTuple should not be treated as IntTuple[()].
To be consistent with unparameterized Tensor we should treat `x: IntTuple` as an arbitrary-shaped Tuple, not as `IntTuple[()]`. A generic alias with a free `*Ts` should be treated just as a `class IntTupleClass(BaseClass[int, *Ts], Generic[*Ts])`. In the latter case, x: IntTupleClass would be treated as an arbitrary-shaped variadic. The unparameterized alias should behave the same way.
> In my opinion, omitting a type argument for a variadic type parameter should imply `Tuple[Any, ...]`. That's consistent with how `Tuple` works.
> If the type alias is concatenating other types with the variadic type var into a tuple, then the entire tuple should become a `Tuple[Any, ...]`.
I think we agree about this. If we had the *Tuple[Any, ...] syntax we would replace Ts to get Tuple[int, *Tuple[Any, ...]], but since we're not adding this syntax in this PEP, we can keep it simple and just fall back to Tuple[Any, ...] as a whole.
> The proposal is effectively saying "the type system supports open-ended variadic generics, but the only way to specify them is to use a syntax that we want to discourage".
That's partially true. The type system supports arbitrary-shaped variadics to the very limited extent that it won't complain about passing Tensor[int, str] to Tensor or vice versa. Even this is only in the PEP to support pre-existing code. A typechecker could comply with the PEP by special-casing unparameterized variadic classes and aliases, without implementing open-ended variadic generics.
> The first option (no explicit syntax) doesn't address my primary concern. It means that there's no way for a type checker to distinguish between "I forgot to add (or haven't gotten around to adding) type arguments" and "I explicitly and intentionally mean `Tensor[Any, ...]`".
> In its strictest mode, pyright complains about both `Ndim` and `Shape` missing type arguments. It's a common mistake for developers to forget type arguments on generic types, so this is an important error. You can fix the problem with `Ndim` by changing it to `Ndim[Any]`, but you can't do the same with `Shape` because `Shape[Any]` would imply a single dimension for the variadic.
Ah, I see. In your strictest mode, you want to (a) warn users when they inadvertently use an unparameterized Tensor but (b) not warn them when they explicitly denote something as having arbitrary shape.
I can't think of a good alternative solution. The problem is that `Tensor[Any, ...]` doesn't seem to be on the road to precise future syntax like `Tensor[np.float64, *Tuple[Any, ...]]`. And we'd have to forbid `Tensor[int, ...]`, `Tensor[T, ...]`, etc. because of the problems mentioned earlier.
How about making `Tensor[Any, ...]` a separate PEP? We could bikeshed other possibilities like `Tensor[...]` that disallow using `int`. We could also judge whether the new syntax is worth the niche but important use case. (And we wouldn't have to add typing.py support for it as part of PEP 646)
On Mon, Feb 22, 2021 at 12:46 PM Eric Traut <eric@traut.com> wrote:
> The first option (no explicit syntax) doesn't address my primary concern. It means that there's no way for a type checker to distinguish between "I forgot to add (or haven't gotten around to adding) type arguments" and "I explicitly and intentionally mean `Tensor[Any, ...]`".
FWIW, I just re-read the whole PEP (finally!), and I think it's excellent. All the important design points are clearly discussed in the main text. After tomorrow's meeting, unless significant issues are brought up there, we should submit it to the Steering Council's tracker (https://github.com/python/steering-council/issues).
Special kudos to Matthew as the primary author and editor, and to Pradeep and Eric for implementing the design in two different checkers and providing lots of feedback based on their experiences.
See you all tomorrow in the Meetup.
--Guido
On Thu, Mar 4, 2021 at 2:17 AM Matthew Rahtz <mrahtz@google.com> wrote:
> Sure, thanks!
Thanks, Guido - fantastic news! See you later at the meetup.
On Mon, 15 Mar 2021 at 03:36, Guido van Rossum <guido@python.org> wrote:
> FWIW, I just re-read the whole PEP (finally!), and I think it's excellent. All the important design points are clearly discussed in the main text. After tomorrow's meeting, unless significant issues are brought up there, we should submit it to the Steering Council's tracker (https://github.com/python/steering-council/issues).
> Special kudos to Matthew as the primary author and editor, and to Pradeep and Eric for implementing the design in two different checkers and providing lots of feedback based on their experiences.
> See you all tomorrow in the Meetup.
> --Guido
Hi,
Sorry for the noise, but I just wanted to be sure. I have a fairly simple use-case: I want to essentially alias Tuple (as in: SpecialTuple = Tuple) and be able to tell that type apart at runtime.
My users would be able to do:
    def some_function() -> SpecialTuple[int, str]:
        return 0, 'hi!'
    def some_other_function() -> SpecialTuple[int, int, int]:
        return 1, 2, 3
And I would be able to know at runtime, by looking at the type annotations, that the return value is special and I need to do some custom handling. As far as type checkers would be concerned, SpecialTuple is just a Tuple.
As far as I can tell, this should be possible with the current PEP by doing the following:
    _Ts = TypeVarTuple('_Ts')
    SpecialTuple = Tuple[*_Ts]
Correct?
Thank you,
Filipe Laíns
Filipe - yes, that's right, for your use case `SpecialTuple = Tuple[*Ts]` should work - as long as you don't care about doing e.g. `SpecialTuple[int, ...]`, which currently deliberately doesn't work.
On Sun, 21 Feb 2021 at 05:42, Filipe Laíns <lains@riseup.net> wrote:
> As far as I can tell, this should be possible with the current PEP by doing the following:
> _Ts = TypeVarTuple('_Ts')
> SpecialTuple = Tuple[*_Ts]
> Correct?
On Sun, 2021-02-21 at 10:42 +0000, Matthew Rahtz wrote:
Filipe - yes, that's right, for your use case `SpecialTuple = Tuple[*Ts]` should work - as long as you don't care about doing e.g. `SpecialTuple[int, ...]`, which currently deliberately doesn't work.
Yeah, I got that from the PEP. Fortunately, that is not an issue for me as the data I am trying to encode can never be that (my use-case is generating D-Bus signatures by looking at the type annotations btw), though I imagine that can be a limitation for some people. Thanks! Filipe Laíns
I ran all of the samples through pyright and uncovered a few small errors. I pushed a PR that fixes these, and Guido already approved and merged the change.
Thanks, Eric, this is super helpful!
Ah, thanks for noticing those, Eric. We had fixed some of these issues in our running Google Doc but not yet updated the PR. Will try to keep the PR more up-to-date.
+1; my bad.
A variadic class without parameters will bind any `Ts` to `Tuple[Any, ...]`.
So, in the above example, `x: Float32Array` will resolve to `Array[np.float32, *Tuple[Any, ...]]`.
This sounds reasonable to me too.
There's one other thing that still troubles me with the latest draft. ... In its strictest mode, pyright complains about both `Ndim` and `Shape` missing type arguments.
Ah, also a great point. Having thought about this a bit more, I'm actually wondering whether we can just drop that part of the PEP altogether. Dan Moldovan pointed out that maybe the example wouldn't make sense anyway, since an `Array[Ndim[Literal[1]]]` wouldn't be compatible with an `Array[Shape[Any]]`, and that seems silly. So I think we should just get rid of `Ndim`, use a sequence of `Any` instead, and not support crazy things like `ShapeType = TypeVar('ShapeType', Ndim, Shape)` where we somehow have to support forwarding the type parameter to the type parameter lists of the bounds argghhh so complicated. That would leave us with just:
```
DataType = TypeVar('DataType')
Shape = TypeVarTuple('Shape')

class Array(Generic[DataType, *Shape]): ...

Float32Array = Array[np.float32, *Shape]
Array1D = Array[DataType, Any]
Array2D = Array[DataType, Any, Any]
# etc.
```
On Sun, 21 Feb 2021 at 10:42, Matthew Rahtz <mrahtz@google.com> wrote:
> Filipe - yes, that's right, for your use case `SpecialTuple = Tuple[*Ts]` should work - as long as you don't care about doing e.g. `SpecialTuple[int, ...]`, which currently deliberately doesn't work.
Aliasing of `Tuple` in the way you've suggested probably won't be possible any time soon. This would require additional work in type checkers because `Tuple` is handled as a very specialized case in the code today. This would also require us to update the `tuple` class definition in builtins.pyi so it is parameterized by a `TypeVarTuple`. Currently, it is not, and we may need to wait for all Python type checkers to support PEP 646 before we can do this. Given that some of the major type checkers still haven't added support for Python 3.9 PEPs (604, 612, 613, 614) it could be a long time before PEP 646 is universally supported.
-- Eric Traut
Contributor to Pyright & Pylance
Microsoft Corp.
participants (11)
- Alfonso L. Castaño
- David Foster
- Eric Traut
- Filipe Laíns
- grievejia@fb.com
- Guido van Rossum
- Matthew Rahtz
- Naomi Seyfer
- Peilonrayz
- Rebecca Chen
- S Pradeep Kumar