Replying to Matthew's question:
(**Jukka, Ivan, Rebecca**, it would be super useful to hear your thoughts
here too. In case a catch-up would be helpful: the question is, if we wanted to make it so that our new 'variadic type variable' was used as `Ts = TypeVarTuple('Ts'); class Foo(Generic[*Ts]): ...; foo: Foo[int, str] = Foo()`, how hard would that be, considering the new use of the star?)
For pytype in particular, this wouldn't be too bad. We use a Python interpreter in the target Python version to compile source code to bytecode and then analyze the bytecode, so grammar changes aren't any harder to accommodate than other types of changes. (pytype does have a separate stub parser that uses typed_ast at the moment, and we need to instead use the ast module in Python 3.8+ to parse newer syntax, but that's a change we need to make anyway and not specific to this PEP.)
On Mon, Jan 25, 2021 at 10:58 AM S Pradeep Kumar firstname.lastname@example.org wrote:
Can you clarify what "no concatenation of variadics" refers to? Does
this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically about `Tuple[*Ts1, *Ts2]`. (And what about the same constructs inside `Callable[[<here>], R]`?
Guido: I mean concatenation of multiple variadics (`Tuple[*Ts, *Ts2]`, same within Callable).
The first PEP will support concatenation with a non-variadic prefix and suffix (`Tuple[int, *Ts, str]`, same within Callable).
But I've been learning about the corresponding functionality in
I strongly agree. `*Ts` is a clear visual analogy to tuple unpacking.
# Reference implementation and examples
I wonder if any of the PEP authors have worked on an implementation yet
Yes, I have been working on a reference implementation from scratch in Pyre.
I added support over the weekend for concatenation of multiple variadics since that's one of the main points of uncertainty in our discussion. Here are some examples that typecheck on my branch:
(1) Basic usage of non-variadic prefix and suffix
from typing import List, Tuple, TypeVar Ts = TypeVar("Ts", bound=tuple) Ts2 = TypeVar("Ts2", bound=tuple) def strip_both_sides(x: Tuple[int, *Ts, str]) -> Ts: ... def add_int(x: Tuple[*Ts]) -> Tuple[bool, *Ts, int]: ... def foo(xs: Tuple[int, bool, str]) -> None: z = add_bool_int(strip_both_sides(xs)) # => Tuple[bool, bool, int] reveal_type(z)
(2) `partial` (using concatenation of multiple variadics):
from typing import Callable, Tuple, TypeVar Ts = TypeVar("Ts", bound=tuple) Ts2 = TypeVar("Ts2", bound=tuple) def partial(f: Callable[[*Ts, *Ts2], bool], *args: *Ts) -> Callable[[*Ts2], bool]: ... def foo(x: int, y: str, z: bool) -> bool: ... def baz() -> None: expects_bool = partial(foo, 1, "hello") # Valid. expects_bool(True) # Error. expects_bool() # Note that this `partial` doesn't support keyword arguments.
Other notable features:
- Variadic classes like Tensor.
- `*args: *Ts`
- Concatenation within parameters of Tuples, Callables, and variadic
- Allowing Tuples to be arbitrarily unpacked: `*Tuple[int, *Ts]`
Once we settle the debate about TypeVar vs TypeVarTuple and other syntax, I can work on merging this branch into Pyre master and implementing some other details from the PEP (such as `Union[*Ts]`, generic aliases, Map with generic aliases, etc.).
# Key findings
(1) Prefix and suffix work as expected
Handling a suffix of non-variadic parameters `Tuple[*Ts, int]` is essentially the same work as handling a prefix `Tuple[int, *Ts]`.
@Eric Could you share an example of a non-variadic suffix that you felt would be hard to tackle?
I'm basically following the TypeScript algorithm for inferring one variadic tuple type against another ( https://github.com/microsoft/TypeScript/pull/39094).
(2) Concatenation of multiple variadics
Concatenation of multiple variadics works as expected when the length of one is unambiguously known.
However, concatenation of multiple variadics is indeed significantly complex and should definitely *not* be part of the initial variadics PEP.
(3) Unpack[Ts] / *Ts didn't pose a big problem
I preprocessed `*Ts` to be `Unpack[Ts]` so that I could uniformly deal with `Unpack`. Pyre uses its own parser, so that's one thing to note.
Also, the `*` syntax is going to be implemented as part of PEP 637, so I don't think the implementation should be a concern for us. Until it is implemented, we do have `Unpack`.
(4) Surprising behavior: variance of Tuple vs Tensor
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ... tuple_int_str: Tuple[int, str] tuple_int_bool: Tuple[int, bool] foo(tuple_int_str, tuple_int_bool)
We might expect a type error here because `Tuple[int, str] != Tuple[int, bool]`. However, the typechecker actually infers `Ts = Tuple[int, Union[str, bool]]`, which is a perfectly valid solution. This is sound but unintuitive.
This is analogous to what we do for the non-variadic case:
def foo(xs: T, ys: T) -> T: ... some_int: int some_str: str foo(some_int, some_bool)
This is not a type error because `T = Union[int, str]` is a valid solution for T. (Mypy infers `T = object`, but it's the same principle.)
Note, however, that we *do* raise an error with Tensor:
def bar(x: Tensor[*Ts], y: Tensor[*Ts]) -> Tensor[*Ts]: ... xs: Tensor[int, str] ys: Tensor[bool, bool] # Error bar(xs, ys)
This is because Tensor is invariant whereas Tuple is covariant.
(FWIW, TypeScript doesn't do the above for either the variadic or non-variadic case. It raises an error if T is passed different types of arguments.)
The Tuple covariance means we don't error in cases where we would expect to. Perhaps we can define variadic tuples as invariant by default.
(5) What if there is an arity mismatch?
Consider the following case (the same case as Eric pointed out :) ).
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ... tuple_int_str: Tuple[int, str] tuple_bool: Tuple[bool] foo(tuple_int_str, tuple_bool)
We might expect this to error because the argument types have different lengths.
However, Ts = Union[Tuple[int, str], Tuple[bool]] is a valid solution, since `Tuple` is covariant.
`foo` gets treated as:
def foo(xs: Union[Tuple[int, str], Tuple[bool]], ys: Union[Tuple[int, str], Tuple[bool]]) -> Union[Tuple[int, str], Tuple[bool]]: ...
Users might be expecting this to error and might be taken aback, as I was when I tried it out.
I experimented with disallowing a variadic `Ts` from being inferred as having two different lengths and that seemed somewhat more intuitive than the above. Opinions appreciated.
# Open questions
(1) What to do about `*Tuple[Any, ...]`?
During the last tensor meeting, we discussed allowing `Tensor[Any, ...]` (and the equivalent `Tensor`) in order to aid gradual typing.
Existing code annotated as `t: Tensor` would treat `Tensor` without parameters as `Tensor[Any, ...]`. That would be a Tensor with arbitrary rank and `Any` as the dimension type. This way, changing `class Tensor` to be a variadic wouldn't immediately break existing code.
I'm yet to implement this, so I'll look into how this affects type inference.
The same goes for `*Iterable[int]`, if indeed that is feasible.
(2) What to do about `*Union[...]`?
If `Ts` is a type variable bound by `tuple`, then `Ts = Union[Tuple[int, str], Tuple[bool]]` is a valid assignment. We then have to consider what unpacking that means.
TypeScript allows this:
When the type argument for T is a union type, the union is spread over
the tuple type. For example, [A, ...T, B] instantiated with X | Y | Z as the type argument for T yields a union of instantiations of [A, ...T, B] with X, Y and Z as the type argument for T respectively.
I'll think about these questions a bit more over the next couple of weeks and update the PEP. We can discuss these in detail during the next tensor typing meeting.
On Mon, Jan 25, 2021 at 9:44 AM Eric Traut email@example.com wrote:
I agree that "variadic" is a term that casual Python coders may be unfamiliar with, but it's a pretty standard term used in other languages, as opposed to "covariant" and "contravariant", which I had never encountered prior to Python. I also don't think variadic type variables will be used by a typical Python coder. It's a pretty advanced feature. Most Python coders don't use simple (non-variadic) type variables today.
Based on your experiments in Pyright so far, how difficult would
introducing the new grammar be?
Introducing the grammar change to allow the star operator within a subscript is easy, just a few dozen lines of new code. The difficult part is with all the error cases this introduces. The star operator is allowed only in the case of variadic type variables. All other uses of a star operator within a subscript are not allowed. A few of these cases can be detected and reported by the parser (e.g. when used in conjunction with slice expressions), but most require semantic information to detect, so the checks will need to be added in many places within the type checker — and presumably the runtime as well. When a new construct introduces many ways to produce new error conditions, my natural instinct is to look for a way to eliminate the possibility of those errors rather than trying to enumerate and plug each of them individually.
The star operator will also require changes beyond the parser and type checker. It will also require updates to completion suggestion and signature help logic.
This is all doable, but it adds significant work across many code bases and will result in many more bugs as we work out all of the kinks and edge cases. I'm not convinced that the readability benefits justify the added complexity. I think naming conventions could work fine here. After all, we've adopted naming conventions to designate covariant and contravariant type variables, and that seems to work fine.
I'm continuing to work on the implementation in pyright (currently on a private branch). Of course, none of this is set in stone — I'm just trying to inform the discussion. Once I get a critical mass of functionality working, I'll merge the changes and give you a chance to play with them in Pyright. I find that it helps to be able to write real code with real tooling when playing with new language constructs.
Here's what I have implemented so far:
- Support for "variadic=True" in TypeVar constructor.
- Support for a variadic TypeVar used at the end of a generic class
- Support for subscripts within type expressions that contain an
arbitrary number of type arguments and matching of those type arguments to type parameters when the last type parameter is a variadic
- Support for "()" (empty tuple) notation when used with variadic TypeVar
- Support for "*args: Ts" matching
- Support for zero-length matching
What I haven't done yet:
- Reporting error for bound, variance, or constraints used in conjunction
with variadic TypeVar
- Reporting errors for situations where a variadic TypeVar is used in
cases where it shouldn't be
- Reporting errors for situations where a variadic TypeVar is not used in
cases where it is needed
- Detecting and reporting errors for variadic TypeVar when it's not at
the end of a list of TypeVars in a generic class declaration
- Detecting and reporting errors for multiple variadic TypeVars appearing
in a generic class declaration
- Support for Union[Ts]
- Support for Tuple[Ts]
- Support for Concatenate[x, y, Ts]
- Variadics in generic type aliases
- Support for open-ended (arbitrary-length) variadics
- Tests for all of the above
I've run across a few additional questions:
- PEP 484 indicates that if a type argument is omitted from a generic
type, that type argument is assumed to be `Any`. What is the assumption with a variadic TypeVar? Should it default to `()` (empty tuple)? If we support open-ended tuples, then we could also opt for `(Any, ...)`.
- What is the type of `def foo(*args: Ts) -> Union[Ts]` if foo is called
with no arguments? In other words, what is the type of `Union[*()]`? Is it `Any`? Is this considered an error?
- When the constraint solver is solving for a variadic type variable,
does it need to solve for the individual elements of the tuple independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts]) -> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi", 5.6))`? Would this be an error? Or would you expect that the constraint solver produce an answer of `Tuple[int | str, str | float]` (or `Tuple[object, object]`)? It's much easier to implement if we can treat this as an error, but I don't know if that satisfies the use cases you have in mind.
- Along the lines of the previous question, consider the expression
`foo((3, "hi"), ("hi", ))`. In this case, the lengths of the tuples don't match. If we don't support open-ended variadics, this needs to be an error. If we support open-ended variadics, we have the option of solving this as `Tuple[int | str, ...]` (or `Tuple[object, ...]`). Once again, it's easiest if we don't allow this and treat it as an error.
-- Eric Traut Contributor to Pyright and Pylance Microsoft Corp. _______________________________________________ Typing-sig mailing list -- firstname.lastname@example.org To unsubscribe send an email to email@example.com https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: firstname.lastname@example.org
-- S Pradeep Kumar _______________________________________________ Typing-sig mailing list -- email@example.com To unsubscribe send an email to firstname.lastname@example.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: email@example.com