[Typing-sig] Re: Review of PEP 646 (Variadic Generics)

25 Jan 2021


      Replying to Matthew's question:
...
(**Jukka, Ivan, Rebecca**, it would be super useful to hear your thoughts
here too. In case a catch-up would be helpful: the question is, if we
wanted to make it so that our new 'variadic type variable' was used as `Ts
= TypeVarTuple('Ts'); class Foo(Generic[*Ts]): ...; foo: Foo[int, str] =
Foo()`, how hard would that be, considering the new use of the star?)
For pytype in particular, this wouldn't be too bad. We use a Python
interpreter in the target Python version to compile source code to bytecode
and then analyze the bytecode, so grammar changes aren't any harder to
accommodate than other types of changes. (pytype does have a separate stub
parser that uses typed_ast at the moment, and we need to instead use the
ast module in Python 3.8+ to parse newer syntax, but that's a change we
need to make anyway and not specific to this PEP.)

On Mon, Jan 25, 2021 at 10:58 AM S Pradeep Kumar  wrote:
...
...
Can you clarify what "no concatenation of variadics" refers to? Does
this mean we can't (yet) have `Tuple[int, *Ts]`? Or is that specifically
about `Tuple[*Ts1, *Ts2]`. (And what about the same constructs inside
`Callable[[<here>], R]`?
Guido: I mean concatenation of multiple variadics (`Tuple[*Ts, *Ts2]`,
same within Callable).
The first PEP will support concatenation with a non-variadic prefix and
suffix (`Tuple[int, *Ts, str]`, same within Callable).
...
But I've been learning about the corresponding functionality in
TypeScript, and it uses `...X`, here which is JavaScript's and TypeScript's
spelling of `*X` in various places. I also find it the `*` useful hint for
the user: When I see `Tuple[*X]` I immediately know that the length of the
tuple is variadic. If I were to see `Tuple[Ts]` I'd have to remember that
the naming convention tells me that the tuple is variadic. Readability
Counts.
I strongly agree. `*Ts` is a clear visual analogy to tuple unpacking.
# Reference implementation and examples
...
I wonder if any of the PEP authors have worked on an implementation yet
Yes, I have been working on a reference implementation from scratch in
Pyre.
I added support over the weekend for concatenation of multiple variadics
since that's one of the main points of uncertainty in our discussion. Here
are some examples that typecheck on my branch:
(1) Basic usage of non-variadic prefix and suffix
```
from typing import List, Tuple, TypeVar
Ts = TypeVar("Ts", bound=tuple)
Ts2 = TypeVar("Ts2", bound=tuple)
def strip_both_sides(x: Tuple[int, *Ts, str]) -> Ts: ...
def add_int(x: Tuple[*Ts]) -> Tuple[bool, *Ts, int]: ...
def foo(xs: Tuple[int, bool, str]) -> None:
    z = add_bool_int(strip_both_sides(xs))
# => Tuple[bool, bool, int]
    reveal_type(z)
```
(2) `partial` (using concatenation of multiple variadics):
```
from typing import Callable, Tuple, TypeVar
Ts = TypeVar("Ts", bound=tuple)
Ts2 = TypeVar("Ts2", bound=tuple)
def partial(f: Callable[[*Ts, *Ts2], bool], *args: *Ts) ->
Callable[[*Ts2], bool]: ...
def foo(x: int, y: str, z: bool) -> bool: ...
def baz() -> None:
    expects_bool = partial(foo, 1, "hello")
# Valid.
    expects_bool(True)
# Error.
    expects_bool()
# Note that this `partial` doesn't support keyword arguments.
```
Other notable features:
+ Variadic classes like Tensor.
+ `*args: *Ts`
+ Concatenation within parameters of Tuples, Callables, and variadic
classes.
+ Allowing Tuples to be arbitrarily unpacked: `*Tuple[int, *Ts]`
Other test cases:
https://github.com/pradeep90/pyre-check/blob/master/source/analysis/test/int...
Once we settle the debate about TypeVar vs TypeVarTuple and other syntax,
I can work on merging this branch into Pyre master and implementing some
other details from the PEP (such as `Union[*Ts]`, generic aliases, Map with
generic aliases, etc.).
# Key findings
(1) Prefix and suffix work as expected
Handling a suffix of non-variadic parameters `Tuple[*Ts, int]` is
essentially the same work as handling a prefix `Tuple[int, *Ts]`.
@Eric Could you share an example of a non-variadic suffix that you felt
would be hard to tackle?
I'm basically following the TypeScript algorithm for inferring one
variadic tuple type against another (
https://github.com/microsoft/TypeScript/pull/39094).
(2) Concatenation of multiple variadics
Concatenation of multiple variadics works as expected when the length of
one is unambiguously known.
However, concatenation of multiple variadics is indeed significantly
complex and should definitely *not* be part of the initial variadics PEP.
(3) Unpack[Ts] / *Ts didn't pose a big problem
I preprocessed `*Ts` to be `Unpack[Ts]` so that I could uniformly deal
with `Unpack`. Pyre uses its own parser, so that's one thing to note.
Also, the `*` syntax is going to be implemented as part of PEP 637, so I
don't think the implementation should be a concern for us. Until it is
implemented, we do have `Unpack`.
(4) Surprising behavior: variance of Tuple vs Tensor
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...
tuple_int_str: Tuple[int, str]
tuple_int_bool: Tuple[int, bool]
foo(tuple_int_str, tuple_int_bool)
```
We might expect a type error here because `Tuple[int, str] != Tuple[int,
bool]`. However, the typechecker actually infers `Ts = Tuple[int,
Union[str, bool]]`, which is a perfectly valid solution. This is sound but
unintuitive.
This is analogous to what we do for the non-variadic case:
```
def foo(xs: T, ys: T) -> T: ...
some_int: int
some_str: str
foo(some_int, some_bool)
```
This is not a type error because `T = Union[int, str]` is a valid solution
for T. (Mypy infers `T = object`, but it's the same principle.)
Note, however, that we *do* raise an error with Tensor:
```
def bar(x: Tensor[*Ts], y: Tensor[*Ts]) -> Tensor[*Ts]: ...
xs: Tensor[int, str]
ys: Tensor[bool, bool]
# Error
bar(xs, ys)
```
This is because Tensor is invariant whereas Tuple is covariant.
(FWIW, TypeScript doesn't do the above for either the variadic or
non-variadic case. It raises an error if T is passed different types of
arguments.)
The Tuple covariance means we don't error in cases where we would expect
to. Perhaps we can define variadic tuples as invariant by default.
(5) What if there is an arity mismatch?
Consider the following case (the same case as Eric pointed out :) ).
```
def foo(xs: Tuple[*Ts], ys: Tuple[*Ts]) -> Tuple[*Ts]: ...
tuple_int_str: Tuple[int, str]
tuple_bool: Tuple[bool]
foo(tuple_int_str, tuple_bool)
```
We might expect this to error because the argument types have different
lengths.
However, Ts = Union[Tuple[int, str], Tuple[bool]] is a valid solution,
since `Tuple` is covariant.
`foo` gets treated as:
```
def foo(xs: Union[Tuple[int, str], Tuple[bool]], ys: Union[Tuple[int,
str], Tuple[bool]]) -> Union[Tuple[int, str], Tuple[bool]]: ...
```
Users might be expecting this to error and might be taken aback, as I was
when I tried it out.
I experimented with disallowing a variadic `Ts` from being inferred as
having two different lengths and that seemed somewhat more intuitive than
the above. Opinions appreciated.
# Open questions
(1) What to do about `*Tuple[Any, ...]`?
During the last tensor meeting, we discussed allowing `Tensor[Any, ...]`
(and the equivalent `Tensor`) in order to aid gradual typing.
Existing code annotated as `t: Tensor` would treat `Tensor` without
parameters as `Tensor[Any, ...]`. That would be a Tensor with arbitrary
rank and `Any` as the dimension type. This way, changing `class Tensor` to
be a variadic wouldn't immediately break existing code.
I'm yet to implement this, so I'll look into how this affects type
inference.
The same goes for `*Iterable[int]`, if indeed that is feasible.
(2) What to do about `*Union[...]`?
If `Ts` is a type variable bound by `tuple`, then `Ts = Union[Tuple[int,
str], Tuple[bool]]` is a valid assignment. We then have to consider what
unpacking that means.
TypeScript allows this:
...
When the type argument for T is a union type, the union is spread over
the tuple type. For example, [A, ...T, B] instantiated with X | Y | Z as
the type argument for T yields a union of instantiations of [A, ...T, B]
with X, Y and Z as the type argument for T respectively.
******
I'll think about these questions a bit more over the next couple of weeks
and update the PEP. We can discuss these in detail during the next tensor
typing meeting.
Best,
On Mon, Jan 25, 2021 at 9:44 AM Eric Traut  wrote:
...
I agree that "variadic" is a term that casual Python coders may be
unfamiliar with, but it's a pretty standard term used in other languages,
as opposed to "covariant" and "contravariant", which I had never
encountered prior to Python. I also don't think variadic type variables
will be used by a typical Python coder. It's a pretty advanced feature.
Most Python coders don't use simple (non-variadic) type variables today.
...
Based on your experiments in Pyright so far, how difficult would
introducing the new grammar be?
Introducing the grammar change to allow the star operator within a
subscript is easy, just a few dozen lines of new code. The difficult part
is with all the error cases this introduces. The star operator is allowed
only in the case of variadic type variables. All other uses of a star
operator within a subscript are not allowed. A few of these cases can be
detected and reported by the parser (e.g. when used in conjunction with
slice expressions), but most require semantic information to detect, so the
checks will need to be added in many places within the type checker — and
presumably the runtime as well. When a new construct introduces many ways
to produce new error conditions, my natural instinct is to look for a way
to eliminate the possibility of those errors rather than trying to
enumerate and plug each of them individually.
The star operator will also require changes beyond the parser and type
checker. It will also require updates to completion suggestion and
signature help logic.
This is all doable, but it adds significant work across many code bases
and will result in many more bugs as we work out all of the kinks and edge
cases. I'm not convinced that the readability benefits justify the added
complexity. I think naming conventions could work fine here. After all,
we've adopted naming conventions to designate covariant and contravariant
type variables, and that seems to work fine.
I'm continuing to work on the implementation in pyright (currently on a
private branch). Of course, none of this is set in stone — I'm just trying
to inform the discussion. Once I get a critical mass of functionality
working, I'll merge the changes and give you a chance to play with them in
Pyright. I find that it helps to be able to write real code with real
tooling when playing with new language constructs.
Here's what I have implemented so far:
* Support for "variadic=True" in TypeVar constructor.
* Support for a variadic TypeVar used at the end of a generic class
declaration
* Support for subscripts within type expressions that contain an
arbitrary number of type arguments and matching of those type arguments to
type parameters when the last type parameter is a variadic
* Support for "()" (empty tuple) notation when used with variadic TypeVar
* Support for "*args: Ts" matching
* Support for zero-length matching
What I haven't done yet:
* Reporting error for bound, variance, or constraints used in conjunction
with variadic TypeVar
* Reporting errors for situations where a variadic TypeVar is used in
cases where it shouldn't be
* Reporting errors for situations where a variadic TypeVar is not used in
cases where it is needed
* Detecting and reporting errors for variadic TypeVar when it's not at
the end of a list of TypeVars in a generic class declaration
* Detecting and reporting errors for multiple variadic TypeVars appearing
in a generic class declaration
* Support for Union[Ts]
* Support for Tuple[Ts]
* Support for Concatenate[x, y, Ts]
* Variadics in generic type aliases
* Support for open-ended (arbitrary-length) variadics
* Tests for all of the above
I've run across a few additional questions:
1. PEP 484 indicates that if a type argument is omitted from a generic
type, that type argument is assumed to be `Any`. What is the assumption
with a variadic TypeVar? Should it default to `()` (empty tuple)? If we
support open-ended tuples, then we could also opt for `(Any, ...)`.
2. What is the type of `def foo(*args: Ts) -> Union[Ts]` if foo is called
with no arguments? In other words, what is the type of `Union[*()]`? Is it
`Any`? Is this considered an error?
3. When the constraint solver is solving for a variadic type variable,
does it need to solve for the individual elements of the tuple
independently? Consider, for example, `def foo(a: Tuple[Ts], b: Tuple[Ts])
-> Tuple[Ts]`. Now, let's consider the expression `foo((3, "hi"), ("hi",
5.6))`? Would this be an error? Or would you expect that the constraint
solver produce an answer of `Tuple[int | str, str | float]` (or
`Tuple[object, object]`)? It's much easier to implement if we can treat
this as an error, but I don't know if that satisfies the use cases you have
in mind.
4. Along the lines of the previous question, consider the expression
`foo((3, "hi"), ("hi", ))`. In this case, the lengths of the tuples don't
match. If we don't support open-ended variadics, this needs to be an error.
If we support open-ended variadics, we have the option of solving this as
`Tuple[int | str, ...]` (or `Tuple[object, ...]`). Once again, it's easiest
if we don't allow this and treat it as an error.
--
Eric Traut
Contributor to Pyright and Pylance
Microsoft Corp.
_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
Member address: gohanpra@gmail.com
--
S Pradeep Kumar
_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
Member address: rechen@google.com

[Typing-sig] Re: Review of PEP 646 (Variadic Generics)

Rebecca Chen