Should a single TypeVar be allowed in a function signature?

Hi all,

I was hoping to get some opinions on whether something like the following is valid, and if so, how it should be interpreted:

    from typing import MutableSequence, TypeVar

    T = TypeVar('T')

    def f(seq: MutableSequence[T]) -> int:
        return len(seq)

(This is, of course, *not* inside a generic class parameterized on `T`.)

Currently, pytype emits an [invalid-annotation] error for this piece of code, with the explanation that `T` appears only once in `f`. Neither mypy nor pyre emits an error. As far as I know, how a single TypeVar should be treated is not covered in any of the typing PEPs.

pytype's reasons for making this an error are:

* The error messaging helps users avoid mistakes caused by misunderstanding how to use a TypeVar.
* pytype's behavior when encountering a single TypeVar (especially when a bound or constraints come into play) is ill-defined, so we don't want anyone depending on it.

The reasons I've seen for not making it an error are:

* The TypeVar provides a way to annotate `seq` as a sequence of anything, while still allowing usage errors to be caught in the body of `f`. `Any` swallows all errors, and `object` does not work for invariant containers.
* There are potential future uses for a single TypeVar. One that came up in the tensor typing call this morning is using a ListVariadic as a placeholder to represent an unused part of an annotation.
* PEP 484 does not explicitly say that this should be an error, so it makes sense to err on the side of allowing it.
* In spirit, `T` is an unused variable; flagging such things is not the job of a type checker.

Thoughts?

Best,
Rebecca
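To make the three annotation choices in the message above concrete, here is a minimal sketch. The function names `count_any`, `count_object`, and `count_typevar` are hypothetical and do not come from the thread; the comments describe the static behavior as reported in this thread or as implied by the invariance of `MutableSequence`, not output from any particular checker version.

```python
from typing import Any, List, MutableSequence, TypeVar

T = TypeVar("T")


def count_any(seq: MutableSequence[Any]) -> int:
    # Elements are Any, so a checker accepts any operation on them.
    return len(seq)


def count_object(seq: MutableSequence[object]) -> int:
    # MutableSequence is invariant in its element type, so passing a
    # List[int] here is a static type error even though it runs fine.
    return len(seq)


def count_typevar(seq: MutableSequence[T]) -> int:
    # The signature under discussion: T appears only once.
    return len(seq)


ints: List[int] = [1, 2, 3]
print(count_any(ints))
print(count_object(ints))   # runs, but a checker should reject this call
print(count_typevar(ints))  # per the thread: pytype rejects the definition
```

All three calls behave identically at runtime; the disagreement is purely about what a checker should report.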

This is a timely question. I just implemented a check in pyright for this condition because I found many cases where TypeVars were being used incorrectly within a function signature. I cleaned up a bunch of these cases within our code base thanks to this check.

I'm not sure what you mean by "Any swallows all errors" in the context of a `Sequence[Any]`. Presumably, the type checker would still treat that parameter as a Sequence even though it knows nothing about its contents. I don't see how the presence of a TypeVar would change that behavior. Could you elaborate?

If we think there is value in allowing a single instance of a TypeVar in a signature _and_ we can agree on a well-defined behavior for this case, then I think it makes sense to allow it. Otherwise, I think it's reasonable to emit an error to tell the user that they are using a language feature in a manner that is undefined.

Incidentally, when I implemented this check I noticed that it reported many errors within typeshed stdlib stubs. For example, `_SupportsLessThanT` appears only once in numerous signatures within builtins.pyi.

-Eric

--
Eric Traut
Contributor to pyright & pylance
Microsoft Corp.

I think the problem comes up more with invariant types than covariant types (if I remember terms correctly).

If the element type doesn't matter, then Sequence[object] is more type-safe than Sequence[Any] because it'll catch if you accidentally access something on the elements (copy/paste mistake, leftover debug code, etc). Since Sequence (and the other read-only iterables iirc) is covariant, you can pass any Sequence in and there isn't really a problem in practice.

But that trick doesn't really work for MutableSequence because it's invariant. Passing [1, 2, 3] (a MutableSequence[int]) to MutableSequence[object] is a type error. So, you're forced to do MutableSequence[Any], but now you've lost the stricter type checking from above.

The example given on the pytype issue tracker was:

    def swap(items: MutableSequence[???], i: int, j: int):
        items[i], items[j] = items[j], items[i]

Which I suppose would generalize to something like, "You can't write a type-safe function that accepts a mutable generic" (a mutable generic should probably be invariant).

I suppose you could work around this by using T and returning T, but this again forces you into a bad design decision (returning a modified input is, at the least, questionable, since it can lead to confusing behavior/incorrect assumptions). Or I guess having a private method that does T->T, with a public wrapper that just accepts T. Or sprinkle casting around. But these really seem like playing games to accomplish something that should be relatively straightforward.

FWIW:

- I think every time pytype has warned me about an unused T, it was right for some reason, e.g. I copy/pasted something and forgot to fully clean it up/finish it. So I've generally liked the error.
- That said, I can't recall any of the cases where the runtime behavior would have been incorrect. I don't often deal with mutable, generic'd inputs, though. Losing this error wouldn't be a big deal.
- I don't think a lone T in a signature should be treated as Any because it makes code less type-safe; if you're using a type var, you're clearly going out of your way to be *more* type safe.

_______________________________________________
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
https://mail.python.org/mailman3/lists/typing-sig.python.org/
Member address: richardlev@gmail.com

Would the following two functions be treated any differently by a type checker? I'm struggling to come up with any examples of where a type checker would do anything differently, either when analyzing code that calls these functions or when analyzing the code within them.
```
from typing import Any, MutableSequence, TypeVar

def swap(items: MutableSequence[Any], i: int, j: int):
    items[i], items[j] = items[j], items[i]

_T1 = TypeVar("_T1")

def swap(items: MutableSequence[_T1], i: int, j: int):
    items[i], items[j] = items[j], items[i]
```
There's potentially some value in allowing it for a bound TypeVar, but I wouldn't recommend this pattern because it gives a false sense of security that the type checker will detect type problems when analyzing the function. It can't guarantee that the sequence won't be mutated in a way that violates types.
```
_T2 = TypeVar("_T2", bound=object)

def swap(items: MutableSequence[_T2], i: int, j: int):
    items[i], items[j] = items[j], items[i]
    items[0] = "an illegal sequence item"
```

I recall seeing various bug reports for mypy where people misunderstand type variables with an upper bound (or is that lower bound? :-) and use them where they should be using Unions. So I think it's useful to reject these.

OTOH I have at least a little sympathy for stub authors writing a bunch of stub functions, some of which take two arguments of type SomeT and a few of which take only one such -- it's annoying to have to define a union *and* a typevar and to remember which to use when. (Especially since there is also real danger in writing SomeUnion for two arguments.)

--
--Guido van Rossum (python.org/~guido)
Pronouns: he/him (why is my pronoun here?)
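A small sketch of what the "danger in writing SomeUnion for two arguments" could look like, using `AnyStr` as the typevar. The alias `StrOrBytes` and both function names are hypothetical; the comments describe expected static behavior in general terms rather than quoting any checker.

```python
from typing import AnyStr, Union

# Hypothetical alias standing in for "SomeUnion".
StrOrBytes = Union[str, bytes]


def same_kind_union(a: StrOrBytes, b: StrOrBytes) -> bool:
    # Each parameter is checked independently, so a call that mixes
    # str and bytes is accepted statically.
    return isinstance(a, str) == isinstance(b, str)


def same_kind_typevar(a: AnyStr, b: AnyStr) -> bool:
    # One constrained TypeVar used twice: both arguments are meant to
    # be str, or both bytes.
    return isinstance(a, str) == isinstance(b, str)


print(same_kind_union("x", b"y"))   # mixed call, nothing ties a to b
print(same_kind_typevar("x", "y"))  # both str
```

The union version silently admits the mixed call, which is the case a stub author usually wants to rule out when two parameters share a type.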

> I'm not sure what you mean by "Any swallows all errors" in the context of a `Sequence[Any]`.

If you access an element of a container annotated as Sequence[Any], then a type checker will let you do anything you want with that element (access a non-existent attribute, pass it to a method that expects a particular type, etc.). Similarly, the difference between

    def swap(items: MutableSequence[Any], i: int, j: int): ...

and

    _T1 = TypeVar("_T1")
    def swap(items: MutableSequence[_T1], i: int, j: int): ...

is that in the body of `swap`, something like `items[0].nonexistent_attribute` will not be flagged as an error in the first case but will in the second.

So, as I understand it, the argument for using `MutableSequence[_T1]` is that it behaves like a covariant flavor of `MutableSequence[object]`.

Best,
Rebecca
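The difference described above can be sketched with a function whose body makes an assumption about the elements. `shout_first_any` and `swap_typed` are hypothetical names; the comments restate the static behavior described in the message and are not captured checker output.

```python
from typing import Any, List, MutableSequence, TypeVar

_T1 = TypeVar("_T1")


def shout_first_any(items: MutableSequence[Any]) -> None:
    # Elements are Any: the checker lets .upper() through even though
    # nothing says the elements are strings.
    items[0] = items[0].upper()


def swap_typed(items: MutableSequence[_T1], i: int, j: int) -> None:
    # Elements are an unknown type _T1: moving them around is fine, but
    # writing items[0].upper() here would be reported by the checker.
    items[i], items[j] = items[j], items[i]


words: List[str] = ["a", "b"]
shout_first_any(words)
swap_typed(words, 0, 1)
print(words)

numbers: List[Any] = [1, 2]
try:
    shout_first_any(numbers)
except AttributeError as exc:
    # The mistake that Any hid from the checker surfaces only at runtime.
    print("runtime failure:", exc)
```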

Thanks for the explanation. That makes sense.

Based on this discussion, we've reversed our decision, and pyright will not enforce that a TypeVar appears more than once in a function signature.

-Eric

I apologize for reopening this discussion, but I'm considering reversing our earlier decision and adding an error (or at least a warning) in pyright for a TypeVar that appears only once in a function signature.

We continue to receive a stream of bug reports filed against pyright that are actually bugs in the user's code based on misuse (and a misunderstanding) of TypeVars in generic functions. Most of these misunderstandings could be avoided if we introduce an error message in this case.

Rebecca, I'm curious what decision you made for pytype. Does it still emit an error in this case, or did you decide to remove the check based on this discussion?

-Eric

Interesting. Can you give (or link to) some examples of the misunderstandings? I’d like to understand what might lead them to that (other than cargo-culting).

--Guido (mobile)

Eric, pytype still emits an error for a TypeVar that appears only once in a function signature.

If it helps, misunderstandings I've seen a lot are:

- Using a TypeVar instead of a Union (especially common with typing.AnyStr)
- Using a TypeVar to mean "subclass of", e.g.:

      BaseClassVar = TypeVar("BaseClassVar", bound=BaseClass)
      def f(x: BaseClassVar): ...
      # the intention is that this means "a subclass of BaseClass but not BaseClass itself".

  I still have not figured out where this idea came from.
- Using a TypeVar in a class without inheriting from Generic.

Best,
Rebecca
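For the second and third misunderstandings listed above, a hedged sketch of what the conventional spelling looks like. `BaseClass` and `BaseClassVar` follow the message; `Child`, `describe`, `identity`, and `Stack` are hypothetical names introduced only for illustration.

```python
from typing import Generic, List, TypeVar


class BaseClass:
    def name(self) -> str:
        return type(self).__name__


class Child(BaseClass):
    pass


def describe(x: BaseClass) -> str:
    # A plain annotation already accepts BaseClass and every subclass;
    # no TypeVar is needed to say "subclass of".
    return x.name()


BaseClassVar = TypeVar("BaseClassVar", bound=BaseClass)


def identity(x: BaseClassVar) -> BaseClassVar:
    # A bound TypeVar is useful when it links two positions, here the
    # parameter and the return type.
    return x


T = TypeVar("T")


class Stack(Generic[T]):
    # Inheriting from Generic[T] is what makes T a class-level parameter.
    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()


print(describe(Child()))
stack: Stack[int] = Stack()
stack.push(1)
print(stack.pop())
```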

Thanks Rebecca (and Eric for pointing me privately towards an example in the wild).

This is clearly a deep issue, and I'm still not sure what to think of it. I agree that typevars are fairly often mistaken for unions (AnyStr especially), and also for type aliases. We ought to warn users against that.

It's very subtle, e.g. I thought that AnyStr would be equivalent to Union[str, bytes], but it's not:
```
from typing import AnyStr, Union

def f(a: AnyStr): ...

def g(a: Union[str, bytes]):
    f(a)
```
produces an error in mypy:
```
Value of type variable "AnyStr" of "f" cannot be "Union[str, bytes]"  [type-var]
```
Pyright finds no fault with it currently, and I think technically that's better. (Replacing AnyStr with A defined here:
```
A = TypeVar("A", bound=Union[str, bytes])
```
makes that code valid in mypy too, BTW.)

But making users believe that a typevar is like a union or type alias just causes disappointments when they start using *two* typevars:
```
def f(a1: AnyStr, a2: AnyStr): ...

f("", b"")  # Error!
```
(Pyright doesn't flag this as an error. I don't see how `f("", b"")` would solve the typevar though.)

I know I previously argued against strictness here, and I think that technically there is nothing wrong with such code, but it seems we should at least educate users better, maybe at least with a warning?

--Guido
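The distinction drawn above between `AnyStr` and a TypeVar bound to a Union is also visible at runtime through the TypeVar attributes `__constraints__` and `__bound__`. This is only an introspection sketch and says nothing about how a given checker solves the typevar; `A` mirrors the definition in the message.

```python
from typing import AnyStr, TypeVar, Union

A = TypeVar("A", bound=Union[str, bytes])

# AnyStr is a *constrained* TypeVar: it has to solve to exactly one of
# its constraints, so Union[str, bytes] itself is not a valid solution.
print(AnyStr.__constraints__)
print(AnyStr.__bound__)

# A is a *bound* TypeVar: any subtype of the bound is a valid solution,
# including the Union itself.
print(A.__constraints__)
print(A.__bound__)
```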
participants (4)
- Eric Traut
- Guido van Rossum
- Rebecca Chen
- Richard Levasseur