Would it make sense to move away from making `Tuple` mean inhomogeneous in Python?

For example, lists can be inhomogeneous, except in typed Python, where a mixed list's element type is joined to a common supertype.

    my_list = [1, 'a', None]
    reveal_type(my_list)  # builtins.list[builtins.object*]

    my_tuple = (1, 'a', None)
    reveal_type(my_tuple)  # Tuple[builtins.int, builtins.str, None]

Then when I see `args_to_lists` in PEP 646, I wonder how I'm meant to build this tuple so that a type checker doesn't error. Currently both `map` and list comprehensions produce errors.

    def args_to_lists(args: Tuple[int, str]) -> Tuple[List[int], List[str]]:
        return tuple(map(lambda a: [a], args))  # Incompatible return value type (got "Tuple[List[object], ...]", expected "Tuple[List[int], List[str]]")

    def args_to_lists(args: Tuple[int, str]) -> Tuple[List[int], List[str]]:
        return tuple([[a] for a in args])       # Incompatible return value type (got "Tuple[List[object], ...]", expected "Tuple[List[int], List[str]]")
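A workaround that does type-check today (my own sketch, not anything from the PEP) is to unpack the tuple and rebuild it element by element, so the checker can track each element's type rather than widening everything to `object`:

```python
from typing import List, Tuple

def args_to_lists(args: Tuple[int, str]) -> Tuple[List[int], List[str]]:
    # Unpacking gives the checker the individual element types (int, str),
    # and a tuple display keeps the result fixed-length, so no cast is needed.
    a, b = args
    return ([a], [b])

print(args_to_lists((1, 'a')))  # ([1], ['a'])
```

Of course, this doesn't generalise to arbitrary arity, which is exactly the gap `Map` is meant to fill.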

This leaves me with two questions:

-   What should `map`'s type be to be able to match `Map`?
-   Is this going to promote calling `tuple` just to silence the type checker?

Could adding an `Inhomogeneous` / `Heterogeneous` 'type', in the future, be a sensible solution?

    my_list = [1, 'a', None]
    reveal_type(my_list)  # builtins.list[typing.Inhomogeneous[builtins.int, builtins.str, None]]

    my_tuple = (1, 'a', None)
    reveal_type(my_tuple)  # Tuple[builtins.int, builtins.str, None]
                           # This could be sugar for
                           # Tuple[Inhomogeneous[int, str, None]]

    def args_to_lists(args: Tuple[int, str]) -> Iterator[Inhomogeneous[List[int], List[str]]]:
        return map(lambda a: [a], args)

    def args_to_lists(*args: *Ts) -> Iterator[Map[List, Ts]]:
        return map(lambda a: [a], args)

    def args_to_lists(args: Tuple[int, str]) -> List[Inhomogeneous[List[int], List[str]]]:
        return [[a] for a in args]

    def args_to_lists(*args: *Ts) -> List[Map[List, Ts]]:
        return [[a] for a in args]

    def args_to_lists(*args: *Ts) -> Map[List, Ts]:  # Could assume `Tuple[Map[...]]` if no type is specified.
        return tuple([[a] for a in args])

To my naive eyes this fits, and it also shows how `TypeVar(bound=Tuple)` may not make sense in the future. Would `List[Ts]` mean `List[List[int], List[str]]` or `List[Tuple[List[int], List[str]]]`? It would be strange for it to mean the former, but then how else could you express the former? However, suppose we do decide to add `Inhomogeneous` in the future: what would stop us from just switching over to `TypeVar(bound=Inhomogeneous)`?

Additionally will the name `TypeVarTuple` lead to people conflating inhomogeneity with tuples? At this time it makes sense because tuple is the only inhomogeneous type in Python. But will this always be the case? If not would it make sense to add a `TypeVarX` for the new type(s), `X`? Or will we only have `TypeVarTuple` for all inhomogeneous types?

Note: To be clear, I don't think `Inhomogeneous` should be added to PEP 646. It would add a lot of complexity around mutability. How would `list.remove` work? Should it result in a type error? How could you specify which methods work with `Inhomogeneous`? And so on.

On 12/01/2021 21:00, Matthew Rahtz via Typing-sig wrote:
**Warning: long stream-of-consciousness email. Tl; dr: `TypeVar(bound=Tuple)` is extremely elegant, but might limit our options later on because of how the variance of `Tuple` works. I'm still leaning towards `TypeVarTuple`, but only just.**

Guido had a really interesting suggestion in the last tensor typing meeting: what if, instead of creating a new constructor `TypeVarTuple`, we just did `TypeVar(bound=Tuple)`?

And reordering some examples in the PEP, it's just fully hit me that this would actually make a whole bunch of sense. The simplest example I could come up with to introduce usage of `TypeVarTuple` was:

    def identity(x: Ts) -> Ts: ...

    x: Tuple[int, str]
    y = identity(x)
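For reference, here is that example as a self-contained runnable sketch, with `Ts` declared as a plain `TypeVar` bound to `tuple` (the approach under discussion):

```python
from typing import Tuple, TypeVar

# A regular TypeVar, just bound to tuple - no new TypeVarTuple constructor.
Ts = TypeVar('Ts', bound=tuple)

def identity(x: Ts) -> Ts:
    return x

x: Tuple[int, str] = (1, 'a')
y = identity(x)  # inferred as Tuple[int, str], not just tuple
```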

But I was debating with myself: "OK, but we're going to have to be explicit about this being an example for illustrative purposes only - because otherwise, readers are going to be wondering, 'Why not just use a regular `TypeVar` for `Ts`?'"

And that's the whole point! `Ts` probably _should_ just be a regular `TypeVar`. The only thing special about the example is that we're assuming it's a `TypeVar` that definitely does contain some other types, such that we can potentially use `Unpack` or `Map` later on.

Looking back through the conversation we had about this last time, I think the main doubt we had about reusing a regular `TypeVar` was that it could be confusing if the variadic nature of a particular instance was only _implied_ rather than marked specifically. But if we _are_ being explicit by using `bound=Tuple`, I personally feel a lot better about it.

Thinking out loud: one consideration is how this would interact with other arguments to `TypeVar`. For example, suppose we wanted to set up `Tensor` so that things worked like this:

    class Tensor(Generic[*Shape]): ...

    class Batch: pass
    class TrainBatch(Batch): pass
    class TestBatch(Batch): pass
    class Time: pass

    def only_accepts_batch(t: Tensor[Batch]): ...

    t1: Tensor[TrainBatch]
    only_accepts_batch(t1)  # Valid
    t2: Tensor[Time]
    only_accepts_batch(t2)  # Error

This would work if `Shape` were somehow set up to be covariant, such that since `TrainBatch` is a subclass of `Batch`, `Tensor[TrainBatch]` would be considered a subclass of `Tensor[Batch]`.
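For comparison, the single-axis version of this behaviour is already expressible with an ordinary covariant `TypeVar`. A minimal sketch (one shape dimension only, so no variadics are involved):

```python
from typing import Generic, TypeVar

class Batch: pass
class TrainBatch(Batch): pass

# Covariant, so Tensor[TrainBatch] is a subtype of Tensor[Batch].
DimT = TypeVar('DimT', bound=Batch, covariant=True)

class Tensor(Generic[DimT]):
    pass

def only_accepts_batch(t: Tensor[Batch]) -> None:
    pass

t1: Tensor[TrainBatch] = Tensor()
only_accepts_batch(t1)  # accepted, thanks to covariance
```

The open question is how to get the same effect once `Shape` stands for a whole sequence of such axes.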

How would this work if we reused `TypeVar`? It seems like it wouldn't work: if we did `Shape = TypeVar('Shape', bound=Tuple, covariant=True)`...well, the question is: is `Tuple[TrainBatch]` a subclass of `Tuple[Batch]`? That is, is `Tuple` covariant?

Hmm, I actually can't find solid information on this in PEP 483 or 484. https://github.com/python/typing/issues/2 suggests the answer is 'yes', but I guess this is only true in the `Tuple[SomeType, ...]` form; a `Tuple[Child]` doesn't seem like it should automatically be a subclass of `Tuple[Parent]`.
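As a data point: because tuples are immutable, a checker can soundly treat fixed-length tuples covariantly, and mypy (at least, at time of writing) does accept the following without error. The code runs fine at runtime; the interesting part is the checker's verdict on the call:

```python
from typing import Tuple

class Parent: pass
class Child(Parent): pass

def takes_parents(pair: Tuple[Parent, Parent]) -> None:
    pass

# Tuples are immutable, so covariance is sound here: mypy reports no error
# for passing a Tuple[Child, Child] where Tuple[Parent, Parent] is expected.
takes_parents((Child(), Child()))
```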

We _could_ set this up so that, if a variadic type variable is unpacked, the types we check aren't `Tuple[Batch]` and `Tuple[TrainBatch]` but `Batch` and `TrainBatch` directly. But that would create an inconsistency: this wouldn't be possible if, for some reason, the user wished to use the variadic type variable without unpacking - a use case that feels like it should have equal rights to using a variadic type variable _with_ unpacking.

Another option would be to change the behaviour of `covariant=True` and `contravariant=True` when `bound=Tuple`. That seems less than ideal; it might not be backwards-compatible.

A third option would be to introduce an extra argument to `TypeVar` which explicitly changed the behaviour - e.g. `tuplevariance=True`. But it would have to only be valid with `bound=Tuple`, so in that case we may as well just go back to `TypeVar('Shape', tuple=True)`.

So far this was all about variance. But also, what about `bound`? What if we wanted to do:

    class Batch: pass

    BatchShape = TypeVar('BatchShape', bound=Tuple[Batch])

    class BatchTensor(Generic[BatchShape]): ...

    t1: BatchTensor[Batch]  # Valid
    t2: BatchTensor[Time]   # Error

I guess this one would also hinge on whether `Tuple[TrainBatch]` were considered a subclass of `Tuple[Batch]`. Hmm.

To zoom out, though: on the other hand, we could also argue, "Let's not tie ourselves in knots about hypothetical future features. Let's choose the approach in the present which seems simplest and most elegant. Let's not overcomplicate the solution for the sake of all the things we might hypothetically want to do in the future."

I guess the crux of that debate would be: how likely is it that the features I've sketched above are going to be ones we want? I'm not sure about that yet...

So far I'm still leaning slightly towards creating a new constructor, in order to leave our options open later on. But I do feel mighty conflicted - the elegance of `TypeVar(bound=Tuple)` is undeniable. I'll mull this over a bit more.


On Fri, 1 Jan 2021 at 15:58, David Foster <davidfstr@gmail.com> wrote:
On 12/29/20 12:24 PM, S Pradeep Kumar wrote:
> @Alfonso:
> That seems like an interesting and feasible idea!
> Overall, I'd like to defer this for now since it would be fully
> backward-compatible. It probably belongs in the future type arithmetic PEP.

Agreed. The existing PEP is already fairly complex.

On 12/29/20 12:24 PM, S Pradeep Kumar wrote:
>  > One significant change I've (tentatively) made is renaming `Expand`
> to `Unpack`, to reflect the terminology we use with regular tuples. I'm
> surprised no one else has suggested this, so I might be missing
> something - are there any arguments against calling what we're doing
> 'unpacking'?
> I think that's reasonable.

It looks like there is precedent for use of the term "unpacking" in the
existing Python documentation:

- "iterable unpacking", when talking about *args
  (https://docs.python.org/3/reference/expressions.html#expression-lists)
- "unpacking argument lists", when talking about *args
- "dictionary unpacking", when talking about **kwargs
  (https://docs.python.org/3/reference/expressions.html#dictionary-displays)

So even though I personally still like Expand, I agree that Unpack would
probably be more consistent with existing documentation (and search
engine keywords).

David Foster | Seattle, WA, USA
Contributor to TypedDict support for mypy
Typing-sig mailing list -- typing-sig@python.org
To unsubscribe send an email to typing-sig-leave@python.org
Member address: mrahtz@google.com
