On Friday, September 6, 2019, 1:51:35 AM PDT, Steven D'Aprano <steve@pearwood.info> wrote:

> On Thu, Sep 05, 2019 at 05:41:50PM -0700, Andrew Barnert wrote:

>> Are runtime union types actually types, unlike the things in typing,
>> or are they still non-type values that just have special handling as
>> the second argument of isinstance and issubclass and maybe except
>> statements?

> Union, and unions, are currently types:

> py> isinstance(Union, type)
> True

> py> isinstance(Union[int, str], type)
> True

What version of Python are you using here? A python.org 3.7 install on my laptop, a fresh build of master (3.9.0a0) on my laptop, 3.6 on Pythonista, and 3.7 on repl.it (https://repl.it/repls/CriticalDismalProgrammer) all give me `False`. And, looking at the source to typing.py (https://github.com/python/cpython/blob/master/Lib/typing.py#L433) on either master or 3.7, I can't see how it _could_ return `True`.
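For anyone who wants to repeat the check on their own interpreter, here is a minimal sketch. The variable names are mine, and I'm deliberately not hard-coding what `isinstance(Union, type)` gives, since that is exactly the point in dispute and may differ between versions.

```python
import sys
from typing import Union

# Report which interpreter produced the result, since the answer
# appears to be version-dependent.
version = sys.version_info[:2]

# Is the bare Union object itself a class?
union_is_type = isinstance(Union, type)

# Is a subscripted union such as Union[int, str] a class?
subscripted_is_type = isinstance(Union[int, str], type)

print(version, union_is_type, subscripted_is_type)
```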

> and I don't think that should change. I don't think you should be able 
> to instantiate a Union (that's the current behaviour too). A Union of
> two types is not the same as inheriting from both types.

Of course it isn't. That would be the opposite of a union. `Union[int, str]` is a type that can hold any value that's _either_ an int or a str. A type that can hold any value that's _both_ an int and a str would be an intersection. An intersection type would be a subtype of all of its types (but `Intersection[int, str]` still wouldn't be the same thing as `class IntStr(int, str): pass`), but a union type is not a subtype of any of its types.

(Of course there are no values that are both an int and a str, but with more protocol-y types, intersections are useful: plenty of things are both an Iterable and a Container, for example.)
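To make the either/both distinction concrete, here is a rough sketch using plain `isinstance` checks. The helpers `in_union` and `in_intersection` are hypothetical names invented for illustration; nothing like them exists in `typing`.

```python
from collections.abc import Container, Iterable


def in_union(value, *types):
    """True if value belongs to at least one of the types (either/or)."""
    return any(isinstance(value, t) for t in types)


def in_intersection(value, *types):
    """True if value belongs to every one of the types (both/and)."""
    return all(isinstance(value, t) for t in types)


# "abc" is an int-or-str, but nothing is an int-and-str.
str_in_union = in_union("abc", int, str)
str_in_intersection = in_intersection("abc", int, str)

# A list is both Iterable and Container; a generator is only Iterable.
list_in_both = in_intersection([1, 2], Iterable, Container)
gen_in_both = in_intersection((x for x in range(3)), Iterable, Container)

print(str_in_union, str_in_intersection, list_in_both, gen_in_both)
```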

>> I’d expect issubclass(int|str, int|str|bytes) to be true, and
>> issubclass(int|str, int) to be false, not for both of them to raise
>> exceptions about the first argument not being a type.

> Currently, issubclass accepts unions without raising, and that shouldn't
> change either.

Again, what version are you using? Every version I try says it's a TypeError; again see repl.it 3.7 (TepidPowerlessSignature) for an example.
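Here is roughly the test I'm running, wrapped so that it reports the outcome instead of stopping with a traceback. `try_issubclass` is just a throwaway helper of mine, and the exact error message may well differ between versions.

```python
from typing import Union


def try_issubclass(sub, sup):
    """Return the issubclass result, or a description of the TypeError raised."""
    try:
        return issubclass(sub, sup)
    except TypeError as exc:
        return "TypeError: {}".format(exc)


result = try_issubclass(Union[int, str], Union[int, str, bytes])
print(result)
```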

> But I disagree that int|str is a subclass of int|str|bytes. There's no 
> subclass relationship between the two: the (int|str).__bases__ won't
> include (int|str|bytes), 

First, `issubclass` is about subtyping, not about declared inheritance. That's why we not only have, but extensively use, subclass hooks to provide subclass relationships entirely based on structure (like `Iterable`) or on registration (like `Sequence`):

    >>> import collections.abc
    >>> issubclass(list, collections.abc.Sequence)
    True
    >>> issubclass(list, collections.abc.Iterable)
    True
    >>> list.__bases__
    (object,)

If `issubclass` were about inheritance rather than subtyping, `list` would have to inherit from a half dozen types in `collections.abc`, plus a half-dozen near-identical types in `typing` plus at least one more. Fortunately, it doesn't have to inherit from any of them, and doesn't, but Python can still recognize that it's a subtype of all of them.
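The same thing works for user-defined classes, which may make the point clearer. In this sketch `Countdown` and `Deck` are made-up classes: the first is picked up structurally by `Iterable`'s subclass hook, the second is explicitly registered with `Sequence`, and neither inherits from anything but `object`.

```python
from collections.abc import Iterable, Sequence


class Countdown:
    """Defines __iter__ and nothing else; inherits only from object."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


class Deck:
    """Sequence-like class that is registered rather than derived."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __getitem__(self, index):
        return self._cards[index]

    def __len__(self):
        return len(self._cards)


# Registration creates a subclass relationship without inheritance.
Sequence.register(Deck)

structural = issubclass(Countdown, Iterable)  # via __subclasshook__
registered = issubclass(Deck, Sequence)       # via register()

print(structural, registered, Countdown.__bases__, Deck.__bases__)
```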

And this is fundamental to the design of the static typing system, just as it is the dynamic typing system. If you declare a function to take a `typing.Sequence` argument, and you pass it a `list` value, it type-checks successfully.

So, static type checkers consider `Union[int, str]` to be a subtype of `Union[int, str, bytes]`. As they should. The dynamic checker, `issubclass`, currently refuses to do that test (except maybe on your machine?). But if it does do the test, why shouldn't it give the same answer as the static checker? (Of course there _are_ cases where the dynamic and static type systems should, or even must, use different rules, but all those cases are for some specific reason. If you think there is such a reason here, you should be able to say what it is.)

> and instances of int|str don't inherit from 
> all three of int, str, bytes.

And again, you're confusing intersection and union. 

In fact, `int|str` doesn't inherit from _any_ of those three types, and, more importantly, it isn't a subtype of any of them. If `int|str` were a subtype of `int`, that would mean that every valid `int|str` value was also a valid `int` value. Which obviously isn't true; `"abc"` is an `int|str`, and it isn't an `int`.

But `int|str` _is_ a subtype of `int|str|bytes`. Every `int|str` value is an `int|str|bytes` value.

You can double-check with the applicability rule of thumb: If I have some value of type `int|str` and I try to use it with some code that requires an `int`, can it raise a `TypeError`? Yes; if the value is `"abc"`. What if I try to use it with some code that requires an `int|str|bytes`? No, it cannot raise a `TypeError`. You can do further checks with the LSP and other rules of thumb if you want, but they all point the same way (as, again, the static type system already recognizes).
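A small sketch of that rule of thumb, with two hypothetical functions standing in for "code that requires an `int`" and "code that requires an `int|str|bytes`". I'm spelling the union as `typing.Union` so it runs on current versions.

```python
from typing import Union


def needs_int(x: int) -> int:
    """Stands in for code that only works on ints."""
    return x + 1


def needs_int_str_bytes(x: Union[int, str, bytes]) -> str:
    """Stands in for code that works on any of the three types."""
    return repr(x)


# Every element here is a valid Union[int, str] value.
values = [1, "abc"]

failed_as_int = []
failed_as_wider_union = []
for v in values:
    try:
        needs_int(v)
    except TypeError:
        failed_as_int.append(v)
    try:
        needs_int_str_bytes(v)
    except TypeError:
        failed_as_wider_union.append(v)

print(failed_as_int, failed_as_wider_union)
```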

Or, think about types as sets of values. The type `int` is literally just the infinite set of all possible `int` values. The type `int|str` is the union of the two sets `int` and `str`. The type `int|str|bytes` is the union of the three sets `int`, `str`, and `bytes`. And `issubclass` is just the subset relationship. So, is `int U str` a subset of `int U str U bytes`? Of course it is.
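Here is one way the set view could be turned into a runtime check. To be clear, `members` and `union_issubclass` are hypothetical helpers of mine, not anything in the proposal or in `typing`, and the sketch assumes `typing.get_origin` and `typing.get_args` are available (3.8 or later). It only handles flat unions of ordinary classes.

```python
from typing import Union, get_args, get_origin


def members(tp):
    """Return the set of member types: a union's arguments, or {tp} itself."""
    if get_origin(tp) is Union:
        return frozenset(get_args(tp))
    return frozenset({tp})


def union_issubclass(sub, sup):
    """True if every member of sub is a subclass of some member of sup."""
    return all(
        any(issubclass(m, n) for n in members(sup))
        for m in members(sub)
    )


narrower_in_wider = union_issubclass(Union[int, str], Union[int, str, bytes])
union_in_int = union_issubclass(Union[int, str], int)
bool_in_union = union_issubclass(bool, Union[int, str])

print(narrower_in_wider, union_in_int, bool_in_union)
```

Checking each member with `issubclass` rather than comparing the sets directly means `bool` counts as part of `int|str`, which matches what ordinary `issubclass(bool, int)` already says.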

>> Finally, do we still need the existing Generic, typing.Union, at all?

> typing.Union will probably just become an alias to the (proposed)
> built-in types.Union.

If `typing.Union` and `typing.Union[int, str]` actually are types as you say, then there's probably no harm in replacing the existing `typing.Union` with the new `types.UnionType`. But if they aren't, as testing and reading the code seem to show, then I'm less sure that it's harmless. Which is why I asked.

