On Wed, Dec 24, 2014 at 4:50 PM, Eugene Toder <eltoder@gmail.com> wrote:
Guido van Rossum <guido@...> writes:
> [...] https://quip.com/r69HA9GhGa7J

(I apologize in advance if some of this was covered in previous
discussions).

No problem. :-) I apologize for reformatting the text I am quoting from you; it looked as if it had been sent through two different line clipping functions.
 
1. Since there's the Union type, it's also natural to have the Intersection
type. A class is a subclass of Intersection[t1, t2, ...] if it's a subclass
of all t1, t2 etc. There are 2 main uses of the Intersection type:

a) Require that an argument implements multiple interfaces:

class Foo:
    @abstractmethod
    def foo(self): ...

class Bar:
    @abstractmethod
    def bar(self): ...

def fooItWithABar(obj: Intersection[Foo, Bar]): ...

Yes, we even have an issue to track this proposal. I don't recall who suggested it first. I don't know if it poses any problems to the static checker (though I doubt it). https://github.com/ambv/typehinting/issues/18
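
To make use (a) concrete, here is a rough sketch. Intersection is only a proposal at this point and is not something you can import, so the parameter is left unannotated and the "subclass of all of t1, t2" requirement is checked at runtime instead. FooBar, OnlyFoo and the return value are hypothetical names made up for the illustration.

```python
from abc import ABC, abstractmethod


class Foo(ABC):
    @abstractmethod
    def foo(self): ...


class Bar(ABC):
    @abstractmethod
    def bar(self): ...


class FooBar(Foo, Bar):
    # Hypothetical class implementing both interfaces.
    def foo(self):
        return "foo"

    def bar(self):
        return "bar"


class OnlyFoo(Foo):
    # Hypothetical class implementing just one of the two interfaces.
    def foo(self):
        return "foo"


def fooItWithABar(obj):
    # Runtime stand-in for the proposed annotation Intersection[Foo, Bar]:
    # obj has to be an instance of every listed class.
    if not (isinstance(obj, Foo) and isinstance(obj, Bar)):
        raise TypeError("obj must implement both Foo and Bar")
    return obj.foo() + obj.bar()
```

Passing FooBar() is accepted, while OnlyFoo() is rejected, which is the distinction a static checker would be expected to draw from the annotation.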
 
b) Write the type of an overloaded function:

@overload
def foo(x: str) -> str: ...
@overload
def foo(x: bytes) -> bytes: ...

foo # type: Intersection[Callable[[str], str], Callable[[bytes], bytes]]

The static checker can figure that out for itself, but that doesn't mean we necessarily need a way to spell it.
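
For reference, a minimal sketch of such an overloaded function as a runnable module, assuming the overload decorator from the typing module as it exists today. The doubling behavior of the implementation is made up for the example; the point is only that the two stubs are all the checker needs, with no explicit spelling of the combined type.

```python
from typing import overload


@overload
def foo(x: str) -> str: ...
@overload
def foo(x: bytes) -> bytes: ...


def foo(x):
    # Single runtime implementation. The @overload stubs above are only
    # read by the static checker, which derives the combined signature
    # (str -> str and bytes -> bytes) from them.
    return x * 2
```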
 
2. Constrained type variables (Var('Y', t1, t2, ...)) have a very unusual
behavior.

a) "subclasses of t1 etc. are replaced by the most-derived base class among t1 etc."
This defeats the very common use of constrained type variables: have a type
preserving function limited to classes inherited from a common base. E.g. say
we have a function:

def relocate(e: Employee) -> Employee: ...

The function actually always returns an object of the same type as the
argument, so we want to write a more precise type. We usually do it like
this:

XEmployee = Var('XEmployee', Employee)

def relocate(e: XEmployee) -> XEmployee: ...

This won't work with the definition from the proposal.

I just copied this from mypy (where it is called typevar()). I guess in that example one would use an *unconstrained* type variable. The use case for the behavior I described is AnyStr -- if I have a function like this I don't want the type checker to assume the more precise type:

def space_for(s: AnyStr) -> AnyStr:
    if isinstance(s, str): return ' '
    else: return b' '
 
If someone defined a class MyStr(str), we don't want the type checker to think that space_for(MyStr(...)) returns a MyStr instance, and it would be impossible for the function to even create an instance of the proper subclass of str (it can get the class object, but it can't know the constructor signature).

For strings, functions like this (which return some new string of the same type as the argument, constrained to either str or bytes) are certainly common. And for your Employee example it would also seem problematic for the function to know how to construct an instance of the proper (dynamically known) subclass.
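
A small runnable sketch of the MyStr case, assuming AnyStr as provided by the typing module. MyStr is the hypothetical subclass from the paragraph above.

```python
from typing import AnyStr


class MyStr(str):
    """Hypothetical str subclass, as in the text above."""


def space_for(s: AnyStr) -> AnyStr:
    if isinstance(s, str):
        return ' '
    else:
        return b' '


result = space_for(MyStr('hello'))
print(type(result).__name__)  # str
```

The function hands back a plain str, not a MyStr, so a checker that inferred MyStr as the return type would be wrong about what actually happens at runtime.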

b) Multiple constraints introduce an implicit Union. I'd argue that type
variables are more commonly constrained by an Intersection rather than a
Union. So it will be more useful if given this definition Y has to be
compatible with all of t1, t2 etc, rather than just one of them.
Alternatively, this can be always spelled out explicitly:
    Y1 = Var('Y1', Union[t1, t2, ...])
    Y2 = Var('Y2', Intersection[t1, t2, ...])

Well, maybe. At this point you'd have to point us to a large body of evidence -- mypy has done well so far with its current definition of typevar().

OTOH one of the ideas on the table is to add keyword options to Var(), which might make it possible to have type variables with different semantics. There are other use cases, some of which are discussed in the tracker: https://github.com/ambv/typehinting/issues/18
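
As one possible shape for such a keyword option, here is a sketch using the bound= keyword that typing.TypeVar accepts today. That is an assumption relative to this thread, where Var() has no such option. Employee, Manager and the body of relocate are hypothetical.

```python
from typing import TypeVar


class Employee:
    def __init__(self, city: str) -> None:
        self.city = city


class Manager(Employee):
    pass


# "Any subclass of Employee, preserved as-is", in contrast to a constrained
# variable, which would be replaced by the listed base class.
XEmployee = TypeVar('XEmployee', bound=Employee)


def relocate(e: XEmployee) -> XEmployee:
    # Mutates and returns the same object, so the precise subclass survives
    # without the function having to construct a new instance.
    e.city = 'Elsewhere'
    return e
```

Because relocate returns the object it was given, it sidesteps the problem of not knowing the subclass's constructor signature.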
 
Pragmatics:

3. The names Union and Intersection are standard terminology in type checking,
but may not be familiar to many Python users. Names like AnyOf[] and AllOf[]
can be more intuitive.

I strongly disagree with this. Python's predecessor, ABC, used a number of non-standard terms for common programming language concepts, for similar reasons. But the net effect was just that it looked weird to anyone familiar with other languages, and for the users who were a completely blank slate, well, "HOW-TO" was just as much jargon that they had to learn as "procedure". Also, the Python users who will most likely need to learn about this stuff are most likely library developers.
 
4. Similar to allowing None to mean type(None) it's nice to have shortcuts
like:
    (t1, t2, ...) == Tuple[t1, t2, ...]
    [t1] == List[t1]
    {t1: t2} == Dict[t1, t2]
    {t1} == Set[t1]

The last 3 can be Sequence[t1], Mapping[t1, t2] and collections.Set[t1] if we
want to encourage the use of abstract types.

This was proposed as the primary notation during the previous round of discussions here. You are right that if we propose to "fix up" type annotations that appear together with a default value we should also be able in principle to change these shortcuts into the proper generic type objects. Yet I am hesitant to adopt the suggestion -- people may already be using e.g. dictionaries as annotations for some other purpose, and there is the question you bring up whether we should promote these to concrete or abstract collection types.

Also, I should note that, while I mentioned it as a possibility, I am hesitant to endorse the shortcut of "arg: t1 = None" as a shorthand for "arg: Union[t1, None] = None" because it's unclear whether runtime introspection of the __annotations__ object should return t1 or the inferred Union object. (The unspoken requirement here is that there will be no changes to CPython's handling of annotations -- the typing.py module will be all that is needed, and it can be backported to older Python versions.)
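
The introspection question can be seen directly. In this sketch f and g are hypothetical functions; without any fix-up, __annotations__ holds exactly what was written in the signature.

```python
from typing import Union


def f(arg: int = None): ...


def g(arg: Union[int, None] = None): ...


# The raw mapping stores the annotation as written: plain int for f,
# the explicit Union object for g.
print(f.__annotations__)
print(g.__annotations__)
```

So if the shorthand were adopted, something other than CPython's annotation handling would have to turn the int stored for f into the Union stored for g.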
 
5. Using strings for forward references can be messy in case of generics:
parsing of brackets etc in the string will be needed. I propose explicit
forward declarations:

C = Declare('C')
class C(Generic[X]):
    def foo(self, other: C[X]): ...
    def bar(self, other: C[Y]): ...

Agreed this is an area that needs more thought. In mypy you can actually write the entire annotation in string quotes -- mypy has to be able to parse type expressions anyway (in fact it has to be able to parse all of Python :-). I do think that the example you present feels rather obscure.
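
For comparison, a sketch of the string-quoted form for the generic case, assuming the get_type_hints helper from today's typing module to resolve the string. The whole annotation, brackets included, goes inside the quotes.

```python
from typing import Generic, TypeVar, get_type_hints

X = TypeVar('X')


class C(Generic[X]):
    # C is not yet bound while the class body executes, so the entire
    # annotation is written as a string and resolved later.
    def foo(self, other: 'C[X]') -> None: ...


# Evaluates the string against the function's module globals, where both
# C and X are defined by now.
hints = get_type_hints(C.foo)
print(hints['other'])
```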
 
6. On the other hand, using strings for unconstrained type variables is quite
handy, and doesn't share the same problem:

def head(xs: List['T']) -> 'T': ...

Yeah, it does look quite handy, if the ambiguity with forward references can be resolved. Also it's no big deal to have to declare a type variable -- you can reuse them for all subsequent function definitions, and you usually don't need more than two or three.
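
A sketch of the declared-variable alternative, assuming TypeVar from the typing module. The function last is a hypothetical second function added only to show the reuse.

```python
from typing import List, TypeVar

# Declared once, then reused by every function that needs it.
T = TypeVar('T')


def head(xs: List[T]) -> T:
    return xs[0]


def last(xs: List[T]) -> T:
    return xs[-1]
```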

--
--Guido van Rossum (python.org/~guido)