[Python-ideas] Type Hinting Kick-off

Guido van Rossum guido at python.org
Thu Dec 25 05:16:52 CET 2014


On Wed, Dec 24, 2014 at 4:50 PM, Eugene Toder <eltoder at gmail.com> wrote:

> Guido van Rossum <guido at ...> writes:
> > [...] https://quip.com/r69HA9GhGa7J
>
> (I apologize in advance if some of this was covered in previous
> discussions).
>

No problem. :-) I apologize for reformatting the text I am quoting from
you, it looked as if it was sent through two different line clipping
functions.


> 1. Since there's the Union type, it's also natural to have the Intersection
> type. A class is a subclass of Intersection[t1, t2, ...] if it's a subclass
> of all t1, t2 etc. There are 2 main uses of the Intersection type:
>
> a) Require that an argument implements multiple interfaces:
>
> class Foo:
>     @abstractmethod
>     def foo(self): ...
>
> class Bar:
>     @abstractmethod
>     def bar(self): ...
>
> def fooItWithABar(obj: Intersection[Foo, Bar]): ...
>

Yes, we even have an issue to track this proposal. I don't recall who
suggested it first. I don't know if it poses any problems for the static
checker (though I doubt it). https://github.com/ambv/typehinting/issues/18
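
For what it's worth, here is a rough sketch of how the "implements multiple
interfaces" requirement can be approximated at runtime. Intersection[] is
only a proposal at this point, so the check is spelled out by hand; FooBar,
OnlyFoo and implements_foo_and_bar() are hypothetical names made up for the
illustration.

```python
from abc import ABC, abstractmethod


class Foo(ABC):
    @abstractmethod
    def foo(self): ...


class Bar(ABC):
    @abstractmethod
    def bar(self): ...


class FooBar(Foo, Bar):
    """Implements both interfaces."""

    def foo(self):
        return "foo"

    def bar(self):
        return "bar"


class OnlyFoo(Foo):
    """Implements just one of the two interfaces."""

    def foo(self):
        return "foo"


def implements_foo_and_bar(obj) -> bool:
    # Hand-written stand-in for the proposed Intersection[Foo, Bar]:
    # the object must be an instance of every listed class.
    return isinstance(obj, Foo) and isinstance(obj, Bar)
```

A static checker would presumably apply the same "all of them" rule to the
declared type of the argument instead of inspecting an instance.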


> b) Write the type of an overloaded function:
>
> @overload
> def foo(x: str) -> str: ...
> @overload
> def foo(x: bytes) -> bytes: ...
>
> foo # type: Intersection[Callable[[str], str], Callable[[bytes], bytes]]
>

The static checker can figure that out for itself, but that doesn't mean we
necessarily need a way to spell it.


> 2. Constrained type variables (Var('Y', t1, t2, ...)) have a very unusual
> behavior.
>
> a) "subclasses of t1 etc. are replaced by the most-derived base class
> among t1 etc."
> This defeats the very common use of constrained type variables: have a type
> preserving function limited to classes inherited from a common base. E.g.
> say we have a function:
>
> def relocate(e: Employee) -> Employee: ...
>
> The function actually always returns an object of the same type as the
> argument, so we want to write a more precise type. We usually do it like
> this:
>
> XEmployee = Var('XEmployee', Employee)
>
> def relocate(e: XEmployee) -> XEmployee: ...
>
> This won't work with the definition from the proposal.
>

I just copied this from mypy (where it is called typevar()). I guess in
that example one would use an *unconstrained* type variable. The use case
for the behavior I described is AnyStr -- if I have a function like this I
don't want the type checker to assume the more precise type:

def space_for(s: AnyStr) -> AnyStr:
    if isinstance(s, str): return ' '
    else: return b' '

If someone defined a class MyStr(str), we don't want the type checker to
think that space_for(MyStr(...)) returns a MyStr instance, and it would be
impossible for the function to even create an instance of the proper
subclass of str (it can get the class object, but it can't know the
constructor signature).

For strings, functions like this (which return some new string of the same
type as the argument, constrained to either str or bytes) are certainly
common. And for your Employee example it would also seem problematic for
the function to know how to construct an instance of the proper
(dynamically known) subclass.
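
To make the AnyStr case concrete, here is a small runnable sketch. It
assumes the constrained variable is spelled TypeVar('AnyStr', str, bytes),
as in the typing module (this thread calls it Var(), mypy calls it
typevar()); MyStr is the hypothetical subclass mentioned above.

```python
from typing import TypeVar

# Constrained type variable: it can only be str or bytes.
AnyStr = TypeVar('AnyStr', str, bytes)


class MyStr(str):
    """Hypothetical str subclass used only for this illustration."""


def space_for(s: AnyStr) -> AnyStr:
    if isinstance(s, str):
        return ' '
    else:
        return b' '


# The argument is a MyStr, but what comes back is a plain str.
result = space_for(MyStr('hello'))
```

Here type(result) is str, not MyStr, which is why the checker should
replace MyStr by str when it solves for AnyStr.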

> b) Multiple constraints introduce an implicit Union. I'd argue that type
> variables are more commonly constrained by an Intersection rather than a
> Union. So it will be more useful if given this definition Y has to be
> compatible with all of t1, t2 etc, rather than just one of them.
> Alternatively, this can be always spelled out explicitly:
>     Y1 = Var('Y1', Union[t1, t2, ...])
>     Y2 = Var('Y2', Intersection[t1, t2, ...])
>

Well, maybe. At this point you'd have to point us to a large body of
evidence -- mypy has done well so far with its current definition of
typevar().

OTOH one of the ideas on the table is to add keyword options to Var(),
which might make it possible to have type variables with different
semantics. There are other use cases, some of which are discussed in the
tracker: https://github.com/ambv/typehinting/issues/18
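
As an illustration of the keyword-option idea, the type-preserving Employee
case could be written roughly like this. The sketch assumes an upper-bound
option spelled bound=, which is what TypeVar in the typing module accepts;
the Employee and Manager classes and the body of relocate() are placeholders
invented for the example.

```python
from typing import TypeVar


class Employee:
    def __init__(self, office: str) -> None:
        self.office = office


class Manager(Employee):
    pass


# Assumption: "any subclass of Employee" is expressed with bound=.
XEmployee = TypeVar('XEmployee', bound=Employee)


def relocate(e: XEmployee) -> XEmployee:
    # Returns the very object it was given, so it never has to know how
    # to construct an instance of the (dynamically known) subclass.
    e.office = 'new office'
    return e


m = relocate(Manager('HQ'))
```

With these semantics a checker can treat m as a Manager, whereas the
constrained form would widen it to Employee.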


> Pragmatics:
>
> 3. The names Union and Intersection are standard terminology in type
> checking, but may not be familiar to many Python users. Names like AnyOf[]
> and AllOf[] can be more intuitive.
>

I strongly disagree with this. Python's predecessor, ABC, used a number of
non-standard terms for common programming language concepts, for similar
reasons. But the net effect was just that it looked weird to anyone
familiar with other languages, and for the users who were a completely
blank slate, well, "HOW-TO" was just as much jargon that they had to learn
as "procedure". Also, the Python users who will most likely need to learn
about this stuff are most likely library developers.


> 4. Similar to allowing None to mean type(None) it's nice to have shortcuts
> like:
>     (t1, t2, ...) == Tuple[t1, t2, ...]
>     [t1] == List[t1]
>     {t1: t2} == Dict[t1, t2]
>     {t1} == Set[t1]
>
> The last 3 can be Sequence[t1], Mapping[t1, t2] and collections.Set[t1] if
> we want to encourage the use of abstract types.
>

This was proposed as the primary notation during the previous round of
discussions here. You are right that if we propose to "fix up" type
annotations that appear together with a default value we should also be
able in principle to change these shortcuts into the proper generic type
objects. Yet I am hesitant to adopt the suggestion -- people may already be
using e.g. dictionaries as annotations for some other purpose, and there is
the question you bring up whether we should promote these to concrete or
abstract collection types.

Also, I should note that, while I mentioned it as a possibility, I am
hesitant to endorse the shortcut of "arg: t1 = None" as a shorthand for
"arg: Union[t1, None] = None" because it's unclear whether runtime
introspection of the __annotations__ object should return t1 or the
inferred Union object. (The unspoken requirement here is that there will be
no changes to CPython's handling of annotations -- the typing.py module
will be all that is needed, and it can be backported to older Python
versions.)
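
A small sketch of the introspection question. With no changes to CPython,
__annotations__ simply holds whatever expression was written, so the implied
Union is not visible at runtime unless it is spelled out. The function names
are made up for the example.

```python
from typing import Union


def implicit(arg: int = None): ...


def explicit(arg: Union[int, None] = None): ...


# What runtime introspection sees for each spelling.
implicit_ann = implicit.__annotations__['arg']   # the bare type as written
explicit_ann = explicit.__annotations__['arg']   # the Union object
```

Any tool that wants the shorthand to mean Union[t1, None] would have to
apply that rule itself after reading __annotations__.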


> 5. Using strings for forward references can be messy in case of generics:
> parsing of brackets etc in the string will be needed. I propose explicit
> forward declarations:
>
> C = Declare('C')
> class C(Generic[X]):
>     def foo(self, other: C[X]): ...
>     def bar(self, other: C[Y]): ...
>

Agreed this is an area that needs more thought. In mypy you can actually
write the entire annotation in string quotes -- mypy has to be able to
parse type expressions anyway (in fact it has to be able to parse all of
Python :-). I do think that the example you present feels rather obscure.
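
Here is a sketch of the string-quoted form, under the assumption that the
typing module's Generic, TypeVar and get_type_hints() are available. The
namespace is passed explicitly so the example does not depend on where the
class happens to be defined.

```python
from typing import Generic, TypeVar, get_type_hints

X = TypeVar('X')


class C(Generic[X]):
    # C is not yet bound while the class body executes, so the whole
    # annotation is written as a string.
    def foo(self, other: 'C[X]') -> None: ...


# At runtime the annotation is stored as the plain string...
raw = C.foo.__annotations__['other']

# ...and it can be evaluated later, once C exists.
resolved = get_type_hints(C.foo, localns={'C': C, 'X': X})['other']
```

The brackets inside the string are just Python expression syntax, so
ordinary evaluation handles them without a separate parser.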


> 6. On the other hand, using strings for unconstrained type variables is
> quite handy, and doesn't share the same problem:
>
> def head(xs: List['T']) -> 'T': ...
>

Yeah, it does look quite handy, if the ambiguity with forward references
can be resolved. Also it's no big deal to have to declare a type variable
-- you can reuse them for all subsequent function definitions, and you
usually don't need more than two or three.
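
For comparison, a sketch of the declared-variable spelling, assuming TypeVar
and List from the typing module. head() is the function from the quoted
example; last() is a hypothetical second function added only to show the
same variable being reused.

```python
from typing import List, TypeVar

# Declared once, near the top of the module.
T = TypeVar('T')


def head(xs: List[T]) -> T:
    return xs[0]


def last(xs: List[T]) -> T:
    # Reuses the same T; it is solved independently for each call.
    return xs[-1]
```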

-- 
--Guido van Rossum (python.org/~guido)