
A few months ago we had a long discussion about type hinting. I've thought a lot more about this. I've written up what I think is a decent "theory" document -- writing it down like this certainly helped *me* get a lot of clarity about some of the important issues. https://quip.com/r69HA9GhGa7J I should thank Jeremy Siek for his blog post about Gradual Typing, Jukka Lehtosalo for mypy (whose notation I am mostly borrowing), and Jim Baker for pushing for an in-person meeting where we all got a better understanding of several issues. There's also a PEP draft, written by Łukasz Langa and revised by him based on notes from the above-mentioned in-person meeting; unfortunately it is still a bit out of date and I didn't have time to update it yet. Instead of working on the PEP, I tried to implement a conforming version of typing.py, for which I also ran out of time -- then I decided to just write up an explanation of the theory. I am still hoping to get a PEP out for discussion in early January, and I am aiming for provisional acceptance by PyCon Montréal, which should allow a first version of typing.py to be included with Python 3.5 alpha 4. If you are wondering how I can possibly meet that schedule: (a) the entire runtime component proposal can be implemented as a single pure-Python module: hence the use of square brackets for generic types; (b) a static type checker is not part of the proposal: you can use mypy, or write your own. -- --Guido van Rossum (python.org/~guido)

On 20 December 2014 at 10:55, Guido van Rossum <guido@python.org> wrote:
This looks like a great direction to me. While I know it's not the primary purpose, a multidispatch library built on top of it could potentially do wonders for cleaning up some of the ugliness in the URL parsing libraries (which have quite a few of those "all str, or all bytes, but not a mixture" style interfaces). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Fri, Dec 19, 2014 at 04:55:37PM -0800, Guido van Rossum wrote:
If anyone else is having trouble reading the page in their browser, this seems to work perfectly for me under Linux: wget --no-check-certificate https://quip.com/r69HA9GhGa7J mv r69HA9GhGa7J r69HA9GhGa7J.html links r69HA9GhGa7J.html Using links https://quip.com/r69HA9GhGa7J also works, if you don't mind reading grey text on a light-grey background. -- Steven

On Fri, Dec 19, 2014 at 04:55:37PM -0800, Guido van Rossum wrote:
Very interesting indeed. Some questions, which you may not have answers to yet :-) (1) Under "General rules", you state "No type can be subclassed unless stated." That's not completely clear to me. I presume you are talking about the special types like Union, Generic, Sequence, Tuple, Any etc. Is that correct? (2) Under "Types", you give an example Tuple[t1, t2, ...], a tuple whose items are instances of t1 etc. To be more concrete, a declaration of Tuple[int, float, str] will mean "a tuple with exactly three items, the first item must be an int, the second item must be a float, the third item must be a string." Correct? (3) But there's no way of declaring "a tuple of any length, which each item is of type t". We can declare it as Sequence[t], but cannot specify that it must be a tuple, not a list. Example: class MyStr(str): def startswith(self, prefix:Union[str, ???])->bool: pass There's nothing I can use instead of ??? to capture the current behaviour of str.startswith. Correct? (4) Under "Pragmatics", you say "Don't use dynamic type expressions; use builtins and imported types only. No 'if'." What's the reason for this rule? Will it be enforced by the compiler? (5) I presume (4) above also applies to things like this: if condition(): X = Var('X', str) else: X = Var('X', bytes) # Later on def spam(arg: X)-> X: ... How about this? try: from condition_is_true import X # str except ImportError: from condition_is_false import X # bytes (6) Under "Generic types", you have: X = Var('X'). Declares a unique type variable. The name must match the variable name. To be clear, X is a type used only for declarations, right? By (1) above, it cannot be instantiated? But doesn't that make it impossible to satisfy the declaration? I must be misunderstanding something. I imagine the declaration X = Var("X") to be something equivalent to: class X(type): pass except that X cannot be instantiated. 
(7) You have an example: AnyStr = Var('AnyStr', str, bytes) def longest(a: AnyStr, b: AnyStr) -> AnyStr: Would this be an acceptable syntax? def longest(a:str|bytes, b:str|bytes) -> str|bytes It seems a shame to have to invent a name for something you might only use once or twice. -- Steve

On Fri, Dec 19, 2014 at 11:59 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Yes. (Having to answer this is the price I pay for attempting brevity.)
Yes.
Yes, though there's a proposal to let you write Union[str, Tuple[str, ...]] -- the ... are literally that (i.e. Ellipsis).
No, but it will get you on the static checker's nasty list.
Yes.
Probably also. In mypy there is limited support for a few specific tests, IIRC it has PY3 and PY2 conditions built in. In any case this is all just to make the static checker's life easier (since it won't know whether the condition is true or false at runtime).
It's in support of generic types. Read up on them in the mypy docs about generics.
No. See above.
That was proposed and rejected (though for Union, which is slightly different) because it would require changes to the builtin types to support that operator at runtime. Please do read up on generic types in mypy. http://mypy.readthedocs.org/en/latest/generics.html -- --Guido van Rossum (python.org/~guido)
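For the str.startswith case in question (3), the Ellipsis spelling mentioned above can be sketched as follows. This is only an illustration: it assumes the typing module as it later shipped (where Tuple[str, ...] is accepted), and MyStr is the hypothetical subclass from the question.

```python
from typing import Tuple, Union


class MyStr(str):
    # prefix is either one string or a tuple of strings of any length,
    # which is what str.startswith accepts at runtime.
    def startswith(self, prefix: Union[str, Tuple[str, ...]]) -> bool:
        return super().startswith(prefix)


# By contrast, a fixed-length tuple type names every position:
# exactly three items, an int, then a float, then a str.
Triple = Tuple[int, float, str]

print(MyStr("spam").startswith("sp"))
print(MyStr("spam").startswith(("eg", "sp")))
```

The annotation changes nothing at runtime; it only tells a static checker which argument shapes to accept.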

On Sat, Dec 20, 2014 at 11:08 AM, Guido van Rossum <guido@python.org> wrote:
For a very good reason - static type checking means we are evaluating types *statically*, before a program is run. (Otherwise we are doing dynamic type checking. Python already does a good job of that!) The Python compiler is completely oblivious to any of this of course - nothing is changing with it in this proposal for 3.5. Only mypy and other possible static type checkers and related tools like IDEs will complain. Or you might be able to get this information at runtime, which can help support gradual typing.
Note that in theory it's possible to define compile-time expressions, as has been done in C++ 11. ( http://en.wikipedia.org/wiki/Compile_time_function_execution is a good starting point.) Except for possibly the simplest cases, such as baked-in support for Py3/Py2, let's not do that.
The other thing is that AnyStr is a type variable, not a type definition to shorten writing out types. So AnyStr acts as a constraint that must be satisfied for the program to type check as valid. (A very simple constraint.) You can think of it as supporting two possible signatures for the longest function from the caller's perspective: def longest(a: str, b: str) -> str def longest(a: bytes, b: bytes) -> bytes But not some mix; this would be invalid if the caller of longest tries to use it in this way: def longest(a: str, b: str) -> bytes
Please do read up on generic types in mypy. http://mypy.readthedocs.org/en/latest/generics.html
The linked example of a Stack class demonstrates how the type variable can work as a constraint for a class, across its methods, similar to what one might see in Java or Scala. -- - Jim jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com}
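The constraint described above can be sketched with the typing module as it eventually shipped. Note the assumption: that module spells the constructor TypeVar, whereas this thread calls it Var. The body of longest is illustrative.

```python
from typing import TypeVar

# A constrained type variable: each call binds it to str or to bytes.
AnyStr = TypeVar("AnyStr", str, bytes)


def longest(a: AnyStr, b: AnyStr) -> AnyStr:
    # Both arguments and the result share one type: all str or all bytes.
    return a if len(a) >= len(b) else b


print(longest("spam", "eggs!"))
print(longest(b"spam", b"eg"))
# longest("spam", b"eggs") still runs, but a static checker would reject it.
```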

Maybe I'm missing something, but wouldn't type hinting as it's defined now break "virtual subclassing" of ABCs? For example, given the following code: from collections.abc import Sequence class MySequence: ... Sequence.register(MySequence) it seems to me like the following would work: def foo(bar): if not isinstance(bar, Sequence): raise RuntimeError("Foo can only work with sequences") ... but when rewritten for static type checking def foo(bar: Sequence): .... it would cease to work. At least I don't see a way a static type checker could handle this reliably (the register call might occur anywhere, after all). Is this intentional? Even if this might be more a mypy/implementation question, it should be clear to users of typing.py whether they should expect ABCs to break or not.

On Sat, Dec 20, 2014 at 2:00 PM, Dennis Brakhane <brakhane@googlemail.com> wrote:
Well, the program will still run fine with CPython, assuming you keep the isinstance() check in your program. The argument declaration does not cause calls with non-Sequence arguments to be rejected, it just guides the static checker (which is a program that you must run separately, in a similar way as a linter -- it is not built into CPython). The static checker may reject the call foo(MySequence()) if it cannot see that MySequence is a Sequence. You have two options there: if you can change the code of MySequence, you should just inherit from Sequence rather than using the register() call; if you cannot change the code of MySequence, you should use a *stub* module which declares that MySequence is a Sequence. You can read up on stub modules in the mypy docs: http://mypy.readthedocs.org/en/latest/basics.html#library-stubs -- --Guido van Rossum (python.org/~guido)
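The two ways of relating a class to Sequence discussed above behave identically at runtime, which a small sketch can show. The class bodies here are illustrative; only the register() call and the isinstance() check come from the thread.

```python
from collections.abc import Sequence


class MySequence:
    """Virtual subclass: only the register() call links it to Sequence."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


Sequence.register(MySequence)


class MyRealSequence(Sequence):
    """Real subclass: the relationship is visible in the class statement."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, index):
        return self._items[index]

    def __len__(self):
        return len(self._items)


def foo(bar: Sequence) -> int:
    # The annotation is not enforced; the isinstance() check is.
    if not isinstance(bar, Sequence):
        raise RuntimeError("Foo can only work with sequences")
    return len(bar)


print(foo(MySequence("ab")), foo(MyRealSequence([1, 2, 3])))
```

Both calls pass the runtime check. The difference is only in what a static checker can see without executing the register() call.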

A silly question. I use Python 3 annotations for argument checking. My assumption is very simple: for def f(arg: checker): pass the checker will raise ValueError or TypeError if arg is not correct. I do it by a `checker(arg)` call. I use this in aiozmq rpc (http://aiozmq.readthedocs.org/en/0.5/rpc.html#signature-validation) for example, and checkers from trafaret (https://github.com/Deepwalker/trafaret) work fine. Will the proposed change break my workflow? On Sun, Dec 21, 2014 at 6:53 PM, Guido van Rossum <guido@python.org> wrote:
-- Thanks, Andrew Svetlov
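The checker-as-annotation workflow described in this message can be sketched roughly as below. This is not how aiozmq or trafaret implement it; the validated decorator and the positive_int checker are hypothetical names made up for illustration.

```python
import functools
import inspect


def positive_int(value):
    # A checker in the sense used above: raise on bad input.
    if not isinstance(value, int):
        raise TypeError("expected int, got %s" % type(value).__name__)
    if value <= 0:
        raise ValueError("expected a positive value")
    return value


def validated(func):
    """Call each parameter's annotation on the matching argument."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            checker = sig.parameters[name].annotation
            if checker is not inspect.Parameter.empty:
                checker(value)  # raises ValueError or TypeError if invalid
        return func(*args, **kwargs)

    return wrapper


@validated
def repeat(text, times: positive_int):
    return text * times


print(repeat("ab", 2))
```

Here the annotation is an ordinary callable, not a type, which is exactly the kind of use a static type checker would not understand.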

Your workflow won't necessarily break. You would need to be using a tool that expects function annotations to be type hints for this to cause you any problems. If you simply don't use such a tool then you will have no problems. On Sun, Dec 21, 2014, 11:55 Andrew Svetlov <andrew.svetlov@gmail.com> wrote:

Sorry, I want to ask again. The proposal is for static checks only? My expectation that annotations are processed at runtime as-is (just a mark without any restrictions) will not change? On Sun, Dec 21, 2014 at 10:11 PM, Brett Cannon <brett@python.org> wrote:
-- Thanks, Andrew Svetlov

The proposal is to standardize how to specify type hints. The *assumption* is that static type checkers will use the type hints. Nothing is being forced to occur at runtime; Guido's proposal is outlining how static tools that consume the type hints and people that use them should expect things to work. On Sun, Dec 21, 2014, 12:32 Andrew Svetlov <andrew.svetlov@gmail.com> wrote:

On 22 December 2014 at 06:32, Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
Correct, there are no changes being proposed to the runtime semantics of annotations. The type hinting proposal describes a conventional use for them that will be of benefit to static type checking systems and integrated development environments, but it will be exactly that: a convention, not an enforced behaviour. The convention of treating "_" prefixed methods and other attributes as private to the implementation of a class or module is a good example of a similar approach. While some things (like pydoc and wildcard imports) will respect the convention, it's not enforced at the core language level - if a developer decides they're prepared to accept the compatibility risk, then they're free to use the "private" attribute if they choose to do so. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
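The point that annotations remain a convention with no runtime enforcement can be seen in a few lines. The greet function is a made-up example.

```python
def greet(name: str, times: int = 1) -> str:
    return ", ".join(["hello %s" % name] * times)


# The hints are stored on the function object and nothing more.
print(greet.__annotations__)

# The interpreter does not check them: an int is accepted for name.
print(greet(42))
```

Only a separately run tool that chooses to read `__annotations__` (or the source) as type hints would complain about the second call.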

On Sun, Dec 21, 2014 at 3:47 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Well... The elephant in the room is that *eventually* other uses of annotations *may* be frowned upon, or may need to be marked by some decorator. But I promise that in Python 3.5 your code will not break -- it just might not be very useful to run a static checker like mypy on it. (IIRC mypy used to only typecheck code that imports the typing.py module, but this seems to have changed.) When we discussed this earlier this year, a few other uses of annotations were brought up, and some have proposed that static type annotations would need to be marked by a decorator. There is even a proposed syntax that allows multiple annotations to coexist for the same argument (a dict with fixed keys -- to me it looks pretty ugly though). I would really like to wait and see how this plays out -- the proposal I'm working on is careful not to have any effect at runtime (as long as the typing.py module can be imported and as long as the annotation expressions don't raise exceptions), and use of a static checker is entirely optional and voluntary. Perhaps the PEP should define some way to tell the type checker not to follow certain imports? That would be useful in case you have a program that tries to follow the annotation conventions for static checking but imports some library that uses annotations for a different purpose. You can probably do this already for mypy by writing a stub module. -- --Guido van Rossum (python.org/~guido)

On 22 December 2014 at 14:05, Guido van Rossum <guido@python.org> wrote:
Agreed - I see this as a good, incremental evolution from the increased formality and flexibility in the type system that was introduced with ABCs, and I think that's been successful in letting folks largely not need to worry about them unless they actually need to deal with the problems they solve (like figuring out whether a container is a Sequence or Mapping). Ideally we'll get the same result here - folks that have problems where type annotations can help will benefit, while those that don't need them won't need to worry about them.
For inline use, it may be worth defining a module level equivalent to pylint's "#pylint: skip-file" comments (rather than having each static checker come up with its own way of spelling that). Aside from that, it may also be worth recommending that static type checkers provide clear ways to control the scope of scanning (e.g. by allowing particular directories and files to be excluded, or limiting scans to particular directories). In both cases, the PEP would likely need to define the implied annotations to be assumed for excluded modules. I realised in trying to write this email that I don't currently understand the consequences of not having annotation data available for a module in terms of the ripple effects that may have on what scanners can check - from the end user perspective, I believe that's something I'd need to know, even though I wouldn't necessarily need to know *why* those were the default assumptions for unannotated operations. (I'm curious about the latter from a language *design* perspective, but I think I'd be able to use the feature effectively just by knowing the practical consequences without necessarily understanding the theory.) Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Dec 21, 2014 at 9:35 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's what "Any" is for. If an object's type is "Any", then every operation on it (including getattr) also returns "Any", from the checker's POV. You stop the ripples with an explicitly declared type -- e.g. if you pass it to a function as an arg with a type annotation, inside that function the argument is assumed to have the declared type (but the "Any" prevents diagnostics about the call site). In mypy there is also an explicit cast operator (http://mypy.readthedocs.org/en/latest/casts.html); we may add this to the PEP (it's one of the many details left out from the Quip doc that need to be filled in for the actual PEP). The importance of "Any" cannot be overstated -- without it, you either have to declare types everywhere, or you end up with an "everything inherits from everything" situation. The invention of "Any" prevents this (and this is why is-consistent-with cannot be transitive). Read Jeremy Siek's Gradual Typing blog post for more. Also, consider the important difference between Any and object. They are both at the top of the class tree -- but object has *no* operations (well, almost none -- it has repr() and a few others), while Any supports *all* operations (in the sense of "is allowed by the type system/checker"). This places Any also at the *bottom* of the class tree, if you can call it that. (And hence it is more a graph than a tree -- but if you remove Any, what's left is a tree again.) Any's "contagiousness" is somewhat similar to NaN for floats, but it has its limits -- e.g. repr() of Any is still a string, and Any==Any is still a bool. Another similar concept exists IIUC in Objective-C, which has a kind of null object that can be called (sent a message), always returning another null result. (This has occasionally been proposed for Python's None, but I don't think it is appropriate.) But of course, Any only exists in the mind of the static checker -- at runtime, the object has a concrete type. 
"Any" is just the type checker's way to say "I don't know the type, and I don't care". I bet compilers with good error recovery have some similar concept internally, used to avoid a cascade of errors due to a single typo. -- --Guido van Rossum (python.org/~guido)

On Mon, Dec 22, 2014 at 8:02 AM, Guido van Rossum <guido@python.org> wrote: [...]
limits -- e.g. repr() of Any is still a string, and Any==Any is still a bool.
Why is Any==Any a bool? Comparison operators can return anything, and libraries like Numpy or SQLAlchemy take advantage of this (comparing Numpy arrays results in an array of bools, and comparing a SQLAlchemy Column to something results in a comparison expression, e.g. `query.filter(table.id == 2)`). Would type checkers be expected to reject these uses?
Why is this limited to None? Can this be extended to "if there's a default argument, then that exact object is also allowed"? That would allow things like: _MISSING = object() def getattr(key: str, default: VT=_MISSING) -> VT: ...

On Mon, Dec 22, 2014 at 3:54 AM, Petr Viktorin <encukou@gmail.com> wrote:
Never mind, I forgot about that. :-)
Well, that would require the type checker to be smarter. I don't want to go down the slippery path of defining compile-time expressions. Allowing None is easy (it's a keyword), useful and covers the common cases. -- --Guido van Rossum (python.org/~guido)
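Since only None gets special treatment as a default, one possible workaround for the sentinel pattern in the question is to annotate the default as Any. This is an assumption about how one might cope, not something the thread prescribes; lookup is a hypothetical function, and it uses the typing module as it later shipped.

```python
from typing import Any, Dict

_MISSING = object()  # unique sentinel; no caller can pass it by accident


def lookup(table: Dict[str, int], key: str, default: Any = _MISSING) -> int:
    """Return table[key], or default if one was given, else raise KeyError."""
    if key in table:
        return table[key]
    if default is _MISSING:
        raise KeyError(key)
    return default


print(lookup({"a": 1}, "a"))
print(lookup({}, "a", 0))
```

The cost is that the checker learns nothing about default, which is the trade-off Any always carries.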

On 22 Dec 2014 17:02, "Guido van Rossum" <guido@python.org> wrote:
Also, consider the important difference between Any and object. They are both at the top of the class tree -- but object has *no* operations (well, almost none -- it has repr() and a few others), while Any supports *all* operations (in the sense of "is allowed by the type system/checker").
Ah, this was the key piece I had missed, both from your article and Jeremy's: since Any conceptually allows all operations, with a result of Any, there's no need to separately represent "callable that permits arbitrary arguments, returning a result of unknown type". Very nice! Regards, Nick.
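The difference between Any and object from the checker's point of view can be sketched like this, using Any and cast from the typing module as it later shipped. The function names are made up; at runtime all three functions are ordinary Python.

```python
from typing import Any, cast


def shout(value: Any) -> str:
    # Any permits every operation, so a checker accepts .upper() unchecked.
    return value.upper()


def describe(value: object) -> str:
    # object permits almost nothing, so narrow with isinstance() first.
    if isinstance(value, str):
        return value.upper()
    return repr(value)


def first_word(data: Any) -> str:
    # cast() only informs the checker; it returns its argument unchanged.
    text = cast(str, data)
    return text.split()[0]


print(shout("spam"), describe(3), first_word("hello world"))
```

Declaring text as str is what "stops the ripples": from that line on the checker treats the value as a str rather than as Any.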

On Dec 21, 2014, at 23:02, Guido van Rossum <guido@python.org> wrote:
Any's "contagiousness" is somewhat similar to NaN for floats, but it has its limits -- e.g. repr() of Any is still a string, and Any==Any is still a bool. Another similar concept exists IIUC in Objective-C, which has a kind of null object that can be called (sent a message), always returning another null result. (This has occasionally been proposed for Python's None, but I don't think it is appropriate.)
I think you're mixing up two things here. ObjC does have a concept pretty close to Any, but it's not nil, it's id. This is a type that's never seen at runtime, but at compile time, in practice,* it's both a top and bottom type** (any method can be called on an object declared as id; any argument can be passed to a covariant or contravariant parameter of type id; a value declared id can be stored or passed anywhere), and subtyping is transitive except as far as id is concerned. But nil is a whole different thing. Dynamically, it's similar to Python's None, except that it responds to every method with nil. Statically, every type is effectively an Optional[T], and can hold either a T (or subclass) instance or nil.*** The static type of nil might be id, but it doesn't really matter, because that never comes up.**** The dynamic type of nil is nil.***** * In reality, this relies on the fact that ObjC objects are always referenced by C pointers, and id is effectively just a typedef for void*, and C allows any pointer to be implicitly cast to and from void*. ** Except when native C types get involved. Traditional ObjC waved that away by assuming that every C type except double and long long was the same size as a pointer and you don't use those two very often; the 64-bit transition screwed that up, and the compiler no longer lets you mix up ObjC and native C types. *** Again, this is because you always reference ObjC objects as C pointers, and C pointers can always accept null values. **** Unless you explicitly use the compile-time typeof or sizeof operators on nil, which there's no good reason to do. If you do, then you'll see it's defined as (id)0. ***** Because the equivalent of the type function is a method.

On Wed, Dec 24, 2014 at 3:22 PM, Eugene Toder <eltoder@gmail.com> wrote:
Thanks, that was a simple "thinko". I meant that without Any (and perhaps barring some other pathological cases due to Python's extreme malleability) it's a DAG, but with Any it has a cycle. I've also updated the Quip doc to speak of "class graph". -- --Guido van Rossum (python.org/~guido)

On 12/21/2014 08:05 PM, Guido van Rossum wrote:
What I ended up doing for my scription program was to move the annotations outside the def, and store them in a different attribute, applied with a decorator: @Command( file=('source file', ), dir=('target directory', ), options=('extra options', MULTI, ), ) def copy(file, dir, options): pass copy.__scription__ --> {'file':..., 'dir':..., 'options':...} -- ~Ethan~
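A decorator of the kind described here, which keeps per-argument metadata out of the annotations, might look roughly like this. It is a guess at the shape, not scription's actual implementation; the value of MULTI in particular is a placeholder.

```python
MULTI = "multi"  # hypothetical marker; scription's real constant may differ


def Command(**specs):
    """Attach per-argument metadata to a function without using annotations."""
    def decorator(func):
        # Store the specs under a separate attribute, leaving
        # func.__annotations__ free for other uses such as type hints.
        func.__scription__ = dict(specs)
        return func
    return decorator


@Command(
    file=("source file",),
    dir=("target directory",),
    options=("extra options", MULTI),
)
def copy(file, dir, options):
    pass


print(copy.__scription__["options"])
print(copy.__annotations__)
```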

Guido van Rossum <guido@...> writes:
A few months ago we had a long discussion about type hinting. I've thought a [...]
(I apologize in advance if some of this was covered in previous discussions). 1. Since there's the Union type, it's also natural to have the Intersection type. A class is a subclass of Intersection[t1, t2, ...] if it's a subclass of all t1, t2 etc. There are 2 main uses of the Intersection type: a) Require that an argument implements multiple interfaces: class Foo: @abstractmethod def foo(self): ... class Bar: @abstractmethod def bar(self): ... def fooItWithABar(obj: Intersection[Foo, Bar]): ... b) Write the type of an overloaded function: @overload def foo(x: str) -> str: ... @overload def foo(x: bytes) -> bytes: ... foo # type: Intersection[Callable[[str], str], Callable[[bytes], bytes]] 2. Constrained type variables (Var('Y', t1, t2, ...)) have a very unusual behavior. a) "subclasses of t1 etc. are replaced by the most-derived base class among t1 etc." This defeats the very common use of constrained type variables: have a type preserving function limited to classes inherited from a common base. E.g. say we have a function: def relocate(e: Employee) -> Employee: ... The function actually always returns an object of the same type as the argument, so we want to write a more precise type. We usually do it like this: XEmployee = Var('XEmployee', Employee) def relocate(e: XEmployee) -> XEmployee: ... This won't work with the definition from the proposal. b) Multiple constraints introduce an implicit Union. I'd argue that type variables are more commonly constrained by an Intersection rather than a Union. So it will be more useful if given this definition Y has to be compatible with all of t1, t2 etc, rather than just one of them. Alternatively, this can be always spelled out explicitly: Y1 = Var('Y1', Union[t1, t2, ...]) Y2 = Var('Y2', Intersection[t1, t2, ...]) Pragmatics: 3. The names Union and Intersection are standard terminology in type checking, but may not be familiar to many Python users. Names like AnyOf[] and AllOf[] can be more intuitive. 4. 
Similar to allowing None to mean type(None) it's nice to have shortcuts like: (t1, t2, ...) == Tuple[t1, t2, ...] [t1] == List[t1] {t1: t2} == Dict[t1, t2] {t1} == Set[t1] The last 3 can be Sequence[t1], Mapping[t1, t2] and collections.Set[t1] if we want to encourage the use of abstract types. 5. Using strings for forward references can be messy in case of generics: parsing of brackets etc in the string will be needed. I propose explicit forward declarations: C = Declare('C') class C(Generic[X]): def foo(self, other: C[X]): ... def bar(self, other: C[Y]): ... 6. On the other hand, using strings for unconstrained type variables is quite handy, and doesn't share the same problem: def head(xs: List['T']) -> 'T': ... Regards, Eugene
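The overloaded function in point 1b can be written out as below, using the overload decorator from the typing module as it later shipped. The name double and its body are illustrative, and the sketch does not rely on any Intersection type.

```python
from typing import Union, overload


@overload
def double(x: str) -> str: ...
@overload
def double(x: bytes) -> bytes: ...


def double(x: Union[str, bytes]) -> Union[str, bytes]:
    # The single runtime implementation behind both declared signatures.
    return x + x


print(double("ab"))
print(double(b"ab"))
```

A checker reads the two decorated stubs as the function's type, which is the role the proposed Intersection of Callables would play.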

On Wed, Dec 24, 2014 at 4:50 PM, Eugene Toder <eltoder@gmail.com> wrote:
No problem. :-) I apologize for reformatting the text I am quoting from you, it looked as if it was sent through two different line clipping functions.
Yes, we even have an issue to track this proposal. I don't recall who suggested it first. I don't know if it poses any problems to the static checker (though I doubt it). https://github.com/ambv/typehinting/issues/18
The static checker can figure that out for itself, but that doesn't mean we necessarily need a way to spell it.
I just copied this from mypy (where it is called typevar()). I guess in that example one would use an *unconstrained* type variable. The use case for the behavior I described is AnyStr -- if I have a function like this I don't want the type checker to assume the more precise type: def space_for(s: AnyStr) -> AnyStr: if isinstance(s, str): return ' ' else: return b' ' If someone defined a class MyStr(str), we don't want the type checker to think that space_for(MyStr(...)) returns a MyStr instance, and it would be impossible for the function to even create an instance of the proper subclass of str (it can get the class object, but it can't know the constructor signature). For strings, functions like this (which return some new string of the same type as the argument, constrained to either str or bytes) are certainly common. And for your Employee example it would also seem problematic for the function to know how to construct an instance of the proper (dynamically known) subclass. b) Multiple constraints introduce an implicit Union. I'd argue that type
Well, maybe. At this point you'd have to point us to a large body of evidence -- mypy has done well so far with its current definition of typevar(). OTOH one of the ideas on the table is to add keyword options to Var(), which might make it possible to have type variables with different semantics. There are other use cases, some of which are discussed in the tracker: https://github.com/ambv/typehinting/issues/18
I strongly disagree with this. Python's predecessor, ABC, used a number of non-standard terms for common programming language concepts, for similar reasons. But the net effect was just that it looked weird to anyone familiar with other languages, and for the users who were a completely blank slate, well, "HOW-TO" was just as much jargon that they had to learn as "procedure". Also, the Python users who will most likely need to learn about this stuff are most likely library developers.
This was proposed as the primary notation during the previous round of discussions here. You are right that if we propose to "fix up" type annotations that appear together with a default value we should also be able in principle to change these shortcuts into the proper generic type objects. Yet I am hesitant to adopt the suggestion -- people may already be using e.g. dictionaries as annotations for some other purpose, and there is the question you bring up whether we should promote these to concrete or abstract collection types. Also, I should note that, while I mentioned it as a possibility, I am hesitant to endorse the shortcut of "arg: t1 = None" as a shorthand for "arg: Union[t1, None] = None" because it's unclear whether runtime introspection of the __annotations__ object should return t1 or the inferred Union object. (The unspoken requirement here is that there will be no changes to CPython's handling of annotations -- the typing.py module will be all that is needed, and it can be backported to older Python versions.)
Agreed this is an area that needs more thought. In mypy you can actually write the entire annotation in string quotes -- mypy has to be able to parse type expressions anyway (in fact it has to be able to parse all of Python :-). I do think that the example you present feels rather obscure.
Yeah, it does look quite handy, if the ambiguity with forward references can be resolved. Also it's no big deal to have to declare a type variable -- you can reuse them for all subsequent function definitions, and you usually don't need more than two or three. -- --Guido van Rossum (python.org/~guido)
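The space_for argument earlier in this message can be checked at runtime with a short sketch. It assumes typing.AnyStr as it later shipped, and MyStr is the hypothetical subclass from the discussion.

```python
from typing import AnyStr


class MyStr(str):
    pass


def space_for(s: AnyStr) -> AnyStr:
    # Returns a new string of the same family (str or bytes) as s,
    # but never of a subclass: the function cannot know how to build one.
    if isinstance(s, str):
        return " "
    return b" "


result = space_for(MyStr("spam"))
print(type(result).__name__)
```

The result is a plain str, which is why the constrained type variable resolves a MyStr argument to str rather than to the more precise subclass.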

On Wed, Dec 24, 2014 at 08:16:52PM -0800, Guido van Rossum wrote:
I presume that runtime name binding will be allowed, e.g. from typing import Union as AnyOf def spam(x: AnyOf[int, float])->str: ... but not encouraged. (It's not that hard to learn a few standard names like Union.) So the above would work at runtime, but at compile time, it will depend on the specific linter or type checker: it will be a "quality of implementation" issue, with simple tools possibly not being able to recognise AnyOf as being the same as Union. Is this what you have in mind? -- Steven

On Thu, Dec 25, 2014 at 6:43 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I would not recommend that to anyone -- I find that use of "import ... as" is often an anti-pattern or a code smell, and in this case it would seem outright silly to fight the standard library's terminology (assuming typing.py defines Union). I don't know if mypy supports this (it's easy to try it for yourself though) but I do know it follows simple global assignments, known as type aliases, e.g. "foo = Iterator[int]". -- --Guido van Rossum (python.org/~guido)
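Both spellings discussed here work at runtime, as a brief sketch shows. It uses the typing module as it later shipped; whether a given checker follows the renamed import is, as noted above, up to that checker.

```python
from typing import Iterator, Union
from typing import Union as AnyOf  # legal, though discouraged above

# A type alias: a simple global assignment that a checker can follow.
IntIterator = Iterator[int]


def countdown(n: int) -> IntIterator:
    while n > 0:
        yield n
        n -= 1


def spam(x: AnyOf[int, float]) -> str:
    return str(x)


print(AnyOf is Union)
print(list(countdown(3)), spam(1.5))
```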

On Wed, Dec 24, 2014 at 11:16 PM, Guido van Rossum <guido@python.org> wrote:
I thought more about this, and I think I understand what you are after. The syntax confused me somewhat. I also recalled that type variables may need lower bounds in addition to upper bounds. Is AnyStr the main use case of this feature? If that's the case, there are other ways to achieve the same effect with more general features. First, some motivational examples. Say we have a class for an immutable list: class ImmutableList(Generic[X]): def prepend(self, item: X) -> ImmutableList[X]: ... The type of prepend is actually too restrictive: we should be able to add items that are superclass of X and get a list of that more general type: Y = Var('Y') >= X # must be a superclass of X class ImmutableList(Generic[X]): def prepend(self, item: Y) -> ImmutableList[Y]: ... Alternative syntax for Y, based on the Java keyword: Y = Var('Y').super(X) This will be handy to give better types to some methods of tuple and frozenset. Next, let's try to write a type for copy.copy. There are many details that can be changed, but here's a sketch. Naturally, it should be def copy(obj: X) -> X: ... But copy doesn't work for all types, so there must be some constraints on X: X = Var('X') X <= Copyable[X] # must be a subclass of Copyable[X] (Alternative syntax: X.extends(Copyable[X]); this also shows why constraints are not listed in the constructor.) Copyable is a protocol: @protocol class Copyable(Generic[X]): def __copy__(self: X) -> X: ... And for some built-in types: Copyable.register(int) Copyable.register(str) ... This approach can be used to type functions that special-case built-in types, and rely on some known methods for everything else. In my example with XEmployee the function could either return its argument, or make a copy -- the Employee class can require all its subclasses to implement some copying protocol (e.g. a copy() method). 
In fact, since the longest() function from your document always returns one of its arguments, its type can be written as: X = Var('X') <= Union[str, bytes] def longest(a: X, b: X) -> X: ... that is, it doesn't need to restrict the return type to str or bytes :-) Finally, the feature from your document: AnyStr = Var('AnyStr').restrictTo(str, bytes) # the only possible values However, this can be achieved by adding more useful features to protocols: # explicit_protocol is a non-structural protocol: only explicitly registered # types are considered conforming. This is very close to type classes. # Alternatively, protocols with no methods can always be explicit. @explicit_protocol class StringLike(Generic[X]): # This type can be referenced like a class-level attribute. # The name "type" is not special in any way. type = X StringLike.register(str) StringLike.register(bytes) AnyStr = Var('AnyStr') AnyStr <= StringLike[AnyStr] AnyStrRet = StringLike[AnyStr].type def space_for(x: AnyStr) -> AnyStrRet: ... There are many details that can be tweaked, but it is quite powerful, and solves the simpler problem as well. that use union type seem to use t1|t2 syntax for it. AFAIU this syntax was rejected to avoid changes in CPython. This is a shame, because it is widespread and reads really well: def foo(x: Some|Another): ... Also, Type|None is so short and clear that there's no need for the special Optional[] shorthand. Given we won't use |, I think def foo(x: AnyOf[Some, Another]): ... reads better than def foo(x: Union[Some, Another]): ... but this may be getting into the bikeshedding territory :-) these annotations to proper generic types. This should be done internally in the type checker. If we want other tools to understand this syntax, we can expose functions typing.isTypeAnnotation(obj) and typing.canonicalTypeAnnotation(obj). With this approach, I don't believe this use of lists and dicts adds any more problems for the existing uses of annotations. 
The decision of whether to use concrete or abstract types is likely not a hard one. Given my experience, I'd use concrete types, because they are so common. But this does depend on the bigger context of how annotations are expected to be used.
class Set(Generic[X]): def union(self, other: Set[X]) -> Set[X]: ...
While on the subject: what are the scoping rules for type variables? I hope they are lexically scoped: the names used in the enclosing class or function are considered bound to those values, rather than fresh variables that shadow them. I used this fact in the examples above. E.g. union() above accepts only the sets with the same elements, not with any elements, and in def foo(x: X) -> X: def bar(y: X) -> X: return y return bar(x) X in bar() must be the same type as in foo(). Eugene
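To make the two readings of longest() concrete, here is a minimal sketch. It assumes the TypeVar spelling from the typing module as it was later released, not the Var notation used in this thread, and the name StrOrBytes is made up for illustration.

```python
from typing import TypeVar, Union

# Constrained variable: a checker may solve it only to exactly str or
# exactly bytes (the "restrictTo" reading).
AnyStr = TypeVar('AnyStr', str, bytes)

# Upper-bounded variable: any subtype of Union[str, bytes] is accepted
# (the "<=" reading). StrOrBytes is a hypothetical name.
StrOrBytes = TypeVar('StrOrBytes', bound=Union[str, bytes])


def longest(a: AnyStr, b: AnyStr) -> AnyStr:
    return a if len(a) >= len(b) else b


def longest_bounded(a: StrOrBytes, b: StrOrBytes) -> StrOrBytes:
    return a if len(a) >= len(b) else b
```

At runtime both functions behave identically; the difference exists only for a static checker, which decides what the variable may be bound to at each call site.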

On Thu, Dec 25, 2014 at 1:49 PM, Eugene Toder <eltoder@gmail.com> wrote:
mypy solves that using @overload in a stub file. That's often more precise.
Hm, looks like the case for Intersection is still pretty weak. Anyway, we can always add stuff later. But whatever we add in 3.5 we cannot easily take back.
Yes, that's the issue I meant.
I don't know if this is the main use case (we should ask Jukka when he's back from vacation). I'm hesitant to propose more general features without at least one implementation. Perhaps you could try to see how easily those more general features could be implemented in mypy?
Neither syntax is acceptable to me, but let's assume we can do this with some other syntax. Your example still feels like it was carefully constructed to prove your point -- it would make sense in a language where everything is type-checked and types are the basis for everything, and users are eager to push the type system to its limits. But I'm carefully trying to avoid moving Python in that direction.
This will be handy to give better types to some methods of tuple and frozenset.
I assume you're talking about the case where e.g. I have a frozenset of Managers and I use '+' to add an Employee; we then know that the result is a frozenset of Employees. But if we assume covariance, that frozenset of Managers is also a frozenset of Employees, so (assuming we have a way to indicate covariance) the type-checker should be able to figure this out. Or are you perhaps trying to come up with a way to spell covariance? (The issue #2 above has tons of discussion about that, although I don't think it comes to a clear conclusion.)
Next, let's try to write a type for copy.copy.
Eek. That sounds like a bad idea -- copy.copy() uses introspection and I don't think there's much hope to be able to spell its type. (Also I usually consider the use of copy.copy() a code smell. Perhaps there's a connection. :-)
Sorry, I'm not sold on this. I also worry that the register() calls are hard to track for a type checker -- but that's minor (I actually don't know if this would be a problem for mypy). I just don't see the point in trying to create a type system powerful enough to describe copy.copy().
That sounds like an artificial requirement on the implementation designed to help the type checker. I'm inclined to draw the line well before that point. (Otherwise Raymond Hettinger would throw a fit. :-)
In fact, since the longest() function from your document always returns one of its arguments,
But that was just the shortest way to write such an example. The realistic examples (e.g. URL parsing or construction) aren't that simple.
Now you're just wasting my time. :-)
I'm afraid you've lost me. But (as you may have noticed) I'm not really the one you should be convincing -- if you can convince Jukka to (let you) add something like this to mypy you may have a better case. Even so, I want to limit the complexity of what we add to Python 3.5 -- TBH basic generic types are already pushing the limits. I would much rather be asked to add more stuff to 3.6 than to find out that we've added so much to 3.5 that people can't follow along. Peter Norvig mentioned that the subtleties of co/contra-variance of generic types in Java were too complex for his daughter, and also reminded me that Josh Bloch has said somewhere that he believed they made it too complex.
Yes, but we're not going to change it, and it will be fine.
Right. :-)
But I can see a serious downside as well. There will likely be multiple tools that have to be able to read the type hinting annotations, e.g. IDEs may want to use the type hints (possibly from stub files) for code completion purposes. Also someone might want to write a decorator that extracts the annotations and asserts that arguments match at run time. The more handy shorthands we invent, the more complex all such tools will have to be.
That's how I'm leaning as well.
You may just have killed the idea. Let's keep it simpler.
I know. :-)
How complex does it really have to be? Perhaps Name[Name, Name, ...] is the only form (besides a plain Name) that we really need? Anything more complex can probably be reduced using type aliases. Then again my earlier argument is clearly for keeping things simple, and perhaps an explicit forward declaration is simpler. The run-time representation would still be somewhat problematic. I'll try to remember to report back once I have tried to implement this.
I don't think it's quite a toss-up. A type variable is a special feature. But a forward reference is not much different from a backward reference -- you could easily imagine a language (e.g. C++ :-) where forward references don't require special syntax. The rule that 'X' means the same as X but is evaluated later is pretty simple, whereas the rule that 'X' introduces a type variable is pretty complex. So even if we *didn't* use string quotes for forward references I still wouldn't want to use that syntax for type variables.
Why don't you install mypy and check for yourself? (I expect it's as you desire, but while I have mypy installed, I'm on vacation and my family is asking for my attention.) -- --Guido van Rossum (python.org/~guido)

On Thu, Dec 25, 2014 at 10:41 PM, Guido van Rossum <guido@python.org> wrote:
The real Set cannot be covariant, though, because it supports mutation.
def copy(obj): if isinstance(obj, int): return obj if isinstance(obj, list): return list(obj) ... return obj.__copy__() This does not seem very hard to type. There are much simpler examples, though: a) Keys of Dict and elements of Set must be Hashable, b) To use list.index() list elements must be Comparable, c) Arguments to min() and max() must be Ordered, d) Arguments to sum() must be Addable. So it's not uncommon to have generic functions that need restrictions on type variables. produce the value. programmers to understand, and made all generics in Dart covariant. This was also the case in Beta, whose authors denounced invariance and contravariance, as coming from people "with type-checking background" :-) limited -- there are only as many literals in Python.
Eugene

On Fri, Dec 26, 2014 at 12:00 PM, Eugene Toder <eltoder@gmail.com> wrote:
Well, it copies most class instances by just copying the __dict__. And it recognizes a bunch of other protocols (__copy__ and most pickling interfaces).
I think you are still trying to design a type system that can express all constraints exactly. In practice I doubt if any of the examples you mention here will help catch many bugs in actual Python code; a type checker that is blissfully unaware of these requirements will still be tremendously useful. (I guess this is the point of gradual typing.)
Yeah, but my counter is that Python users today don't write classes like that, and I don't want them to have to change their habits.
I'm not sure I understand why you think that is funny. I think they all have a point.
I think there are at least three separate use cases (note that none are part of the proposal -- the proposal just enables a single notation to be used for all three): (1) Full type checkers like mypy. These have to parse everything without ever running it, so they cannot use the typing module's primitives. They may also have to parse stuff in comments (there are several places where mypy needs a little help and the best place to put it is often in a #type: comment), which rules out Python's ast module. (2) Things that use runtime introspection, e.g. decorators that try to enforce run time correctness. These can use the typing module's primitives. I wish we could just always have (generic) type objects in the annotations, so they could just look up the annotation and then use isinstance(), but I fear that forward refs will spoil that simplicity anyway. (3) IDEs. These typically need to be able to parse code that contains errors. So they end up having their own, more forgiving parser. That's enough distinct cases to make me want to compromise towards a slightly more verbose syntax that requires less special handling. (I am also compromising because I don't want to change CPython's parser and I want to be able to backport typing.py to Python 3.4 and perhaps even 3.3.) -- --Guido van Rossum (python.org/~guido)
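Use case (2) above can be sketched roughly as follows. This is only an illustration under simple assumptions: the decorator name enforce is hypothetical, it checks plain classes only, and it ignores exactly the hard cases mentioned (generic types and string forward references).

```python
import functools
import inspect


def enforce(func):
    """Hypothetical decorator: isinstance-check arguments on every call."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            # *args / **kwargs collect several values; skipped in this sketch.
            if param.kind in (param.VAR_POSITIONAL, param.VAR_KEYWORD):
                continue
            expected = param.annotation
            if expected is param.empty:
                continue  # unannotated parameter
            # Anything that is not a plain class (e.g. a typing generic or
            # a string forward reference) is left alone.
            if isinstance(expected, type) and not isinstance(value, expected):
                raise TypeError(f"{name} must be {expected.__name__}")
        return func(*args, **kwargs)

    return wrapper


@enforce
def repeat(text: str, times: int) -> str:
    return text * times
```

Calling repeat("ab", 2) returns normally, while repeat("ab", "2") raises TypeError before the function body runs.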

On Dec 25, 2014, at 22:49, Eugene Toder <eltoder@gmail.com> wrote:
I'm not sure this problem exists. The builtin set (and therefore the undocumented MyPy/typing TypeAlias Set) has a union method, but its signature is not that restrictive. It takes 1 or more arbitrary iterables of any element type, and of course it returns a set whose element type is the union of the element types of self and those. And the same is true in general for all of the builtin abstract and concrete types. So, the fact that Guido's/Jukka's proposal doesn't make it easy to define types that are more restrictive than you'd normally want to use in Python doesn't seem to be a problem. Sure, if you wanted to define more restricted C++/Swift/Haskell style collections for Python you'd want to be able to type them as easily as in those languages... But why do you want to define those collections? The _opposite_ problem--that it's hard to define the _actual_ type of set.union or similarly highly parameterized types--may be more serious, but Guido acknowledged that one long ago, and I think he's right that it seems like the kind of thing that could be added later. (In the initial MyPy for 3.5 you can always define a one-argument version set[X].union(Iterable[Y])->set[Union[X, Y]] and a generic multi-argument overload that, say, treats all the arguments as Iterable[Any] and returns Set[Any]. If we turn out to need parameter schemas for varargs in real programs, that can surely be added in 3.6 as easily as it could now. And hopefully it won't be needed. You need some kind of complete language to write such schemas in, and C++11 is a nice warning of what it looks like to try to do that declaratively in a non-declarative language.)

On Fri, Dec 26, 2014 at 10:26 AM, Andrew Barnert <abarnert@yahoo.com> wrote: than with a lower bound. So we don't need lower bounds on type variables for collection methods, and maybe at all.
The _opposite_ problem--that it's hard to define the _actual_ type of set.union or similarly highly parameterized types--may be more serious,
Why is it hard? Isn't the actual type just: def union(self, *others: Iterable[Y]) -> Set[Union[X, Y]] where typing of vararg is similar to Java -- all elements must conform to the single type annotation. Also note that I posted set.union method as an example that needs a forward reference to a generic class. I was arguing that if we use strings for forward references, we'll eventually have complicated expressions in those strings, not just class names: class Set(Generic[X]): # Note that the name "Set" is not yet available, so we have to use # a forward reference. This puts the whole return type inside a string. def union(self, *others: Iterable[Y]) -> "Set[Union[X, Y]]": ... Your type for set.union seems to prove the point even better than what I used. Eugene
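The string forward reference in the Set example can be tried out roughly like this. The sketch assumes the typing module as later released (TypeVar, Generic, get_type_hints); MySet is a stand-in name so the builtin is not shadowed.

```python
from typing import Generic, Iterable, TypeVar, Union, get_type_hints

X = TypeVar('X')
Y = TypeVar('Y')


class MySet(Generic[X]):
    def __init__(self, items: Iterable[X] = ()) -> None:
        self.items = set(items)

    # "MySet" is not bound yet while the class body executes, so the
    # whole return annotation has to be written as a string.
    def union(self, *others: Iterable[Y]) -> "MySet[Union[X, Y]]":
        result = set(self.items)
        for other in others:
            result.update(other)
        return MySet(result)


# The raw annotation is still the string as written ...
raw = MySet.union.__annotations__['return']
# ... and a tool can evaluate it later, once every name is defined.
# The current namespace is passed explicitly so the lookup does not
# depend on where this snippet is executed.
resolved = get_type_hints(MySet.union, localns=locals())['return']
```

Here raw is the plain string, while resolved is the generic type object, which illustrates that tools doing runtime introspection need an extra evaluation step whenever strings are used.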

On Dec 26, 2014, at 18:59, Eugene Toder <eltoder@gmail.com> wrote:
I'm not sure you can just skip over that last point without addressing it. In this case, given iterables of element types Y1, Y2, ..., Yn, you can say that they're all type Iterable[Union[Y1, Y2, ..., Yn]]. I _think_ an inference engine can find that Union type pretty easily, and I _think_ that at least for collection methods there won't be any harder problems--but I wouldn't just assume either of those without looking carefully. And it certainly isn't true when we go past collection methods--clearly map and zip can't be handled this way.
Another way to write this, assuming that Set[X][Y] means Set[Y] or that there's some syntax to get from Set[X] to Set, would be to use a typeof(self) operator. Or a special magic __class_being_defined__ constant instead of an operator, or the normal type function with a slightly different meaning at compile time than runtime, or probably other ways to bikeshed it. The point is, at least this example only really needs the type of self, not an arbitrary forward declaration or an expression that has to be crammed into a string. Are there any good examples where that isn't true? Also, should Set.union be contravariant in the generic type Set, or is it always going to return a Set[Something]? The two options there could both easily be handled by type expressions, or maybe with explicit forward declarations, but with implicit forward declaration via string? I know Guido doesn't want to start allowing arbitrary expressions, but a compile-time typeof operator is a pretty simple special case; even pre-ISO C++ had that.

On Fri, Dec 26, 2014 at 2:15 PM, Andrew Barnert <abarnert@yahoo.com> wrote: the type of the expression [Y1, Y2, ...] is List[Union[Y1, Y2, ...]]. Alternatively, the function call is typed as if the argument were replicated as many times as there are actual arguments. Both ways should give the same result, and are already supported in the type checker. This seems intuitive, matches your analysis above, and is implemented in at least C#, Java and Scala, so there's good evidence that this is quite usable. the arguments types. You can go further, and say that str.format() type needs to parse the format string to determine the number and the types of what goes into varargs. I think the simple "all of the same type" rule is good enough to type the majority of uses of varargs, except for argument forwarding into a call. At least it's better than nothing.
Eugene

On Fri, Dec 19, 2014 at 04:55:37PM -0800, Guido van Rossum wrote:
If anyone else is having trouble reading the page in their browser, this seems to work perfectly for me under Linux: wget --no-check-certificate https://quip.com/r69HA9GhGa7J mv r69HA9GhGa7J r69HA9GhGa7J.html links r69HA9GhGa7J.html Using links https://quip.com/r69HA9GhGa7J also works, if you don't mind reading grey text on a light-grey background. -- Steven

On Fri, Dec 19, 2014 at 04:55:37PM -0800, Guido van Rossum wrote:
Very interesting indeed. Some questions, which you may not have answers to yet :-) (1) Under "General rules", you state "No type can be subclassed unless stated." That's not completely clear to me. I presume you are talking about the special types like Union, Generic, Sequence, Tuple, Any etc. Is that correct? (2) Under "Types", you give an example Tuple[t1, t2, ...], a tuple whose items are instances of t1 etc. To be more concrete, a declaration of Tuple[int, float, str] will mean "a tuple with exactly three items, the first item must be an int, the second item must be a float, the third item must be a string." Correct? (3) But there's no way of declaring "a tuple of any length, where each item is of type t". We can declare it as Sequence[t], but cannot specify that it must be a tuple, not a list. Example: class MyStr(str): def startswith(self, prefix:Union[str, ???])->bool: pass There's nothing I can use instead of ??? to capture the current behaviour of str.startswith. Correct? (4) Under "Pragmatics", you say "Don't use dynamic type expressions; use builtins and imported types only. No 'if'." What's the reason for this rule? Will it be enforced by the compiler? (5) I presume (4) above also applies to things like this: if condition(): X = Var('X', str) else: X = Var('X', bytes) # Later on def spam(arg: X)-> X: ... How about this? try: from condition_is_true import X # str except ImportError: from condition_is_false import X # bytes (6) Under "Generic types", you have: X = Var('X'). Declares a unique type variable. The name must match the variable name. To be clear, X is a type used only for declarations, right? By (1) above, it cannot be instantiated? But doesn't that make it impossible to satisfy the declaration? I must be misunderstanding something. I imagine the declaration X = Var("X") to be something equivalent to: class X(type): pass except that X cannot be instantiated. 
(7) You have an example: AnyStr = Var('AnyStr', str, bytes) def longest(a: AnyStr, b: AnyStr) -> AnyStr: Would this be an acceptable syntax? def longest(a:str|bytes, b:str|bytes) -> str|bytes It seems a shame to have to invent a name for something you might only use once or twice. -- Steve

On Fri, Dec 19, 2014 at 11:59 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Yes. (Having to answer this is the price I pay for attempting brevity.)
Yes.
Yes, though there's a proposal to let you write Union[str, Tuple[str, ...]] -- the ... are literally that (i.e. Ellipsis).
No, but it will get you on the static checker's nasty list.
Yes.
Probably also. In mypy there is limited support for a few specific tests, IIRC it has PY3 and PY2 conditions built in. In any case this is all just to make the static checker's life easier (since it won't know whether the condition is true or false at runtime).
It's in support of generic types. Read up on them in the mypy docs about generics.
No. See above.
That was proposed and rejected (though for Union, which is slightly different) because it would require changes to the builtin types to support that operator at runtime. Please do read up on generic types in mypy. http://mypy.readthedocs.org/en/latest/generics.html -- --Guido van Rossum (python.org/~guido)
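The Ellipsis form mentioned for question (3) is how the typing module later came to spell it. A small sketch of the MyStr example using that spelling (the alias name Prefix is made up here, and the optional start/end arguments of the real str.startswith are left out):

```python
from typing import Tuple, Union

# A single str, or a tuple of any length whose items are all str.
Prefix = Union[str, Tuple[str, ...]]


class MyStr(str):
    def startswith(self, prefix: Prefix) -> bool:
        # Delegates to str.startswith, which accepts both forms.
        return super().startswith(prefix)
```

Note that Prefix is an ordinary global assignment, i.e. a type alias of the kind discussed earlier in the thread.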

On Sat, Dec 20, 2014 at 11:08 AM, Guido van Rossum <guido@python.org> wrote:
For a very good reason - static type checking means we are evaluating types *statically*, before a program is run. (Otherwise we are doing dynamic type checking. Python already does a good job of that!) The Python compiler is completely oblivious to any of this of course - nothing is changing with it in this proposal for 3.5. Only mypy and other possible static type checkers and related tools like IDEs will complain. Or you might be able to get this information at runtime, which can help support gradual typing.
Note that in theory it's possible to define compile-time expressions, as has been done in C++ 11. ( http://en.wikipedia.org/wiki/Compile_time_function_execution is a good starting point.) Except for possibly the simplest cases, such as baked-in support for Py3/Py2, let's not do that.
The other thing is AnyStr is a type variable, not a type definition to shorten writing out types. So AnyStr acts as a constraint that must be satisfied for the program to type check as valid. (A very simple constraint.) You can think of it as supporting two possible signatures for the longest function from the caller's perspective: def longest(a: str, b: str) -> str def longest(a: bytes, b: bytes) -> bytes But not some mix; this would be invalid if the caller of longest tried to use it in this way: def longest(a: str, b: str) -> bytes
Please do read up on generic types in mypy. http://mypy.readthedocs.org/en/latest/generics.html
The linked example of a Stack class demonstrates how the type variable can work as a constraint for a class, across its methods, similar to what one might see in Java or Scala. -- - Jim jim.baker@{colorado.edu|python.org|rackspace.com|zyasoft.com}
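For readers who don't want to follow the link, a generic Stack along those lines might look like the sketch below. It is written against the typing module as later released and is not copied from the mypy docs; the method names are assumptions.

```python
from typing import Generic, List, TypeVar

T = TypeVar('T')


class Stack(Generic[T]):
    """The same T ties the methods together: what goes in comes out."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        return self._items.pop()

    def empty(self) -> bool:
        return not self._items


numbers = Stack[int]()
numbers.push(1)
numbers.push(2)
top = numbers.pop()
```

A static checker would treat top as an int and would flag numbers.push("x"); at runtime nothing is checked and the class behaves like any other.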

Maybe I'm missing something, but wouldn't type hinting as it's defined now break "virtual subclassing" of ABC? For example, given the following code: from collections.abc import Sequence class MySequence: ... Sequence.register(MySequence) it seems to me like the following would work: def foo(bar): if not isinstance(bar, Sequence): raise RuntimeError("Foo can only work with sequences") ... but when rewritten for static type checking def foo(bar: Sequence): ... it would cease to work. At least I don't see a way a static type checker could handle this reliably (the register call might occur anywhere, after all). Is this intentional? Even if this might be more a mypy/implementation question, it should be clear to users of typing.py if they should expect ABCs to break or not

On Sat, Dec 20, 2014 at 2:00 PM, Dennis Brakhane <brakhane@googlemail.com> wrote:
Well, the program will still run fine with CPython, assuming you keep the isinstance() check in your program. The argument declaration does not cause calls with non-Sequence arguments to be rejected, it just guides the static checker (which is a program that you must run separately, in a similar way as a linter -- it is not built into CPython). The static checker may reject the call foo(MySequence()) if it cannot see that MySequence is a Sequence. You have two options there: if you can change the code of MySequence, you should just inherit from Sequence rather than using the register() call; if you cannot change the code of MySequence, you should use a *stub* module which declares that MySequence is a Sequence. You can read up on stub modules in the mypy docs: http://mypy.readthedocs.org/en/latest/basics.html#library-stubs -- --Guido van Rossum (python.org/~guido)
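The difference between the two options is visible at runtime too. In this sketch the class names Registered and Inherited are made up; register() only affects isinstance() and issubclass(), while real inheritance is something a static checker can see in the class statement.

```python
from collections.abc import Sequence


class Registered:
    """Virtual subclass: made known to the ABC only via register()."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, index):
        return self._data[index]

    def __len__(self):
        return len(self._data)


Sequence.register(Registered)


class Inherited(Sequence):
    """Real subclass: the relationship is part of the class statement."""

    def __init__(self, data):
        self._data = list(data)

    def __getitem__(self, index):
        return self._data[index]

    def __len__(self):
        return len(self._data)
```

Both pass isinstance(obj, Sequence), but only Inherited picks up the mixin methods such as index() and count(), and only there is the relationship stated where a tool reading the source can find it without tracking register() calls.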

A silly question. I use Python 3 annotations for argument checking. My assumption is very simple: for def f(arg: checker): pass the checker will raise ValueError or TypeError if arg is not correct. I do it by a `checker(arg)` call. I use this in aiozmq rpc (http://aiozmq.readthedocs.org/en/0.5/rpc.html#signature-validation) for example, and checkers from trafaret (https://github.com/Deepwalker/trafaret) work fine. Will the proposed change break my workflow? On Sun, Dec 21, 2014 at 6:53 PM, Guido van Rossum <guido@python.org> wrote:
-- Thanks, Andrew Svetlov

Your workflow won't necessarily break. You need to use a tool which expects type hints for function annotations to cause you any problems. If you simply don't use such a tool then you will have no problems. On Sun, Dec 21, 2014, 11:55 Andrew Svetlov <andrew.svetlov@gmail.com> wrote:

Sorry, I want to ask again. Is the proposal for static checks only? My expectations for processing annotations at runtime as-is (just a mark without any restrictions) will not change? On Sun, Dec 21, 2014 at 10:11 PM, Brett Cannon <brett@python.org> wrote:
-- Thanks, Andrew Svetlov

The proposal is to standardize how to specify type hints. The *assumption* is that static type checkers will use the type hints. Nothing is being forced to occur at runtime; Guido's proposal is outlining how static tools that consume the type hints and people that use them should expect things to work. On Sun, Dec 21, 2014, 12:32 Andrew Svetlov <andrew.svetlov@gmail.com> wrote:

On 22 December 2014 at 06:32, Andrew Svetlov <andrew.svetlov@gmail.com> wrote:
Correct, there are no changes being proposed to the runtime semantics of annotations. The type hinting proposal describes a conventional use for them that will be of benefit to static type checking systems and integrated development environments, but it will be exactly that: a convention, not an enforced behaviour. The convention of treating "_" prefixed methods and other attributes as private to the implementation of a class or module is a good example of a similar approach. While some things (like pydoc and wildcard imports) will respect the convention, it's not enforced at the core language level - if a developer decides they're prepared to accept the compatibility risk, then they're free to use the "private" attribute if they choose to do so. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
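A tiny demonstration of "convention, not enforced behaviour": the interpreter stores annotations on the function object and otherwise ignores them. The function name double is made up for this sketch.

```python
def double(x: int) -> int:
    return x * 2


# The hints are recorded on the function object ...
recorded = double.__annotations__

# ... but nothing stops a call that contradicts them.
result = double("ab")
```

Here result is the string "abab"; only a separately run static checker (or a decorator that chooses to inspect the annotations) would complain about the call.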

On Sun, Dec 21, 2014 at 3:47 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
Well... The elephant in the room is that *eventually* other uses of annotations *may* be frowned upon, or may need to be marked by some decorator. But I promise that in Python 3.5 your code will not break -- it just might not be very useful to run a static checker like mypy on it. (IIRC mypy used to only typecheck code that imports the typing.py module, but this seems to have changed.) When we discussed this earlier this year, a few other uses of annotations were brought up, and some have proposed that static type annotations would need to be marked by a decorator. There is even a proposed syntax that allows multiple annotations to coexist for the same argument (a dict with fixed keys -- to me it looks pretty ugly though). I would really like to wait and see how this plays out -- the proposal I'm working on is careful not to have any effect at runtime (as long as the typing.py module can be imported and as long as the annotation expressions don't raise exceptions), and use of a static checker is entirely optional and voluntary. Perhaps the PEP should define some way to tell the type checker not to follow certain imports? That would be useful in case you have a program that tries to follow the annotation conventions for static checking but imports some library that uses annotations for a different purpose. You can probably do this already for mypy by writing a stub module. -- --Guido van Rossum (python.org/~guido)

On 22 December 2014 at 14:05, Guido van Rossum <guido@python.org> wrote:
Agreed - I see this as a good, incremental evolution from the increased formality and flexibility in the type system that was introduced with ABCs, and I think that's been successful in letting folks largely not need to worry about them unless they actually need to deal with the problems they solve (like figuring out whether a container is a Sequence or Mapping). Ideally we'll get the same result here - folks that have problems where type annotations can help will benefit, while those that don't need them won't need to worry about them.
For inline use, it may be worth defining a module level equivalent to pylint's "#pylint: skip-file" comments (rather than having each static checker come up with its own way of spelling that). Aside from that, it may also be worth recommending that static type checkers provide clear ways to control the scope of scanning (e.g. by allowing particular directories and files to be excluded, or by limiting scans to particular directories). In both cases, the PEP would likely need to define the implied annotations to be assumed for excluded modules. I realised in trying to write this email that I don't currently understand the consequences of not having annotation data available for a module in terms of the ripple effects that may have on what scanners can check - from the end user perspective, I believe that's something I'd need to know, even though I wouldn't necessarily need to know *why* those were the default assumptions for unannotated operations. (I'm curious about the latter from a language *design* perspective, but I think I'd be able to use the feature effectively just by knowing the practical consequences without necessarily understanding the theory) Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On Sun, Dec 21, 2014 at 9:35 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
That's what "Any" is for. If an object's type is "Any", then every operation on it (including getattr) also returns "Any", from the checker's POV. You stop the ripples with an explicitly declared type -- e.g. if you pass it to a function as an arg with a type annotation, inside that function the argument is assumed to have the declared type (but the "Any" prevents diagnostics about the call site). In mypy there is also an explicit cast operator (http://mypy.readthedocs.org/en/latest/casts.html), we may add this to the PEP (it's one of the many details left out from the Quip doc that need to be filled in for the actual PEP). The importance of "Any" cannot be overstated -- without it, you either have to declare types everywhere, or you end up with an "everything inherits from everything" situation. The invention of "Any" prevents this (and this is why is-consistent-with cannot be transitive). Read Jeremy Siek's Gradual Typing blog post for more. Also, consider the important difference between Any and object. They are both at the top of the class tree -- but object has *no* operations (well, almost none -- it has repr() and a few others), while Any supports *all* operations (in the sense of "is allowed by the type system/checker"). This places Any also at the *bottom* of the class tree, if you can call it that. (And hence it is more a graph than a tree -- but if you remove Any, what's left is a tree again.) Any's "contagiousness" is somewhat similar to NaN for floats, but it has its limits -- e.g. repr() of Any is still a string, and Any==Any is still a bool. Another similar concept exists IIUC in Objective-C, which has a kind of null object that can be called (sent a message), always returning another null result. (This has occasionally been proposed for Python's None, but I don't think it is appropriate.) But of course, Any only exists in the mind of the static checker -- at runtime, the object has a concrete type. 
"Any" is just the type checker's way to say "I don't know the type, and I don't care". I bet compilers with good error recovery have some similar concept internally, used to avoid a cascade of errors due to a single typo. -- --Guido van Rossum (python.org/~guido)

On Mon, Dec 22, 2014 at 8:02 AM, Guido van Rossum <guido@python.org> wrote: [...]
limits -- e.g. repr() of Any is still a string, and Any==Any is still a bool.
Why is Any==Any a bool? Comparison operators can return anything, and libraries like Numpy or SQLAlchemy take advantage of this (comparing Numpy arrays results in an array of bools, and comparing a SQLAlchemy Column to something results in a comparison expression, e.g. `query.filter(table.id == 2)`). Would type checkers be expected to reject these uses?
Why is this limited to None? Can this be extended to "if there's a default argument, then that exact object is also allowed"? That would allow things like:

    _MISSING = object()
    def getattr(key: str, default: VT=_MISSING) -> VT: ...
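The comparison question above can be made concrete with a small sketch. Column here is a hypothetical stand-in written for illustration, not SQLAlchemy's or NumPy's real class; it only shows that `==` can legitimately return something other than a bool at runtime, which is the behaviour a checker assuming "Any==Any is a bool" would have to cope with.

```python
class Column:
    """Hypothetical ORM-style column; not a real library class."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Return a description of the comparison instead of a bool,
        # in the spirit of query.filter(table.id == 2).
        return ("eq", self.name, other)


expr = Column("id") == 2  # a tuple describing the comparison, not True/False
```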

On Mon, Dec 22, 2014 at 3:54 AM, Petr Viktorin <encukou@gmail.com> wrote:
Never mind, I forgot about that. :-)
Well, that would require the type checker to be smarter. I don't want to go down the slippery path of defining compile-time expressions. Allowing None is easy (it's a keyword), useful and covers the common cases. -- --Guido van Rossum (python.org/~guido)

On 22 Dec 2014 17:02, "Guido van Rossum" <guido@python.org> wrote:
Also, consider the important difference between Any and object. They are
both at the top of the class tree -- but object has *no* operations (well, almost none -- it has repr() and a few others), while Any supports *all* operations (in the sense of "is allowed by the type system/checker"). Ah, this was the key piece I had missed, both from your article and Jeremy's: since Any conceptually allows all operations, with a result of Any, there's no need to separately represent "callable that permits arbitrary arguments, returning a result of unknown type". Very nice! Regards, Nick.
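A minimal sketch of the Any/object distinction, written against the typing module as it later shipped in the standard library (the draft typing.py discussed in this thread may have differed). The function names are made up for illustration, and the comments describe what a checker would be expected to do rather than anything enforced at runtime.

```python
from typing import Any


def shout(x: Any) -> str:
    # Every operation on an Any value is allowed by a checker,
    # so .upper() is not flagged here.
    return x.upper() + "!"


def describe(x: object) -> str:
    # Only operations that every object supports (such as repr()) are
    # allowed on "object"; x.upper() would be expected to be rejected.
    return repr(x)


def call_with_pair(f: Any) -> Any:
    # Calling an Any value is just another allowed operation, so no
    # separate "callable with arbitrary arguments" type is needed.
    return f(1, 2)
```

At runtime the annotations do nothing; each argument still has its concrete type.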

On Dec 21, 2014, at 23:02, Guido van Rossum <guido@python.org> wrote:
Any's "contagiousness" is somewhat similar to NaN for floats, but it has its limits -- e.g. repr() of Any is still a string, and Any==Any is still a bool. Another similar concept exists IIUC in Objective-C, which has a kind of null object that can be called (sent a message), always returning another null result. (This has occasionally been proposed for Python's None, but I don't think it is appropriate.)
I think you're mixing up two things here. ObjC does have a concept pretty close to Any, but it's not nil, it's id. This is a type that's never seen at runtime, but at compile time, in practice,* it's both a top and bottom type** (any method can be called on an object declared as id; any argument can be passed to a covariant or contravariant parameter of type id; a value declared id can be stored or passed anywhere), and subtyping is transitive except as far as id is concerned.

But nil is a whole different thing. Dynamically, it's similar to Python's None, except that it responds to every method with nil. Statically, every type is effectively an Optional[T], and can hold either a T (or subclass) instance or nil.*** The static type of nil might be id, but it doesn't really matter, because that never comes up.**** The dynamic type of nil is nil.*****

* In reality, this relies on the fact that ObjC objects are always referenced by C pointers, and id is effectively just a typedef for void*, and C allows any pointer to be implicitly cast to and from void*.

** Except when native C types get involved. Traditional ObjC waved that away by assuming that every C type except double and long long was the same size as a pointer and you don't use those two very often; the 64-bit transition screwed that up, and the compiler no longer lets you mix up ObjC and native C types.

*** Again, this is because you always reference ObjC objects as C pointers, and C pointers can always accept null values.

**** Unless you explicitly use the compile-time typeof or sizeof operators on nil, which there's no good reason to do. If you do, then you'll see it's defined as (id)0.

***** Because the equivalent of the type function is a method.

On Wed, Dec 24, 2014 at 3:22 PM, Eugene Toder <eltoder@gmail.com> wrote:
Thanks, that was a simple "thinko". I meant that without Any (and perhaps barring some other pathological cases due to Python's extreme malleability) it's a DAG, but with Any it has a cycle. I've also updated the Quip doc to speak of "class graph". -- --Guido van Rossum (python.org/~guido)

On 12/21/2014 08:05 PM, Guido van Rossum wrote:
What I ended up doing for my scription program was to move the annotations outside the def, and store them in a different attribute, applied with a decorator:

    @Command(
        file=('source file', ),
        dir=('target directory', ),
        options=('extra options', MULTI, ),
        )
    def copy(file, dir, options):
        pass

    copy.__scription__ --> {'file':..., 'dir':..., 'options':...}

-- ~Ethan~
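One way such a decorator could work is sketched below. This is an assumption-laden illustration, not scription's actual implementation: the MULTI value is a placeholder, and the real library may store richer objects than the raw tuples kept here.

```python
MULTI = "multi"  # placeholder flag; the real scription constant may differ


def Command(**specs):
    """Attach per-parameter metadata to a function without using annotations."""

    def decorate(func):
        # Keep the metadata in a separate attribute so that
        # func.__annotations__ stays free for type hints.
        func.__scription__ = dict(specs)
        return func

    return decorate


@Command(
    file=("source file",),
    dir=("target directory",),
    options=("extra options", MULTI),
)
def copy(file, dir, options):
    pass
```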

Guido van Rossum <guido@...> writes:
A few months ago we had a long discussion about type hinting. I've thought a [...]
(I apologize in advance if some of this was covered in previous discussions).

1. Since there's the Union type, it's also natural to have the Intersection type. A class is a subclass of Intersection[t1, t2, ...] if it's a subclass of all t1, t2 etc. There are 2 main uses of the Intersection type:

a) Require that an argument implements multiple interfaces:

    class Foo:
        @abstractmethod
        def foo(self): ...

    class Bar:
        @abstractmethod
        def bar(self): ...

    def fooItWithABar(obj: Intersection[Foo, Bar]): ...

b) Write the type of an overloaded function:

    @overload
    def foo(x: str) -> str: ...
    @overload
    def foo(x: bytes) -> bytes: ...

    foo # type: Intersection[Callable[[str], str], Callable[[bytes], bytes]]

2. Constrained type variables (Var('Y', t1, t2, ...)) have a very unusual behavior.

a) "subclasses of t1 etc. are replaced by the most-derived base class among t1 etc." This defeats the very common use of constrained type variables: have a type preserving function limited to classes inherited from a common base. E.g. say we have a function:

    def relocate(e: Employee) -> Employee: ...

The function actually always returns an object of the same type as the argument, so we want to write a more precise type. We usually do it like this:

    XEmployee = Var('XEmployee', Employee)
    def relocate(e: XEmployee) -> XEmployee: ...

This won't work with the definition from the proposal.

b) Multiple constraints introduce an implicit Union. I'd argue that type variables are more commonly constrained by an Intersection rather than a Union. So it will be more useful if given this definition Y has to be compatible with all of t1, t2 etc, rather than just one of them. Alternatively, this can be always spelled out explicitly:

    Y1 = Var('Y1', Union[t1, t2, ...])
    Y2 = Var('Y2', Intersection[t1, t2, ...])

Pragmatics:

3. The names Union and Intersection are standard terminology in type checking, but may not be familiar to many Python users. Names like AnyOf[] and AllOf[] can be more intuitive.

4. Similar to allowing None to mean type(None) it's nice to have shortcuts like:

    (t1, t2, ...) == Tuple[t1, t2, ...]
    [t1] == List[t1]
    {t1: t2} == Dict[t1, t2]
    {t1} == Set[t1]

The last 3 can be Sequence[t1], Mapping[t1, t2] and collections.Set[t1] if we want to encourage the use of abstract types.

5. Using strings for forward references can be messy in case of generics: parsing of brackets etc in the string will be needed. I propose explicit forward declarations:

    C = Declare('C')
    class C(Generic[X]):
        def foo(self, other: C[X]): ...
        def bar(self, other: C[Y]): ...

6. On the other hand, using strings for unconstrained type variables is quite handy, and doesn't share the same problem:

    def head(xs: List['T']) -> 'T': ...

Regards, Eugene

On Wed, Dec 24, 2014 at 4:50 PM, Eugene Toder <eltoder@gmail.com> wrote:
No problem. :-) I apologize for reformatting the text I am quoting from you, it looked as if it was sent through two different line clipping functions.
Yes, we even have an issue to track this proposal. I don't recall who suggested it first. I don't know if it poses any problems to the static checker (though I doubt it). https://github.com/ambv/typehinting/issues/18
The static checker can figure that out for itself, but that doesn't mean we necessarily need a way to spell it.
I just copied this from mypy (where it is called typevar()). I guess in that example one would use an *unconstrained* type variable. The use case for the behavior I described is AnyStr -- if I have a function like this I don't want the type checker to assume the more precise type:

    def space_for(s: AnyStr) -> AnyStr:
        if isinstance(s, str): return ' '
        else: return b' '

If someone defined a class MyStr(str), we don't want the type checker to think that space_for(MyStr(...)) returns a MyStr instance, and it would be impossible for the function to even create an instance of the proper subclass of str (it can get the class object, but it can't know the constructor signature). For strings, functions like this (which return some new string of the same type as the argument, constrained to either str or bytes) are certainly common. And for your Employee example it would also seem problematic for the function to know how to construct an instance of the proper (dynamically known) subclass.

b) Multiple constraints introduce an implicit Union. I'd argue that type [...]
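The AnyStr case can be written out as runnable code using the typing module as it eventually shipped, where the Var() of this thread is spelled TypeVar. MyStr is the hypothetical subclass from the paragraph above; the sketch only demonstrates the runtime behaviour that motivates the rule.

```python
from typing import TypeVar

# A type variable constrained to exactly str or bytes.
AnyStr = TypeVar("AnyStr", str, bytes)


def space_for(s: AnyStr) -> AnyStr:
    # Returns a new string of the same basic kind as the argument.
    if isinstance(s, str):
        return " "
    return b" "


class MyStr(str):
    """Hypothetical str subclass, as in the discussion above."""


result = space_for(MyStr("abc"))  # a plain str, not a MyStr
```

Because the function builds a fresh literal, the result for a MyStr argument is a plain str, which is why inferring MyStr as the return type would be wrong.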
Well, maybe. At this point you'd have to point us to a large body of evidence -- mypy has done well so far with its current definition of typevar(). OTOH one of the ideas on the table is to add keyword options to Var(), which might make it possible to have type variables with different semantics. There are other use cases, some of which are discussed in the tracker: https://github.com/ambv/typehinting/issues/18
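For comparison, the typing module that later shipped does offer a keyword option on TypeVar, `bound=`, which gives the type-preserving behaviour asked for in the Employee example. The sketch below assumes that later API rather than anything settled in this thread, and the Employee and Manager classes are invented for illustration.

```python
from typing import TypeVar


class Employee:
    def __init__(self, name: str, city: str = "HQ") -> None:
        self.name = name
        self.city = city


class Manager(Employee):
    pass


# Any subclass of Employee is accepted, and the same subclass is returned.
XEmployee = TypeVar("XEmployee", bound=Employee)


def relocate(e: XEmployee, city: str) -> XEmployee:
    # Mutates and returns the same object, so the precise type is preserved
    # without the function having to construct a new instance.
    e.city = city
    return e


boss = Manager("Ann")
moved = relocate(boss, "Oslo")
```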
I strongly disagree with this. Python's predecessor, ABC, used a number of non-standard terms for common programming language concepts, for similar reasons. But the net effect was just that it looked weird to anyone familiar with other languages, and for the users who were a completely blank slate, well, "HOW-TO" was just as much jargon that they had to learn as "procedure". Also, the Python users who will most likely need to learn about this stuff are most likely library developers.
This was proposed as the primary notation during the previous round of discussions here. You are right that if we propose to "fix up" type annotations that appear together with a default value we should also be able in principle to change these shortcuts into the proper generic type objects. Yet I am hesitant to adopt the suggestion -- people may already be using e.g. dictionaries as annotations for some other purpose, and there is the question you bring up whether we should promote these to concrete or abstract collection types. Also, I should note that, while I mentioned it as a possibility, I am hesitant to endorse the shortcut of "arg: t1 = None" as a shorthand for "arg: Union[t1, None] = None" because it's unclear whether runtime introspection of the __annotations__ object should return t1 or the inferred Union object. (The unspoken requirement here is that there will be no changes to CPython's handling of annotations -- the typing.py module will be all that is needed, and it can be backported to older Python versions.)
Agreed this is an area that needs more thought. In mypy you can actually write the entire annotation in string quotes -- mypy has to be able to parse type expressions anyway (in fact it has to be able to parse all of Python :-). I do think that the example you present feels rather obscure.
Yeah, it does look quite handy, if the ambiguity with forward references can be resolved. Also it's no big deal to have to declare a type variable -- you can reuse them for all subsequent function definitions, and you usually don't need more than two or three. -- --Guido van Rossum (python.org/~guido)

On Wed, Dec 24, 2014 at 08:16:52PM -0800, Guido van Rossum wrote:
I presume that runtime name binding will be allowed, e.g.

    from typing import Union as AnyOf
    def spam(x: AnyOf[int, float])->str: ...

but not encouraged. (It's not that hard to learn a few standard names like Union.) So the above would work at runtime, but at compile time, it will depend on the specific linter or type checker: it will be a "quality of implementation" issue, with simple tools possibly not being able to recognise AnyOf as being the same as Union. Is this what you have in mind? -- Steven

On Thu, Dec 25, 2014 at 6:43 AM, Steven D'Aprano <steve@pearwood.info> wrote:
I would not recommend that to anyone -- I find that use of "import ... as" is often an anti-pattern or a code smell, and in this case it would seem outright silly to fight the standard library's terminology (assuming typing.py defines Union). I don't know if mypy supports this (it's easy to try it for yourself though) but I do know it follows simple global assignments, known as type aliases, e.g. "foo = Iterator[int]". -- --Guido van Rossum (python.org/~guido)
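A short sketch of the type-alias style mentioned above, using the typing module as later released. The alias names Number and IntIterator are illustrative only.

```python
from typing import Iterator, Union

# Type aliases are ordinary global assignments.
Number = Union[int, float]
IntIterator = Iterator[int]


def spam(x: Number) -> str:
    return str(x)


def countdown(n: int) -> IntIterator:
    # Yields n, n-1, ..., 1.
    while n > 0:
        yield n
        n -= 1
```

At runtime an alias is just another name for the same object, so no renaming of Union itself is involved.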

On Wed, Dec 24, 2014 at 11:16 PM, Guido van Rossum <guido@python.org> wrote:
I thought more about this, and I think I understand what you are after. The syntax confused me somewhat. I also recalled that type variables may need lower bounds in addition to upper bounds. Is AnyStr the main use case of this feature? If that's the case, there are other ways to achieve the same effect with more general features.

First, some motivational examples. Say we have a class for an immutable list:

    class ImmutableList(Generic[X]):
        def prepend(self, item: X) -> ImmutableList[X]: ...

The type of prepend is actually too restrictive: we should be able to add items that are a superclass of X and get a list of that more general type:

    Y = Var('Y') >= X  # must be a superclass of X
    class ImmutableList(Generic[X]):
        def prepend(self, item: Y) -> ImmutableList[Y]: ...

Alternative syntax for Y, based on the Java keyword:

    Y = Var('Y').super(X)

This will be handy to give better types to some methods of tuple and frozenset.

Next, let's try to write a type for copy.copy. There are many details that can be changed, but here's a sketch. Naturally, it should be

    def copy(obj: X) -> X: ...

But copy doesn't work for all types, so there must be some constraints on X:

    X = Var('X')
    X <= Copyable[X]  # must be a subclass of Copyable[X]

(Alternative syntax: X.extends(Copyable[X]); this also shows why constraints are not listed in the constructor.) Copyable is a protocol:

    @protocol
    class Copyable(Generic[X]):
        def __copy__(self: X) -> X: ...

And for some built-in types:

    Copyable.register(int)
    Copyable.register(str)
    ...

This approach can be used to type functions that special-case built-in types, and rely on some known methods for everything else. In my example with XEmployee the function could either return its argument, or make a copy -- the Employee class can require all its subclasses to implement some copying protocol (e.g. a copy() method).
In fact, since the longest() function from your document always returns one of its arguments, its type can be written as:

    X = Var('X') <= Union[str, bytes]
    def longest(a: X, b: X) -> X: ...

that is, it doesn't need to restrict the return type to str or bytes :-)

Finally, the feature from your document:

    AnyStr = Var('AnyStr').restrictTo(str, bytes)  # the only possible values

However, this can be achieved by adding more useful features to protocols:

    # explicit_protocol is a non-structural protocol: only explicitly registered
    # types are considered conforming. This is very close to type classes.
    # Alternatively, protocols with no methods can always be explicit.
    @explicit_protocol
    class StringLike(Generic[X]):
        # This type can be referenced like a class-level attribute.
        # The name "type" is not special in any way.
        type = X

    StringLike.register(str)
    StringLike.register(bytes)

    AnyStr = Var('AnyStr')
    AnyStr <= StringLike[AnyStr]
    AnyStrRet = StringLike[AnyStr].type
    def space_for(x: AnyStr) -> AnyStrRet: ...

There are many details that can be tweaked, but it is quite powerful, and solves the simpler problem as well.

[...] that use union type seem to use t1|t2 syntax for it. AFAIU this syntax was rejected to avoid changes in CPython. This is a shame, because it is widespread and reads really well:

    def foo(x: Some|Another): ...

Also, Type|None is so short and clear that there's no need for the special Optional[] shorthand. Given we won't use |, I think

    def foo(x: AnyOf[Some, Another]): ...

reads better than

    def foo(x: Union[Some, Another]): ...

but this may be getting into the bikeshedding territory :-)

[...] these annotations to proper generic types. This should be done internally in the type checker. If we want other tools to understand this syntax, we can expose functions typing.isTypeAnnotation(obj) and typing.canonicalTypeAnnotation(obj). With this approach, I don't believe this use of lists and dicts adds any more problems for the existing uses of annotations.
The decision of whether to use concrete or abstract types is likely not a hard one. Given my experience, I'd use concrete types, because they are so common. But this does depend on the bigger context of how annotations are expected to be used.
    class Set(Generic[X]):
        def union(self, other: Set[X]) -> Set[X]: ...

While on the subject: what are the scoping rules for type variables? I hope they are lexically scoped: the names used in the enclosing class or function are considered bound to those values, rather than fresh variables that shadow them. I used this fact in the examples above. E.g. union() above accepts only the sets with the same elements, not with any elements, and in

    def foo(x: X) -> X:
        def bar(y: X) -> X: return y
        return bar(x)

X in bar() must be the same type as in foo().

Eugene

On Thu, Dec 25, 2014 at 1:49 PM, Eugene Toder <eltoder@gmail.com> wrote:
mypy solves that using @overload in a stub file. That's often more precise.
Hm, looks like the case for Intersection is still pretty weak. Anyway, we can always add stuff later. But whatever we add in 3.5 we cannot easily take back.
Yes, that's the issue I meant.
I don't know if this is the main use case (we should ask Jukka when he's back from vacation). I'm hesitant to propose more general features without at least one implementation. Perhaps you could try to see how easy those more general features would be implementable in mypy?
Neither syntax is acceptable to me, but let's assume we can do this with some other syntax. Your example still feels like it was carefully constructed to prove your point -- it would make sense in a language where everything is type-checked and types are the basis for everything, and users are eager to push the type system to its limits. But I'm carefully trying to avoid moving Python in that direction.
This will be handy to give better types to some methods of tuple and frozenset.
I assume you're talking about the case where e.g. I have a frozenset of Managers and I use '+' to add an Employee; we then know that the result is a frozenset of Employees. But if we assume covariance, that frozenset of Managers is also a frozenset of Employees, so (assuming we have a way to indicate covariance) the type-checker should be able to figure this out. Or are you perhaps trying to come up with a way to spell covariance? (The issue #2 above has tons of discussion about that, although I don't think it comes to a clear conclusion.)
Next, let's try to write a type for copy.copy.
Eek. That sounds like a bad idea -- copy.copy() uses introspection and I don't think there's much hope to be able to spell its type. (Also I usually consider the use of copy.copy() a code smell. Perhaps there's a connection. :-)
Sorry, I'm not sold on this. I also worry that the register() calls are hard to track for a type checker -- but that's minor (I actually don't know if this would be a problem for mypy). I just don't see the point in trying to create a type system powerful enough to describe copy.copy().
That sounds like an artificial requirement on the implementation designed to help the type checker. I'm inclined to draw the line well before that point. (Otherwise Raymond Hettinger would throw a fit. :-)
In fact, since the longest() function from your document always returns one of its arguments,
But that was just the shortest way to write such an example. The realistic examples (e.g. URL parsing or construction) aren't that simple.
Now you're just wasting my time. :-)
I'm afraid you've lost me. But (as you may have noticed) I'm not really the one you should be convincing -- if you can convince Jukka to (let you) add something like this to mypy you may have a better case. Even so, I want to limit the complexity of what we add to Python 3.5 -- TBH basic generic types are already pushing the limits. I would much rather be asked to add more stuff to 3.6 than to find out that we've added so much to 3.5 that people can't follow along. Peter Norvig mentioned that the subtleties of co/contra-variance of generic types in Java were too complex for his daughter, and also reminded me that Josh Bloch has said somewhere that he believed they made it too complex.
Yes, but we're not going to change it, and it will be fine.
Right. :-)
But I can see a serious downside as well. There will likely be multiple tools that have to be able to read the type hinting annotations, e.g. IDEs may want to use the type hints (possibly from stub files) for code completion purposes. Also someone might want to write a decorator that extracts the annotations and asserts that arguments match at run time. The more handy shorthands we invent, the more complex all such tools will have to be.
That's how I'm leaning as well.
You may just have killed the idea. Let's keep it simpler.
I know. :-)
How complex does it really have to be? Perhaps Name[Name, Name, ...] is the only form (besides a plain Name) that we really need? Anything more complex can probably be reduced using type aliases. Then again my earlier argument is clearly for keeping things simple, and perhaps an explicit forward declaration is simpler. The run-time representation would still be somewhat problematic. I'll try to remember to report back once I have tried to implement this.
I don't think it's quite a toss-up. A type variable is a special feature. But a forward reference is not much different from a backward reference -- you could easily imagine a language (e.g. C++ :-) where forward references don't require special syntax. The rule that 'X' means the same as X but is evaluated later is pretty simple, whereas the rule the 'X' introduces a type variable is pretty complex. So even if we *didn't* use string quotes for forward references I still wouldn't want to use that syntax for type variables.
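The rule that 'X' means the same as X but is evaluated later can be illustrated with the typing module as it eventually shipped, whose get_type_hints() helper resolves string annotations on request. Node is a made-up class, and the explicit localns argument is only there to keep the sketch independent of where it is executed.

```python
from typing import Optional, get_type_hints


class Node:
    # "Node" does not exist yet while the class body runs, so the
    # annotation is written as a string (a forward reference).
    def __init__(self, value: int, next: "Optional[Node]" = None) -> None:
        self.value = value
        self.next = next


# At runtime the annotation is stored as the raw string ...
raw = Node.__init__.__annotations__["next"]

# ... and is only turned into a type object when evaluated later.
resolved = get_type_hints(
    Node.__init__, localns={"Node": Node, "Optional": Optional}
)["next"]
```

This also shows the run-time representation problem mentioned earlier: tools reading `__annotations__` directly see a string, not a type.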
Why don't you install mypy and check for yourself? (I expect it's as you desire, but while I have mypy installed, I'm on vacation and my family is asking for my attention.) -- --Guido van Rossum (python.org/~guido)

On Thu, Dec 25, 2014 at 10:41 PM, Guido van Rossum <guido@python.org> wrote:
The real Set cannot be covariant, though, because it supports mutation.
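Why mutation rules out covariance can be shown with a small runnable sketch. Employee, Manager and hire() are illustrative names; the comments describe what would go wrong if a checker treated a mutable Set as covariant.

```python
from typing import Set


class Employee:
    pass


class Manager(Employee):
    pass


def hire(staff: Set[Employee]) -> None:
    # Perfectly valid for a set of Employees.
    staff.add(Employee())


managers: Set[Manager] = {Manager()}

# If Set[Manager] were accepted wherever Set[Employee] is expected
# (covariance), a checker would allow this call ...
hire(managers)

# ... and the "set of Managers" would now contain a plain Employee.
non_managers = [m for m in managers if not isinstance(m, Manager)]
```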
    def copy(obj):
        if isinstance(obj, int): return obj
        if isinstance(obj, list): return list(obj)
        ...
        return obj.__copy__()

This does not seem very hard to type. There are much simpler examples, though:

a) Keys of Dict and elements of Set must be Hashable,
b) To use list.index() list elements must be Comparable,
c) Arguments to min() and max() must be Ordered,
d) Arguments to sum() must be Addable.

So it's not uncommon to have generic functions that need restrictions on type variables.

[...] produce the value.

[...] programmers to understand, and made all generics in Dart covariant. This was also the case in Beta, whose authors denounced invariance and contravariance, as coming from people "with type-checking background" :-)

[...] limited -- there are only as many literals in Python.
Eugene
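A restriction such as "arguments must be Ordered" can be sketched with features that were added to typing well after this thread: Protocol (Python 3.8) together with a bounded TypeVar. The names Ordered and smallest are hypothetical and this is not the syntax proposed in the message above.

```python
from typing import Any, Iterable, Protocol, TypeVar


class Ordered(Protocol):
    # Structural requirement: the type must support "<".
    def __lt__(self, other: Any) -> bool: ...


T = TypeVar("T", bound=Ordered)


def smallest(items: Iterable[T]) -> T:
    iterator = iter(items)
    best = next(iterator)  # raises StopIteration on empty input
    for item in iterator:
        if item < best:
            best = item
    return best
```

Whether constraints like this catch enough real bugs to justify the extra machinery is exactly what is disputed in the replies.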

On Fri, Dec 26, 2014 at 12:00 PM, Eugene Toder <eltoder@gmail.com> wrote:
Well, it copies most class instances by just copying the __dict__. And it recognizes a bunch of other protocols (__copy__ and most pickling interfaces).
I think you are still trying to design a type system that can express all constraints exactly. In practice I doubt if any of the examples you mention here will help catch many bugs in actual Python code; a type checker that is blissfully unaware of these requirements will still be tremendously useful. (I guess this is the point of gradual typing.)
Yeah, but my counter is that Python users today don't write classes like that, and I don't want them to have to change their habits.
I'm not sure I understand why you think that is funny. I think they all have a point.
I think there are at least three separate use cases (note that none are part of the proposal -- the proposal just enables a single notation to be used for all three): (1) Full type checkers like mypy. These have to parse everything without ever running it, so they cannot use the typing module's primitives. They may also have to parse stuff in comments (there are several places where mypy needs a little help and the best place to put it is often in a #type: comment), which rules out Python's ast module. (2) Things that use runtime introspection, e.g. decorators that try to enforce run time correctness. These can use the typing module's primitives. I wish we could just always have (generic) type objects in the annotations, so they could just look up the annotation and then use isinstance(), but I fear that forward refs will spoil that simplicity anyway. (3) IDEs. These typically need to be able to parse code that contains errors. So they end up having their own, more forgiving parser. That's enough distinct cases to make me want to compromise towards a slightly more verbose syntax that requires less special handling. (I am also compromising because I don't want to change CPython's parser and I want to be able to backport typing.py to Python 3.4 and perhaps even 3.3.) -- --Guido van Rossum (python.org/~guido)
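Use case (2) can be sketched as a small decorator. This is a deliberately naive illustration, not part of the proposal: `enforce` is a made-up name, only annotations that are plain classes are checked, and anything else (string forward references, generic types) is silently skipped, which is the loss of simplicity worried about above.

```python
import functools
import inspect


def enforce(func):
    """Check arguments against annotations that are plain classes."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            # *args / **kwargs hold collections, so skip them in this sketch.
            if param.kind in (
                inspect.Parameter.VAR_POSITIONAL,
                inspect.Parameter.VAR_KEYWORD,
            ):
                continue
            ann = param.annotation
            # Skip missing annotations and anything isinstance() cannot use.
            if ann is inspect.Parameter.empty or not isinstance(ann, type):
                continue
            if not isinstance(value, ann):
                raise TypeError(
                    f"{name} must be {ann.__name__}, got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@enforce
def repeat(text: str, times: int) -> str:
    return text * times
```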

On Dec 25, 2014, at 22:49, Eugene Toder <eltoder@gmail.com> wrote:
I'm not sure this problem exists. The builtin set (and therefore the undocumented MyPy/typing TypeAlias Set) has a union method, but its signature is not that restrictive. It takes 1 or more arbitrary iterables of any element type, and of course it returns a set whose element type is the union of the element types of self and those. And the same is true in general for all of the builtin abstract and concrete types.

So, the fact that Guido's/Jukka's proposal doesn't make it easy to define types that are more restrictive than you'd normally want to use in Python doesn't seem to be a problem. Sure, if you wanted to define more restricted C++/Swift/Haskell style collections for Python you'd want to be able to type them as easily as in those languages... But why do you want to define those collections?

The _opposite_ problem--that it's hard to define the _actual_ type of set.union or similarly highly parameterized types--may be more serious, but Guido acknowledged that one long ago, and I think he's right that it seems like the kind of thing that could be added later. (In the initial MyPy for 3.5 you can always define a one-argument version set[X].union(Iterable[Y])->set[Union[X, Y]] and a generic multi-argument overload that, say, treats all the arguments as Iterable[Any] and returns Set[Any]. If we turn out to need parameter schemas for varargs in real programs, that can surely be added in 3.6 as easily as it could now. And hopefully it won't be needed. You need some kind of complete language to write such schemas in, and C++11 is a nice warning of what it looks like to try to do that declaratively in a non-declarative language.)

On Fri, Dec 26, 2014 at 10:26 AM, Andrew Barnert <abarnert@yahoo.com> wrote:
[...] than with a lower bound. So we don't need lower bounds on type variables for collection methods, and maybe at all.
The _opposite_ problem--that it's hard to define the _actual_ type of set.union or similarly highly parameterized types--may be more serious,
Why is it hard? Isn't the actual type just:

    def union(self, *others: Iterable[Y]) -> Set[Union[X, Y]]

where typing of vararg is similar to Java -- all elements must conform to the single type annotation.

Also note that I posted set.union method as an example that needs a forward reference to a generic class. I was arguing that if we use strings for forward references, we'll eventually have complicated expressions in those strings, not just class names:

    class Set(Generic[X]):
        # Note that the name "Set" is not yet available, so we have to use
        # a forward reference. This puts the whole return type inside a string.
        def union(self, *others: Iterable[Y]) -> "Set[Union[X, Y]]": ...

Your type for set.union seems to prove the point even better than what I used.

Eugene
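The forward-reference example can be made runnable with the typing module as later released. MySet is a hypothetical class standing in for the Set in the message above (renamed to avoid clashing with typing.Set), and the sketch makes no claim about how any particular checker infers Y across several arguments.

```python
from typing import Generic, Iterable, TypeVar, Union

X = TypeVar("X")
Y = TypeVar("Y")


class MySet(Generic[X]):
    def __init__(self, items: Iterable[X] = ()) -> None:
        self.items = set(items)

    # "MySet" is not bound yet inside the class body, so the whole
    # return type, brackets included, has to go inside the string.
    def union(self, *others: Iterable[Y]) -> "MySet[Union[X, Y]]":
        merged = set(self.items)
        for other in others:
            merged.update(other)
        return MySet(merged)


combined = MySet([1]).union(["a"], [2.5])
```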

On Dec 26, 2014, at 18:59, Eugene Toder <eltoder@gmail.com> wrote:
I'm not sure you can just skip over that last point without addressing it. In this case, given iterables of element types Y1, Y2, ..., Yn, you can say that they're all type Iterable[Union[Y1, Y2, ..., Yn]]. I _think_ an inference engine can find that Union type pretty easily, and I _think_ that at least for collection methods there won't be any harder problems--but I wouldn't just assume either of those without looking carefully. And it certainly isn't true when we go past collection methods--clearly map and zip can't be handled this way.
Another way to write this, assuming that Set[X][Y] means Set[Y] or that there's some syntax to get from Set[X] to Set, would be to use a typeof(self) operator. Or a special magic __class_being_defined__ constant instead of an operator, or the normal type function with a slightly different meaning at compile time than runtime, or probably other ways to bikeshed it. The point is, at least this example only really needs the type of self, not an arbitrary forward declaration or an expression that has to be crammed into a string. Are there any good examples where that isn't true? Also, should Set.union be contravariant in the generic type Set, or is it always going to return a Set[Something]? The two options there could both easily be handled by type expressions, or maybe with explicit forward declarations, but with implicit forward declaration via string? I know Guido doesn't want to start allowing arbitrary expressions, but a compile-time typeof operator is a pretty simple special case; even pre-ISO C++ had that.

On Fri, Dec 26, 2014 at 2:15 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
[...] the type of the expression [Y1, Y2, ...] is List[Union[Y1, Y2, ...]]. Alternatively, the function call is typed as if the argument was replicated the number of times equal to the number of actual arguments. Both ways should give the same result, and are already supported in the type checker. This seems intuitive, matches your analysis above, and is implemented in at least C#, Java and Scala, so there's good evidence that this is quite usable.

[...] the arguments types. You can go further, and say that str.format() type needs to parse the format string to determine the number and the types of what goes into varargs. I think the simple "all of the same type" rule is good enough to type the majority of uses of varargs, except for argument forwarding into a call. At least it's better than nothing.
Eugene
participants (11)
- Andrew Barnert
- Andrew Svetlov
- Brett Cannon
- Dennis Brakhane
- Ethan Furman
- Eugene Toder
- Guido van Rossum
- Jim Baker
- Nick Coghlan
- Petr Viktorin
- Steven D'Aprano