Proposal to extend PEP 484 (gradual typing) to support Python 2.7

At Dropbox we're trying to be good citizens and we're working towards introducing gradual typing (PEP 484) into our Python code bases (several million lines of code). However, that code base is mostly still Python 2.7 and we believe that we should introduce gradual typing first and start working on conversion to Python 3 second (since having static types in the code can help a big refactoring like that).

Since Python 2 doesn't support function annotations we've had to look for alternatives. We considered stub files, a magic codec, docstrings, and additional `# type:` comments. In the end we decided that `# type:` comments are the most robust approach. We've experimented a fair amount with this and we have a proposal for a standard.

The proposal is very simple. Consider the following function with Python 3 annotations:

    def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
        """Embezzle funds from account using fake receipts."""
        <code goes here>

An equivalent way to write this in Python 2 is the following:

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        """Embezzle funds from account using fake receipts."""
        <code goes here>

There are a few details to discuss:

- Every argument must be accounted for, except 'self' (for instance methods) or 'cls' (for class methods). Also the return type is mandatory. If in Python 3 you would omit some argument or the return type, the Python 2 notation should use 'Any'.

- If you're using names defined in the typing module, you must still import them! (There's a backport on PyPI.)

- For `*args` and `**kwds`, put 1 or 2 stars in front of the corresponding type annotation. As with Python 3 annotations, the annotation here denotes the type of the individual argument values, not of the tuple/dict that you receive as the special argument value 'args' or 'kwds'.

- The entire annotation must be one line. (However, see https://github.com/JukkaL/mypy/issues/1102.)
We would like to propose this as a standard (either to be added to PEP 484 or as a new PEP) rather than making it a "proprietary" extension to mypy only, so that others in a similar situation can also benefit.

A brief discussion of the considered alternatives:

- Stub files: this would complicate the analysis in mypy quite a bit, because it would have to parse both the .py file and the .pyi file and somehow combine the information gathered from both, and for each function it would have to use the types from the stub file to type-check the body of the function in the .py file. This would require a lot of additional plumbing. And if we were using Python 3 we would want to use in-line annotations anyway.

- A magic codec was implemented over a year ago (https://github.com/JukkaL/mypy/tree/master/mypy/codec) but after using it for a bit we didn't like it much. It slows down imports, it requires a `# coding: mypy` declaration, it would conflict with pyxl (https://github.com/dropbox/pyxl), things go horribly wrong when the codec isn't installed and registered, other tools would be confused by the Python 3 syntax in Python 2 source code, and because of the way the codec was implemented the Python interpreter would occasionally spit out confusing error messages showing the codec's output (which is pretty bare-bones).

- While there are existing conventions for specifying types in docstrings, we haven't been using any of these conventions (at least not consistently, nor at an appreciable scale), and they are much more verbose if all you want is adding argument annotations. We're working on a tool that automatically adds type annotations[1], and such a tool would be complicated by the need to integrate the generated annotations into existing docstrings (which, in our code base, unfortunately are wildly incongruous in their conventions).
- Finally, the proposed comment syntax is easy to mechanically translate into standard Python 3 function annotations once we're ready to let go of Python 2.7.

__________
[1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's a bit over 200 lines. It's not very interesting yet, since it sets the types of nearly all arguments to 'Any'. We're considering building a much more advanced version that tries to guess much better argument types using some form of whole-program analysis. I've heard that Facebook's Hack project got a lot of mileage out of such a tool. I don't yet know how to write it -- possibly we could use a variant of mypy's type inference engine, or alternatively we might be able to use something like Jedi (https://github.com/davidhalter/jedi).

-- 
--Guido van Rossum (python.org/~guido)
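The claim that the comment syntax is easy to translate mechanically into Python 3 annotations can be illustrated with a rough sketch. Everything here is an assumption made for illustration: the helper names `split_top_level`, `parse_type_comment` and `annotate` are hypothetical (they are not part of mypy or of the prototype fixer mentioned in the footnote), and the parameter list is assumed to have been split into strings already.

```python
import re


def split_top_level(text):
    """Split on commas that are not nested inside brackets."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "([{":
            depth += 1
        elif ch in ")]}":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_type_comment(comment):
    """Return (argument_types, return_type) for '# type: (...) -> T'."""
    match = re.match(r"\s*#\s*type:\s*\((.*)\)\s*->\s*(.+?)\s*$", comment)
    if match is None:
        raise ValueError("not a function type comment: %r" % comment)
    return split_top_level(match.group(1)), match.group(2)


def annotate(def_params, comment):
    """Merge Python 2 parameters with a type comment into a Python 3 signature."""
    arg_types, return_type = parse_type_comment(comment)
    types = iter(arg_types)
    out = []
    for index, param in enumerate(def_params):
        name, eq, default = param.partition("=")
        name = name.strip()
        # 'self' and 'cls' are not listed in the type comment.
        if index == 0 and name in ("self", "cls"):
            out.append(name)
            continue
        # The stars already appear on the parameter name, so drop them here.
        hint = next(types).lstrip("*")
        if eq:
            out.append("%s: %s = %s" % (name, hint, default.strip()))
        else:
            out.append("%s: %s" % (name, hint))
    return "(%s) -> %s" % (", ".join(out), return_type)


signature = annotate(
    ["self", "account", "funds=1000000", "*fake_receipts"],
    "# type: (str, int, *str) -> None",
)
```

For the `embezzle` example, `signature` comes out as the Python 3 parameter list and return annotation shown in the proposal. A real tool would work on the parsed source (for instance as a 2to3 fixer) rather than on pre-split strings.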

On Fri, Jan 08, 2016 at 03:04:58PM -0800, Guido van Rossum wrote: [...]
[...]
I don't understand this paragraph. Doesn't mypy (and any other type checker) have to support stub files? I thought that stub files are needed for extension files, among other things. So I would have expected that any Python 2 type checker would have to support stub files as well, regardless of whether inline #type comments are introduced or not. Will Python 3 type checkers be expected to support #type comments as well as annotations and stub files? -- Steve

On 9 January 2016 at 12:08, Steven D'Aprano <steve@pearwood.info> wrote:
Stub files are easy to use if you're using them *instead of* the original source file (e.g. annotating extension modules, or typeshed annotations for the standard library). Checking a stub file for consistency against the published API of the corresponding module also seems like it would be straightforward (while using both a stub file *and* inline annotations for the same API seems like it would be a bad idea, it's at least necessary to check that the *shape* of the API matches, even if there's no type information). However, if I'm understanding correctly, the problem Guido is talking about here is a different one: analysing a function *implementation* to ensure it is consistent with its own published API. That's relatively straightforward with inline annotations (whether function annotation based, comment based, or docstring based), but trickier if you have to pause the analysis, go look for the right stub file, load it, determine the expected public API, and then resume the analysis of the original function. The other downside of the stub file approach is the same reason it's not the preferred approach in Python 3: you can't see the annotations yourself when you're working on the function. Folks working mostly on solo and small team projects may not see the appeal of that, but when doing maintenance on large unfamiliar code bases, the improved local reasoning those kinds of inline notes help support can be very helpful.
Will Python 3 type checkers be expected to support #type comments as well as annotations and stub files?
#type comment support is already required for variables and attributes: https://www.python.org/dev/peps/pep-0484/#type-comments

That requirement for type checkers to support comment based type hints would remain, even if we were to later add native syntactic support for variable and attribute typing.

I read Guido's proposal here as offering something similar for function annotations, only going in the other direction: providing a variant spelling for function type hinting that can be used in single source Python 2/3 code bases that can't use function annotations. I don't have a strong opinion on the specifics, but am +1 on the general idea - I think the approach Dropbox are pursuing of adopting static type analysis first, and then migrating to Python 3 (or at least single source Python 2/3 support) second is going to prove to be a popular one, as it allows you to detect a lot of potential migration issues without necessarily having to be able to exercise those code paths in a test running under Python 3.

The 3 kinds of annotation would then have 3 clear function level use cases:

    stub files: annotating third party libraries (e.g. for typeshed)
    #type comments: annotating single source Python 2/3 code
    function annotations: annotating Python 3 code

Cheers,
Nick.

-- 
Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
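For readers who have not seen the existing variable form, a minimal example of the PEP 484 type comments referred to above might look like this. The variable names are made up for illustration; the comments are ordinary comments to the interpreter, so the code behaves identically with or without them.

```python
from typing import Dict, List  # names used in type comments must still be imported

names = []  # type: List[str]
counts = {}  # type: Dict[str, int]

names.append("alice")
counts["alice"] = 1
```

Only a type checker reads the `# type:` comments; at run time `names` is a plain list and `counts` a plain dict.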

2016-01-09 0:04 GMT+01:00, Guido van Rossum <guido@python.org>:
Could not something like this ->

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
        """Embezzle funds from account using fake receipts."""
        <code goes here>

make

1. transition from python2 to python3 more simple?
2. python3 checkers more easily changeable to understand new python2 standard?
3. simpler impact to documentation (means also simpler knowledge base to learn) about annotations?

+1, I would really like to try out type annotation support in Jython, given the potential for tying in with Java as a source of type annotations (basically the equivalent of stubs for free). I'm planning on sprinting on Jython 3 at PyCon, but let's face it, that's going to take a while to really finish.

re the two approaches, both are workable with Jython:

* lib2to3 is something we should support in Jython 2.7. There are a couple of data files that we don't support in the tests (too large of a method for Java bytecode in infinite_recursion.py, not terribly interesting), plus a few other tests that should work. Therefore lib2to3 should be in the next release (2.7.1).

* Jedi now works with the last commit to Jython 2.7 trunk, passing whatever it means to run random tests using its sith script against its source. (The sith test does not pass with either CPython or Jython's stdlib, starting with bad_coding.py.)

- Jim

On Sat, Jan 9, 2016 at 9:09 AM, Eric Fahlgren <ericfahlgren@gmail.com> wrote:

Unless there's a huge outcry I'm going to add this as an informational section to PEP 484. -- --Guido van Rossum (python.org/~guido)

Done: https://hg.python.org/peps/rev/06f8470390c2

(I'm happy to change or move this if there *is* a serious concern -- but I figured if there isn't I might as well get it over with.)

On Mon, Jan 11, 2016 at 9:27 AM, Guido van Rossum <guido@python.org> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sat, Jan 9, 2016 at 1:54 AM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
There would still have to be some marker like "# type:" for the type checker to recognize -- I'm sure that plain comments with alternate 'def' statements are pretty common and we really don't want the type checker to be confused by those.

I don't like that the form you propose has so much repetition -- the design of Python 3 annotations intentionally is the least redundant possible, and my (really Jukka's) proposal tries to keep that property.

Modifying type checkers to support this syntax is easy (Jukka already did it for mypy).

Note that type checkers already have to parse the source code without the help of Python's ast module, because there are other things in comments: PEP 484 specifies variable annotations and a few forms of `# type: ignore` comments.

Regarding the idea of a decorator, this was discussed and rejected for the original PEP 484 proposal as well. The problem is similar to that with your 'def' proposal: too verbose. Also a decorator is more expensive (we're envisioning adding many thousands of decorators, and it would weigh down program startup). We don't envision needing to introspect __annotations__ at run time. (Also, we already use decorators quite heavily -- introducing a @typehint decorator would make the code less readable due to excessive stacking of decorators.)

Our needs at Dropbox are several: first, we want to add annotations to the code so that new engineers can learn their way around the code quicker and refactoring will be easier; second, we want to automatically check conformance to the annotations as part of our code review and continuous integration processes (this is where mypy comes in); third, once we have annotated enough of the code we want to start converting it to Python 3 with as much automation as is feasible. The latter part is as yet unproven, but there's got to be a better way than manually checking the output of 2to3 (whose main weakness is that it does not know the types of variables).
We see many benefits of annotations and automatically checking them using mypy -- but we don't want them to affect the runtime at all.

-- 
--Guido van Rossum (python.org/~guido)
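The point about needing an explicit marker can be sketched with the standard `tokenize` module, which exposes comments that the `ast` module discards. This is only an illustration under assumptions: the function name `find_type_comments` and the sample source are made up here, and this is not how mypy itself is implemented.

```python
import io
import tokenize

SOURCE = '''\
def embezzle(self, account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    # def embezzle(old, signature): kept around for reference
    """Embezzle funds from account using fake receipts."""
    return None
'''


def find_type_comments(source):
    """Return (line_number, comment) pairs for comments carrying the marker."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only comments that start with the marker count; a commented-out
        # 'def' line is an ordinary comment and is skipped.
        if tok.type == tokenize.COMMENT and tok.string.startswith("# type:"):
            found.append((tok.start[0], tok.string))
    return found


type_comments = find_type_comments(SOURCE)
```

Here the commented-out `def` on line 3 is ignored and only the marked comment on line 2 is reported, which is the behaviour the marker is meant to guarantee.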

On Sat, 9 Jan 2016 at 11:31 Guido van Rossum <guido@python.org> wrote:
To help answer the question about whether this could help with porting code to Python 3, the answer is "yes"; it's not essential but definitely would be helpful.

Between Modernize, pylint, `python2.7 -3`, and `python3 -bb` you cover almost all of the issues that can arise in moving to Python 3. But notice that half of those tools are running your code under an interpreter with a certain flag flipped, which means run-time checks that require excellent test coverage. With type annotations you can do offline, static checking which is less reliant on your tests covering all corner cases.

Depending on how the tools choose to handle representing str/unicode in Python 2/3 code (i.e., say that if you specify the type as 'str' it's an error and anything that is 'unicode' is considered the 'str' type in Python 3?), I don't see why mypy can't have a 2/3 compatibility mode that warns against uses of, e.g. the bytes type that don't directly translate between Python 2 and 3 like indexing. That kind of static warning would definitely be beneficial to anyone moving their code over as they wouldn't need to rely on e.g., `python3 -bb` and their tests to catch that common issue with bytes and indexing.

There is also the benefit of gradual porting with this kind of offline checking. Since you can slowly add more type information, you can slowly catch more issues in your code. Relying on `python3 -bb`, though, requires you have ported all of your code over first before running it under Python 3 to catch some issues.
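The bytes indexing issue mentioned above is easy to reproduce. The snippet below is a small illustration (the variable names are arbitrary): the same expression yields a one-character string under Python 2 and an integer under Python 3, while slicing behaves the same on both.

```python
import sys

data = b"abc"

# Indexing differs: 'a' on Python 2, the integer 97 on Python 3.
first = data[0]

# Slicing is portable: a length-1 bytes object on both versions.
portable_first = data[0:1]

running_python3 = sys.version_info[0] >= 3
```

Without type information a tool cannot tell whether `data` is bytes or text, which is why this kind of mistake usually only shows up at run time.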

On 09.01.2016 00:04, Guido van Rossum wrote:
By using comments, the annotations would not be available at runtime via an .__annotations__ attribute and every tool would have to implement a parser for extracting them.

Wouldn't it be better and more in line with standard Python syntax to use decorators to define them?

    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

This would work in Python 2 as well and could (optionally) add an .__annotations__ attribute to the function/method, automatically create a type annotations file upon import, etc.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Jan 09 2016)
::: We implement business ideas - efficiently in both time and costs ::: eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48 D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg Registered at Amtsgericht Duesseldorf: HRB 46611 http://www.egenix.com/company/contact/ http://www.malemburg.com/

On Sat, Jan 9, 2016 at 4:48 AM M.-A. Lemburg <mal@egenix.com> wrote:
The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to *deploy* code that looks at __annotations__). -gps
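To make the overhead argument concrete, here is a minimal sketch of what such a decorator could look like. It is purely hypothetical: no `typehint` decorator exists in the standard library, and the names `typehint`, `__typehint__` and `Bank` are invented for this illustration.

```python
def typehint(signature):
    """Hypothetical decorator factory: remember the type string on the function."""
    def decorate(func):
        # One extra attribute per decorated function (memory cost).
        func.__typehint__ = signature
        return func
    return decorate


class Bank(object):
    # Two extra calls run when this 'def' executes at import time:
    # typehint(...) and then decorate(embezzle).
    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        return None


stored = Bank.embezzle.__typehint__
```

The function itself is returned unwrapped, so calls to `embezzle` are not slowed down; the cost in this sketch is paid once per decorated definition, at import time, plus the stored string.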

On Jan 11, 2016, at 10:42, Gregory P. Smith <greg@krypto.org> wrote:
The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to deploy code that looks at __annotations__).
These same arguments were made against PEP 484 in the first place, and (I think rightly) dismissed.

3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations?

Meanwhile, when _are_ annotations useful at runtime? Mostly during the kind of debugging that you'll be doing during something like a port from 2.x to 3.x. While you're still, by necessity, running under 2.x. If they're not useful there, it's hard to imagine why they'd be useful after the port is done, when you're deploying your 3.x code.

So it seems like using decorators (or backporting the syntax, as Google has done) had better be acceptable for 2.7, or the PEP 484 design has a serious problem, and in a few months we're going to see Dropbox and Google and everyone else demanding a way to use type hinting without wasting memory on annotations at runtime in 3.x.

On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
The way I recall it the argument was made against using decorators for PEP 484 and we rightly decided not to use decorators.
I'm not objecting to the memory overhead of using decorators, but to the execution time (the extra function call). And the scope for the proposal is much smaller -- while PEP 484 is the first step on a long road towards integrating gradual (i.e. OPTIONAL) typing into Python, the proposal on the table today is only meant for annotating Python 2.7 code so we can get rid of it more quickly.
I'm not sure how to respond to this -- I disagree with your prediction but I don't think either of us really has any hard data from experience yet. I am however going to be building the kind of experience that might eventually be used to decide this, over the next few years. The first step is going to introduce annotations into Python 2.7 code, and I know my internal customers well enough to know that convincing them that we should use decorators for annotations would be a much bigger battle than putting annotations in comments. Since I have many other battles to fight I would like this one to be as short as possible.

So it seems like using decorators (or backporting the syntax, as Google has
Again, I disagree with your assessment but it's difficult to prove anything without hard data. One possible argument may be that Python 3 offers a large package of combined run-time advantages, with some cost that's hard to separate. However, for Python 2.7 there's either a run-time cost or there's no run-time cost -- there's no run-time benefit. And I don't want to have to calculate how many extra machines we'll need to provision in order to make up for the run-time cost. -- --Guido van Rossum (python.org/~guido)

On 11.01.2016 22:38, Guido van Rossum wrote:
To clarify: My suggestion to use a simple decorator with essentially the same syntax as proposed for the "# type: comments" was meant as *additional* allowed syntax, not necessarily as the only one to standardize.

I'm a bit worried that by standardizing on using comments for these annotations only, we'll end up having people not use the type annotations because they simply don't like the style of having function bodies begin with comments instead of doc-strings. I certainly wouldn't want to clutter up my code like that. Tools parsing Python 2 source code may also have a problem with this (e.g. not recognize the doc-string anymore).

This simply reads better, IMO:

    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

and it has the advantage of allowing to have the decorator do additional things such as taking the annotations and writing out a type annotations file for Python 3 and other tools to use.

We could also use a variant of the two proposals and additionally allow this syntax:

    #@typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

to avoid memory and runtime overhead, if that's a problem. Moving from one to the other would then be a simple search&replace over the source code.

Or we could have -O remove all those typehint decorator calls from the byte code to a similar effect.
Code written for Python 2 & 3 will have to stick to the proposed syntax for quite a while, so we should try to find something that doesn't introduce a new syntax variant of how to specify additional function/method properties, because people are inevitably going to start using the same scheme for all sorts of other crazy stuff and this would make Python code look closer to Java than necessary, IMO:

    @public_interface
    @rest_accessible
    @map_exceptions_to_http_codes
    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        # raises: ValueError, TypeError
        # optimize: jit, inline_globals
        # tracebacks: hide_locals
        # reviewed_by: xyz, abc
        """Embezzle funds from account using fake receipts."""
        <code goes here>

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Jan 11 2016)

On 1/11/2016 5:38 PM, M.-A. Lemburg wrote:
Code with type comments will run on any standard 2.7 interpreter. Code with an @typehint decorator will have to either run on a nonstandard interpreter or import 'typehint' from somewhere other than the stdlib or define 'typehint' at the top of the file or have the decorators stripped out before public distribution. To me, these options come close to making the decorator inappropriate as a core dev recommendation. However, the new section of the PEP could have a short paragraph that mentions @typehint(typestring) as a possible alternative (with the options given above) and recommend that if a decorator is used, then the name should be 'typehint' (or something else agreed on) and that the typestring should be a quoted version of what would follow '# type: ' in a comment, 'as already defined above' (in the previous recommendation). In other words, Guido's current addition has two recommendations: 1. the syntax for a typestring 2. the use of a typestring (append it to a '# type: ' comment) If a decorator alternative uses the same syntax, a checker would need just one typestring parser. I think the conditional recommendation would be within the scope of what is appropriate for us to do.
I have to admit that I was not fully cognizant before that a comment could precede a docstring.

-- 
Terry Jan Reedy
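That a comment may precede a docstring is easy to check on any standard interpreter. The snippet below is a small illustration using the function from the proposal (written as a plain function here to keep it self-contained): comments are not statements, so the string literal is still the first statement in the body and is still picked up as the docstring.

```python
def embezzle(account, funds=1000000, *fake_receipts):
    # type: (str, int, *str) -> None
    """Embezzle funds from account using fake receipts."""
    return None


docstring = embezzle.__doc__
```

This covers the interpreter itself; whether third-party tools that parse source text handle the layout correctly is a separate question that this sketch does not address.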

On Jan 11, 2016, at 13:38, Guido van Rossum <guido@python.org> wrote:
Sure. But you also decided that the type information has to be there at runtime.

Anyway, I don't buy GPS's argument, but I think I buy yours. Even if there are good reasons to have annotations at runtime, and they'd apply to debugging/introspecting/etc. code during a 2.7->3.6 port just as much as in new 3.6 work, I can see that they may not be worth _enough_ to justify the cost of extra runtime CPU (which can't be avoided in 2.7 the way it is in 3.6). And that, even if they were worth the cost, it may still not be worth trying to convince a team of that fact, especially without any hard information.
3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations?
I'm not objecting to the memory overhead of using decorators,
OK, but GPS was. And he was also arguing that having annotations at runtime is useless. Which is an argument that was made against PEP 484, and considered and rejected at the time. Your argument is different, and seems convincing to me, but I can't retroactively change my reply to his email.

What about this?

    def embezzle(self, account: "PEP3107 annotation"):
        # type: (str) -> Any
        """Embezzle funds from account using fake receipts."""
        <code goes here>

---

And BTW in the PEP484 text ->

    Functions with the @no_type_check decorator or with a # type: ignore comment should be treated as having no annotations.

could probably be? ->

    Functions with the @no_type_check decorator or with a # type: ignore comment should be treated as having no type hints.

On Mon, Jan 11, 2016 at 1:48 PM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
I don't understand your proposal -- this is not valid Python 2.7 syntax so we cannot use it.
In the context of the PEP the latter interpretation is already implied, so I don't think I need to update the text. -- --Guido van Rossum (python.org/~guido)

2016-01-11 22:52 GMT+01:00, Guido van Rossum <guido@python.org>:
I had two things in mind:

1. Suggest some possible impact in the future. While we are writing code compatible with python2 and python3, we will have type hint comments under python3 too. And because they are more compatible, there is a risk(?) that they could become more popular than the original PEP484 (python3) proposal!

2. PEP484 describes how to support other uses of annotations and proposes using # type: ignore for that, but a similar method to preserve other uses of annotations could be (for example) # type: (str) -> Any, and this could combine the goodness of type hint tools and other kinds of annotations. At least in a deprecation period (if there will be any) for other annotation types.

On Mon, Jan 11, 2016 at 5:39 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Actually my experience with -OO (and even -O) suggest that that's not a great model (e.g. it can't work with libraries like PLY that inspect docstrings). A better model might be to let people select this on a per module basis. Though I could also see a future where __annotations__ is a more space-efficient data structure than dict. Have you already run into a situation where __annotations__ takes up too much space? -- --Guido van Rossum (python.org/~guido)
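For reference, the run-time difference being discussed can be seen directly under Python 3. The two function names below are made up for illustration: function annotations are stored in a dict on the function object, while a type comment leaves nothing behind at run time.

```python
def annotated(account: str, funds: int = 1000000) -> None:
    return None


def commented(account, funds=1000000):
    # type: (str, int) -> None
    return None


annotated_hints = annotated.__annotations__
commented_hints = commented.__annotations__
```

`annotated_hints` maps each parameter name (plus `'return'`) to the annotation object, whereas `commented_hints` is an empty dict. This snippet needs Python 3, since the first definition is a syntax error under 2.7.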

On Mon, Jan 11, 2016 at 08:38:59PM -0800, Guido van Rossum wrote:
Not as such, but it does seem an obvious and low-impact place to save some memory. Like doc strings, they're rarely used at runtime outside of the interactive interpreter. But your suggestion sounds more useful.

-- 
Steve

I really like MAL's variation much better. Being able to see .__annotations__ at runtime feels like an important feature that we'd give up with the purely comment style. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Mon, Jan 11, 2016 at 1:50 PM David Mertz <mertz@gnosis.cx> wrote:
I'd like people who demonstrate practical important production uses for having .__annotations__ information available at runtime to champion that. Both Google and Dropbox are looking at it as only being meaningful in the offline code analysis context. Even our (Google's) modified 2.7 with annotation grammar backported is just that, grammar only, no .__annotations__ or even validation of names while parsing. It may as well be a # type: comment. We explicitly chose not to use decorators due to their resource usage side effects.

2.7.x itself understandably is... highly unlikely to be modified... to put it lightly. So a backport of ignored annotation syntax is a non-starter there. In that sense I think the # type: comments are fine and are pretty much what I've been expecting to see. The only other alternative not yet mentioned would be to put the information in the docstring. But that has yet other side effects and challenges. So the comments make a lot of sense to recommend for Python 2 within the PEP.

.__annotations__ isn't something any Python 2 code has ever had in the past. It can continue to live without it. I do not believe we need to formally recommend a decorator and its implementation in the PEP. (read another way: I do not expect Guido to do that work... but anyone is free to propose it and see if anyone else wants to adopt it)

-gps

On 1/8/2016 6:04 PM, Guido van Rossum wrote:
I find this separate signature line to be at least as readable as the intermixed 3.x version. I noticed the same thing as Lemburg (no runtime .__annotations__ attributes), but am not sure whether adding them in 2.x code is a good or bad thing.
To me, really needed.
Since I am personally pretty much done with 2.x, the details do not matter to me, but I think a suggested standard approach is a good idea. I also think a new informational PEP, with a reference added to 484, would be better. 'Type hints for 2.x and 2&3 code'

For a helpful tool, I would at least want something that added a template comment, without dummy 'Any's to be erased, to each function.

    # type: (, , *) ->

A GUI with suggestions from both type-inferencing and from a name -> type dictionary would be even nicer. Name to type would work really well for a project with consistent use of parameter names.

-- 
Terry Jan Reedy
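A rough idea of how such a template comment could be generated is sketched below. The assumptions are significant: the function name `template_comment` is hypothetical, the sketch inspects live function objects with `inspect.signature` (which requires Python 3), and a real tool for 2.7 code would more likely work on the source text, for example as a 2to3 fixer.

```python
import inspect


def template_comment(func):
    """Build an empty '# type:' template with one slot per argument."""
    params = list(inspect.signature(func).parameters.values())
    # 'self' and 'cls' are not listed in the type comment.
    if params and params[0].name in ("self", "cls"):
        params = params[1:]
    slots = []
    for param in params:
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            slots.append("*")
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            slots.append("**")
        else:
            slots.append("")
    return "# type: (%s) -> " % ", ".join(slots)


def embezzle(self, account, funds=1000000, *fake_receipts):
    """Embezzle funds from account using fake receipts."""
    return None


template = template_comment(embezzle)
```

For `embezzle` this produces a template with two empty slots and one starred slot, ready for the types to be filled in by hand or from a name-to-type dictionary.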

On 1/8/2016 6:04 PM, Guido van Rossum wrote:
Big +1 I maintain some packages that are single-source 2/3 compatible packages, thus we haven't been able to add type annotations yet (which I was initially skeptical about, but now love) without dropping py2 support. So even for packages that have already been ported to py3, this proposal would be great. -Robert

On Friday, January 8, 2016 at 3:06:05 PM UTC-8, Guido van Rossum wrote:
FWIW, we had the same problem at Google. (Almost) all our code is Python 2. However, we went the route of backporting the type annotations grammar from Python 3. We now run a custom Python 2 that knows about PEP 3107.

The primary reasons are aesthetic - PEP 484 syntax is already a bit hard on the eyes (capitalized container names, square brackets, quoting, ...), and squeezing it all into comments wouldn't have helped matters, and would have hindered adoption.

We're still happy with our decision of running a custom Python 2, but your mileage might vary. It's certainly true that other tools (pylint etc.) need to learn to not be confused by the "odd" Python 2 syntax.

[1] I have a prototype of such a tool, implemented as a 2to3 fixer. It's a
pytype (http://github.com/google/pytype) already does (context sensitive, path-sensitive) whole-program analysis, and we're working on making it (more) PEP 484 compatible. We're also writing a (2to3 based) tool for inserting the derived types back into the source code. Should we join forces?

Matthias

On Mon, Jan 11, 2016 at 10:10 AM, Matthias Kramm <kramm@google.com> wrote:
Yeah, we looked into this but we use many 3rd party tools that would not know what to do with the new syntax, so that's why we went the route of adding support for these comments to mypy.
Possibly. I haven't had any pushback about this from the Dropbox engineers who have seen this so far.
We had some relevant experience with pyxl, and basically it wasn't good -- too many tools had to have custom support added or simply can't be used on files containing pyxl syntax. (https://github.com/dropbox/pyxl)
I would love to! Perhaps we can take this discussion off line? -- --Guido van Rossum (python.org/~guido)

On Fri, Jan 08, 2016 at 03:04:58PM -0800, Guido van Rossum wrote: [...]
[...]
I don't understand this paragraph. Doesn't mypy (and any other type checker) have to support stub files? I thought that stub files are needed for extension files, among other things. So I would have expected that any Python 2 type checker would have to support stub files as well, regardless of whether inline #type comments are introduced or not. Will Python 3 type checkers be expected to support #type comments as well as annotations and stub files? -- Steve

On 9 January 2016 at 12:08, Steven D'Aprano <steve@pearwood.info> wrote:
Stub files are easy to use if you're using them *instead of* the original source file (e.g. annotating extension modules, or typeshed annotations for the standard library). Checking a stub file for consistency against the published API of the corresponding module also seems like it would be straightforward (while using both a stub file *and* inline annotations for the same API seems like it would be a bad idea, it's at least necessary to check that the *shape* of the API matches, even if there's no type information). However, if I'm understanding correctly, the problem Guido is talking about here is a different one: analysing a function *implementation* to ensure it is consistent with its own published API. That's relatively straightforward with inline annotations (whether function annotation based, comment based, or docstring based), but trickier if you have to pause the analysis, go look for the right stub file, load it, determine the expected public API, and then resume the analysis of the original function. The other downside of the stub file approach is the same reason it's not the preferred approach in Python 3: you can't see the annotations yourself when you're working on the function. Folks working mostly on solo and small team projects may not see the appeal of that, but when doing maintenance on large unfamiliar code bases, the improved local reasoning those kinds of inline notes help support can be very helpful.
Will Python 3 type checkers be expected to support #type comments as well as annotations and stub files?
#type comment support is already required for variables and attributes: https://www.python.org/dev/peps/pep-0484/#type-comments

That requirement for type checkers to support comment based type hints would remain, even if we were to later add native syntactic support for variable and attribute typing. I read Guido's proposal here as offering something similar for function annotations, only going in the other direction: providing a variant spelling for function type hinting that can be used in single source Python 2/3 code bases that can't use function annotations.

I don't have a strong opinion on the specifics, but am +1 on the general idea - I think the approach Dropbox are pursuing of adopting static type analysis first, and then migrating to Python 3 (or at least single source Python 2/3 support) second is going to prove to be a popular one, as it allows you to detect a lot of potential migration issues without necessarily having to be able to exercise those code paths in a test running under Python 3.

The 3 kinds of annotation would then have 3 clear function level use cases:

- stub files: annotating third party libraries (e.g. for typeshed)
- #type comments: annotating single source Python 2/3 code
- function annotations: annotating Python 3 code

Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
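As a small illustration of the two comment forms mentioned above, here is a sketch that combines the variable type comment already in PEP 484 with the proposed function signature comment. The function `total` and its body are invented for this example; only the comment syntax comes from the thread.

```python
from typing import List  # names from typing must still be imported


def total(values):
    # type: (List[int]) -> int
    """Sum a list of integers (hypothetical example function)."""
    result = 0  # type: int
    for value in values:
        result += value
    return result
```

Because both hints live in comments, the same file should run unchanged under Python 2.7 (with the typing backport installed) and Python 3.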

2016-01-09 0:04 GMT+01:00, Guido van Rossum <guido@python.org>:
Could not something like this ->

    def embezzle(self, account, funds=1000000, *fake_receipts):
        # def embezzle(self, account: str, funds: int = 1000000, *fake_receipts: str) -> None:
        """Embezzle funds from account using fake receipts."""
        <code goes here>

make
1. transition from python2 to python3 more simple?
2. python3 checkers more easily changeable to understand new python2 standard?
3. simpler impact to documentation (means also a simpler knowledge base to be learned) about annotations?

+1, I would really like to try out type annotation support in Jython, given the potential for tying in with Java as a source of type annotations (basically the equivalent of stubs for free). I'm planning on sprinting on Jython 3 at PyCon, but let's face it, that's going to take a while to really finish. re the two approaches, both are workable with Jython: * lib2to3 is something we should support in Jython 2.7. There are a couple of data files that we don't support in the tests (too large of a method for Java bytecode in infinite_recursion.py, not terribly interesting), plus a few other tests that should work. Therefore lib2to3 should be in the next release (2.7.1). * Jedi now works with the last commit to Jython 2.7 trunk, passing whatever it means to run random tests using its sith script against its source. (The sith test does not pass with either CPython or Jython's stdlib, starting with bad_coding.py.) - Jim On Sat, Jan 9, 2016 at 9:09 AM, Eric Fahlgren <ericfahlgren@gmail.com> wrote:

Unless there's a huge outcry I'm going to add this as an informational section to PEP 484. -- --Guido van Rossum (python.org/~guido)

Done: https://hg.python.org/peps/rev/06f8470390c2 (I'm happy to change or move this if there *is* a serious concern -- but I figured if there isn't I might as well get it over with.) On Mon, Jan 11, 2016 at 9:27 AM, Guido van Rossum <guido@python.org> wrote:
-- --Guido van Rossum (python.org/~guido)

On Sat, Jan 9, 2016 at 1:54 AM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
There would still have to be some marker like "# type:" for the type checker to recognize -- I'm sure that plain comments with alternate 'def' statements are pretty common and we really don't want the type checker to be confused by those. I don't like that the form you propose has so much repetition -- the design of Python 3 annotations intentionally is the least redundant possible, and my (really Jukka's) proposal tries to keep that property.

Modifying type checkers to support this syntax is easy (Jukka already did it for mypy). Note that type checkers already have to parse the source code without the help of Python's ast module, because there are other things in comments: PEP 484 specifies variable annotations and a few forms of `# type: ignore` comments.

Regarding the idea of a decorator, this was discussed and rejected for the original PEP 484 proposal as well. The problem is similar to that with your 'def' proposal: too verbose. Also a decorator is more expensive (we're envisioning adding many thousands of decorators, and it would weigh down program startup). We don't envision needing to introspect __annotations__ at run time. (Also, we already use decorators quite heavily -- introducing a @typehint decorator would make the code less readable due to excessive stacking of decorators.)

Our needs at Dropbox are several: first, we want to add annotations to the code so that new engineers can learn their way around the code quicker and refactoring will be easier; second, we want to automatically check conformance to the annotations as part of our code review and continuous integration processes (this is where mypy comes in); third, once we have annotated enough of the code we want to start converting it to Python 3 with as much automation as is feasible. The latter part is as yet unproven, but there's got to be a better way than manually checking the output of 2to3 (whose main weakness is that it does not know the types of variables).

We see many benefits of annotations and automatically checking them using mypy -- but we don't want them to affect the runtime at all. -- --Guido van Rossum (python.org/~guido)
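To make the point about needing a marker concrete, here is a minimal sketch of how a tool might pick out `# type:` comments while ignoring ordinary comments that happen to contain a commented-out 'def'. It uses the standard `tokenize` module; the helper name `find_type_comments` and the sample source are assumptions made for illustration, and this is not how mypy itself is implemented.

```python
import io
import tokenize


def find_type_comments(source):
    """Return (line_number, comment_text) pairs for '# type:' comments."""
    found = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # Only comments carrying the explicit marker are treated as hints.
        if tok.type == tokenize.COMMENT and tok.string.startswith("# type:"):
            found.append((tok.start[0], tok.string))
    return found


SAMPLE = (
    "def embezzle(self, account, funds=1000000, *fake_receipts):\n"
    "    # type: (str, int, *str) -> None\n"
    "    # def embezzle(old, signature): a plain comment, not a hint\n"
    "    pass\n"
)
```

With this sample, only the comment on line 2 is reported; the commented-out 'def' on line 3 is skipped because it lacks the marker.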

On Sat, 9 Jan 2016 at 11:31 Guido van Rossum <guido@python.org> wrote:
To help answer the question about whether this could help with porting code to Python 3, the answer is "yes"; it's not essential but definitely would be helpful. Between Modernize, pylint, `python2.7 -3`, and `python3 -bb` you cover almost all of the issues that can arise in moving to Python 3. But notice that half of those tools are running your code under an interpreter with a certain flag flipped, which means run-time checks that require excellent test coverage. With type annotations you can do offline, static checking which is less reliant on your tests covering all corner cases. Depending on how the tools choose to handle representing str/unicode in Python 2/3 code (i.e., say that if you specify the type as 'str' it's an error and anything that is 'unicode' is considered the 'str' type in Python 3?), I don't see why mypy can't have a 2/3 compatibility mode that warns against uses of, e.g., the bytes type that don't directly translate between Python 2 and 3, like indexing. That kind of static warning would definitely be beneficial to anyone moving their code over as they wouldn't need to rely on e.g., `python3 -bb` and their tests to catch that common issue with bytes and indexing. There is also the benefit of gradual porting with this kind of offline checking. Since you can slowly add more type information, you can slowly catch more issues in your code. Relying on `python3 -bb`, though, requires you to have ported all of your code over first before running it under Python 3 to catch some issues.
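The bytes indexing issue mentioned above can be shown in a few lines. This is only a sketch: `first_byte` is a made-up helper, and the point is that slicing gives the same result on both major versions while plain indexing does not.

```python
def first_byte(data):
    # type: (bytes) -> bytes
    """Return the first byte as a length-1 bytes object on Python 2 and 3."""
    # data[0] is a length-1 str on Python 2 but an int on Python 3;
    # slicing avoids the difference.
    return data[0:1]
```

A checker that knows `data` is `bytes` could, in principle, flag `data[0]` in 2/3 compatible code without any test ever exercising that line.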

On 09.01.2016 00:04, Guido van Rossum wrote:
By using comments, the annotations would not be available at runtime via an .__annotations__ attribute and every tool would have to implement a parser for extracting them. Wouldn't it be better and more in line with standard Python syntax to use decorators to define them?

    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

This would work in Python 2 as well and could (optionally) add an .__annotations__ attribute to the function/method, automatically create a type annotations file upon import, etc.

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 09 2016)

On Sat, Jan 9, 2016 at 4:48 AM M.-A. Lemburg <mal@egenix.com> wrote:
The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to *deploy* code that looks at __annotations__). -gps
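For readers weighing the run-time cost being discussed, here is a rough sketch of what a `typehint` decorator could look like. No such decorator exists in the standard library; the name comes from the thread, and the `__typehint__` attribute is invented here purely for illustration.

```python
def typehint(signature):
    """Hypothetical decorator: attach a type string to the function object."""
    def decorate(func):
        # This runs once per decorated function at import time, which is
        # the extra call and attribute storage the thread is debating.
        func.__typehint__ = signature
        return func
    return decorate


@typehint("(str, int, *str) -> None")
def embezzle(account, funds=1000000, *fake_receipts):
    """Embezzle funds from account using fake receipts."""
    return None
```

The decorated function behaves as before, but each decoration costs a function call and an attribute at module load, whereas a `# type:` comment costs nothing at run time.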

On Jan 11, 2016, at 10:42, Gregory P. Smith <greg@krypto.org> wrote:
The goal of the # type: comments as described is to have this information for offline analysis of code, not to make it available at run time. Yes, a decorator syntax could be adopted if anyone needs that. I don't expect anyone does. Decorators and attributes would add run time cpu and memory overhead whether the information was going to be used at runtime or not (likely not; nobody is likely to deploy code that looks at __annotations__).
These same arguments were made against PEP 484 in the first place, and (I think rightly) dismissed. 3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations? Meanwhile, when _are_ annotations useful at runtime? Mostly during the kind of debugging that you'll be doing during something like a port from 2.x to 3.x. While you're still, by necessity, running under 2.x. If they're not useful there, it's hard to imagine why they'd be useful after the port is done, when you're deploying your 3.x code. So it seems like using decorators (or backporting the syntax, as Google has done) had better be acceptable for 2.7, or the PEP 484 design has a serious problem, and in a few months we're going to see Dropbox and Google and everyone else demanding a way to use type hinting without wasting memory on annotations at runtime in 3.x.

On Mon, Jan 11, 2016 at 12:22 PM, Andrew Barnert <abarnert@yahoo.com> wrote:
The way I recall it the argument was made against using decorators for PEP 484 and we rightly decided not to use decorators.
I'm not objecting to the memory overhead of using decorators, but to the execution time (the extra function call). And the scope for the proposal is much smaller -- while PEP 484 is the first step on a long road towards integrating gradual (i.e. OPTIONAL) typing into Python, the proposal on the table today is only meant for annotating Python 2.7 code so we can get rid of it more quickly.
I'm not sure how to respond to this -- I disagree with your prediction but I don't think either of us really has any hard data from experience yet. I am however going to be building the kind of experience that might eventually be used to decide this, over the next few years. The first step is going to introduce annotations into Python 2.7 code, and I know my internal customers well enough to know that convincing them that we should use decorators for annotations would be a much bigger battle than putting annotations in comments. Since I have many other battles to fight I would like this one to be as short as possible. So it seems like using decorators (or backporting the syntax, as Google has
Again, I disagree with your assessment but it's difficult to prove anything without hard data. One possible argument may be that Python 3 offers a large package of combined run-time advantages, with some cost that's hard to separate. However, for Python 2.7 there's either a run-time cost or there's no run-time cost -- there's no run-time benefit. And I don't want to have to calculate how many extra machines we'll need to provision in order to make up for the run-time cost. -- --Guido van Rossum (python.org/~guido)

On 11.01.2016 22:38, Guido van Rossum wrote:
To clarify: My suggestion to use a simple decorator with essentially the same syntax as proposed for the "# type: comments" was meant as *additional* allowed syntax, not necessarily as the only one to standardize. I'm a bit worried that by standardizing on using comments for these annotations only, we'll end up having people not use the type annotations because they simply don't like the style of having function bodies begin with comments instead of doc-strings. I certainly wouldn't want to clutter up my code like that. Tools parsing Python 2 source code may also have a problem with this (e.g. not recognize the doc-string anymore). This simply reads better, IMO:

    @typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

and it has the advantage of allowing the decorator to do additional things such as taking the annotations and writing out a type annotations file for Python 3 and other tools to use. We could also use a variant of the two proposals and additionally allow this syntax:

    #@typehint("(str, int, *str) -> None")
    def embezzle(self, account, funds=1000000, *fake_receipts):
        """Embezzle funds from account using fake receipts."""
        <code goes here>

to avoid memory and runtime overhead, if that's a problem. Moving from one to the other would then be a simple search&replace over the source code. Or we could have -O remove all those typehint decorator calls from the byte code to a similar effect.

Code written for Python 2 & 3 will have to stick to the proposed syntax for quite a while, so we should try to find something that doesn't introduce a new syntax variant of how to specify additional function/method properties, because people are inevitably going to start using the same scheme for all sorts of other crazy stuff and this would make Python code look closer to Java than necessary, IMO:

    @public_interface
    @rest_accessible
    @map_exceptions_to_http_codes
    def embezzle(self, account, funds=1000000, *fake_receipts):
        # type: (str, int, *str) -> None
        # raises: ValueError, TypeError
        # optimize: jit, inline_globals
        # tracebacks: hide_locals
        # reviewed_by: xyz, abc
        """Embezzle funds from account using fake receipts."""
        <code goes here>

-- Marc-Andre Lemburg eGenix.com Professional Python Services directly from the Experts (#1, Jan 11 2016)

On 1/11/2016 5:38 PM, M.-A. Lemburg wrote:
Code with type comments will run on any standard 2.7 interpreter. Code with an @typehint decorator will have to either run on a nonstandard interpreter or import 'typehint' from somewhere other than the stdlib or define 'typehint' at the top of the file or have the decorators stripped out before public distribution. To me, these options come close to making the decorator inappropriate as a core dev recommendation. However, the new section of the PEP could have a short paragraph that mentions @typehint(typestring) as a possible alternative (with the options given above) and recommend that if a decorator is used, then the name should be 'typehint' (or something else agreed on) and that the typestring should be a quoted version of what would follow '# type: ' in a comment, 'as already defined above' (in the previous recommendation). In other words, Guido's current addition has two recommendations: 1. the syntax for a typestring 2. the use of a typestring (append it to a '# type: ' comment) If a decorator alternative uses the same syntax, a checker would need just one typestring parser. I think the conditional recommendation would be within the scope of what is appropriate for us to do.
I have to admit that I was not fully cognizant before that a comment could precede a docstring. -- Terry Jan Reedy

On Jan 11, 2016, at 13:38, Guido van Rossum <guido@python.org> wrote:
Sure. But you also decided that the type information has to be there at runtime. Anyway, I don't buy GPS's argument, but I think I buy yours. Even if there are good reasons to have annotations at runtime, and they'd apply to debugging/introspecting/etc. code during a 2.7->3.6 port just as much as in new 3.6 work, I can see that they may not be worth _enough_ to justify the cost of extra runtime CPU (which can't be avoided in 2.7 the way it is in 3.6). And that, even if they were worth the cost, it may still not be worth trying to convince a team of that fact (especially without any hard information).
3.x code with annotations incurs a memory overhead, even though most runtime code is never going to use them. That was considered to be acceptable. So why isn't it acceptable for the same code before it's ported to 3.x? Or, conversely, if it isn't acceptable in 2.x, why isn't it a serious blocking regression that, once the port is completed and you're running under 3.x, you're now wasting memory for those useless annotations?
I'm not objecting to the memory overhead of using decorators,
OK, but GPS was. And he was also arguing that having annotations at runtime is useless. Which is an argument that was made against PEP 484, and considered and rejected at the time. Your argument is different, and seems convincing to me, but I can't retroactively change my reply to his email.

What about this?

    def embezzle(self, account: "PEP3107 annotation"):
        # type: (str) -> Any
        """Embezzle funds from account using fake receipts."""
        <code goes here>

---

And BTW in PEP484 text ->

    Functions with the @no_type_check decorator or with a # type: ignore comment should be treated as having no annotations.

could be probably? ->

    Functions with the @no_type_check decorator or with a # type: ignore comment should be treated as having no type hints.

On Mon, Jan 11, 2016 at 1:48 PM, Pavol Lisy <pavol.lisy@gmail.com> wrote:
I don't understand your proposal -- this is not valid Python 2.7 syntax so we cannot use it.
In the context of the PEP the latter interpretation is already implied, so I don't think I need to update the text. -- --Guido van Rossum (python.org/~guido)

2016-01-11 22:52 GMT+01:00, Guido van Rossum <guido@python.org>:
I had two things in my mind: 1. To suggest some possible impact in the future. While we are writing code compatible with both python2 and python3, we will have type hint comments under python3 too. And because they are more compatible, there is a risk(?) that they could become more popular than the original PEP484 (python3) proposal! 2. PEP484 describes a way to support other uses of annotations and proposes to use # type: ignore, but a similar method to preserve other uses of annotations could be (for example): # type: (str) -> Any and this could combine the goodness of type-hint tools and other types of annotations. At least in a deprecation period (if there will be any) for other annotation types.

On Mon, Jan 11, 2016 at 5:39 PM, Steven D'Aprano <steve@pearwood.info> wrote:
Actually my experience with -OO (and even -O) suggest that that's not a great model (e.g. it can't work with libraries like PLY that inspect docstrings). A better model might be to let people select this on a per module basis. Though I could also see a future where __annotations__ is a more space-efficient data structure than dict. Have you already run into a situation where __annotations__ takes up too much space? -- --Guido van Rossum (python.org/~guido)
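For context on what is being stored, this sketch shows the `__annotations__` dict that Python 3 attaches to an annotated function. The function is a cut-down version of the thread's running example, written without `self` so that it stands alone.

```python
def embezzle(account: str, funds: int = 1000000, *fake_receipts: str) -> None:
    """Embezzle funds from account using fake receipts."""


# One dict per annotated function, keyed by parameter name plus 'return'.
hints = embezzle.__annotations__
```

Every annotated function carries one such dict, which is the memory under discussion; the `# type:` comment form creates no such object.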

On Mon, Jan 11, 2016 at 08:38:59PM -0800, Guido van Rossum wrote:
Not as such, but it does seem an obvious and low-impact place to save some memory. Like doc strings, they're rarely used at runtime outside of the interactive interpreter. But your suggestion sounds more useful. -- Steve

I really like MAL's variation much better. Being able to see .__annotations__ at runtime feels like an important feature that we'd give up with the purely comment style. -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

On Mon, Jan 11, 2016 at 1:50 PM David Mertz <mertz@gnosis.cx> wrote:
I'd like people who demonstrate practical important production uses for having .__annotations__ information available at runtime to champion that. Both Google and Dropbox are looking at it as only being meaningful in the offline code analysis context. Even our (Google's) modified 2.7 with annotation grammar backported is just that, grammar only, no .__annotations__ or even validation of names while parsing. It may as well be a # type: comment. We explicitly chose not to use decorators due to their resource usage side effects. 2.7.x itself understandably is... highly unlikely to be modified... to put it lightly. So a backport of ignored annotation syntax is a non-starter there. In that sense I think the # type: comments are fine and are pretty much what I've been expecting to see. The only other alternative not yet mentioned would be to put the information in the docstring. But that has yet other side effects and challenges. So the comments make a lot of sense to recommend for Python 2 within the PEP. .__annotations__ isn't something any Python 2 code has ever had in the past. It can continue to live without it. I do not believe we need to formally recommend a decorator and its implementation in the PEP. (read another way: I do not expect Guido to do that work... but anyone is free to propose it and see if anyone else wants to adopt it) -gps

On 1/8/2016 6:04 PM, Guido van Rossum wrote:
I find this separate signature line to be at least as readable as the intermixed 3.x version. I noticed the same thing as Lemburg (no runtime .__annotations__ attributes), but am not sure whether adding them in 2.x code is a good or bad thing.
To me, really needed.
Since I am personally pretty much done with 2.x, the details do not matter to me, but I think a suggested standard approach is a good idea. I also think a new informational PEP, with a reference added to 484, would be better. 'Type hints for 2.x and 2&3 code' For a helpful tool, I would at least want something that added a template comment, without dummy 'Any's to be erased, to each function. # type: (, , *) -> A GUI with suggestions from both type-inferencing and from a name -> type dictionary would be even nicer. Name to type would work really well for a project with consistent use of parameter names. -- Terry Jan Reedy
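The template idea above could be prototyped along these lines. This is only a sketch under stated assumptions: `type_comment_template` is a hypothetical helper, it uses Python 3's `ast` module to read the signature, it looks only at the first function in the source, and it drops a leading 'self' or 'cls' as the proposal describes.

```python
import ast


def type_comment_template(func_source):
    """Build an empty '# type:' template for the first function in the source."""
    tree = ast.parse(func_source)
    func = next(n for n in ast.walk(tree) if isinstance(n, ast.FunctionDef))
    names = [a.arg for a in func.args.args]
    # 'self' and 'cls' are not listed in the type comment.
    if names and names[0] in ("self", "cls"):
        names = names[1:]
    slots = ["" for _ in names]      # one blank slot per ordinary argument
    if func.args.vararg is not None:
        slots.append("*")            # *args slot
    if func.args.kwarg is not None:
        slots.append("**")           # **kwds slot
    return "# type: (" + ", ".join(slots) + ") -> "


SOURCE = (
    "def embezzle(self, account, funds=1000000, *fake_receipts):\n"
    "    pass\n"
)
template = type_comment_template(SOURCE)
```

For the thread's running example this yields the same blank template shown above, ready to be filled in by hand or by an inference tool.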

participants (14)
- Andrew Barnert
- Brett Cannon
- David Mertz
- Eric Fahlgren
- Gregory P. Smith
- Guido van Rossum
- Jim Baker
- M.-A. Lemburg
- Matthias Kramm
- Nick Coghlan
- Pavol Lisy
- Robert McGibbon
- Steven D'Aprano
- Terry Reedy