Optional static typing -- the crossroads
I have read pretty much the entire thread up and down, and I don't think I can keep up with responding to every individual piece of feedback. (Also, a lot of responses cancel each other out. :-) I think there are three broad categories of questions to think about next. (A) Do we even need this? (B) What syntax to use? (C) Does/should it support &lt;feature X&gt;? Taking these in turn:

(A) Do we even need a standard for optional static typing?

Many people have shown either support for the idea, or pointed to some other system that addresses the same issue. On the other hand, several people have claimed that they don't need it, or that they worry it will make Python less useful for them. (However, many of the detractors seem to have their own alternative proposal. :-) In the end I don't think we can ever know for sure -- but my intuition tells me that as long as we keep it optional, there is a real demand. In any case, if we don't start building something we'll never know whether it'll be useful, so I am going to take a leap of faith and continue to promote this idea.

I am going to make one additional assumption: the main use cases will be linting, IDEs, and doc generation. These all have one thing in common: it should be possible to run a program even though it fails to type check. Also, adding types to a program should not hinder its performance (nor will it help :-).

(B) What syntax should a standard system for optional static typing use?

There are many interesting questions here, but at the highest level there are a few choices that constrain the rest of the discussion, and I'd like to start with these. I see three or four "families" of approaches, and I think the first order of business is to pick a family.

(1) The mypy family. (http://mypy-lang.org/) This is characterized by its use of PEP 3107 function annotations and the constraint that its syntax must be valid (current) Python syntax that can be evaluated without errors at function definition time.
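For concreteness, a minimal sketch of what family (1) looks like in practice -- the annotations below are ordinary Python expressions, evaluated once when the def statement executes and otherwise ignored by the interpreter (the function itself is just an illustrative example):

```python
# Family (1): annotations are plain PEP 3107 annotations, valid Python
# that evaluates cleanly at function-definition time. A checker reads
# them; the interpreter attaches them to the function and moves on.
def gcd(a: int, b: int) -> int:
    """Greatest common divisor, annotated mypy-style."""
    while b:
        a, b = b, a % b
    return a

# The annotations remain available at runtime for introspection:
print(gcd.__annotations__)
# {'a': <class 'int'>, 'b': <class 'int'>, 'return': <class 'int'>}
```

Note that nothing here enforces the types at run time; calling gcd with floats would still "work", which is exactly the linting-not-enforcement model described above.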
However, mypy also supports collecting annotations in separate "stub" files; this is how it handles annotations for the stdlib and C extensions. When mypy annotations occur inline (not in a stub file) they are used to type check the body of the annotated function as well as serving as input for type checking its callers.

(2) The pytypedecl family. (https://github.com/google/pytypedecl) This is a custom syntax that can only be used in separate stub files. Because it is not constrained by Python's current syntax, its syntax is slightly more elegant than mypy's.

(3) The PyCharm family. (http://www.jetbrains.com/pycharm/webhelp/using-docstrings-to-specify-types.h...) This is a custom syntax that lives entirely in docstrings. There is also a way to use stub files with this. (In fact, every viable approach has to support some form of stub files, if only to describe signatures for C extensions.)

(I suppose we could add a 4th family that puts everything in comments, but I don't think anyone is seriously working on such a thing, and I don't see any benefits.)

There's also a variant of (1) that Łukasz Langa would like to see -- use the syntactic position of function annotations but with a custom syntax (e.g. one similar to the pytypedecl syntax) that isn't evaluated at function-definition time. This would have to use "from __future__ import <something>" for backward compatibility. I'm skeptical about this though; it is only slightly more elegant than mypy, and it would open the floodgates of unconstrained language design.

So how to choose? I've read passionate attacks and defenses of each approach. I've got a feeling that the three projects aren't all that different in maturity (all are well beyond the toy stage, none are quite ready for prime time). In terms of specific type system features (e.g. forward references, generic types, duck typing) I expect they are all acceptable, and all probably need some work (and there's no reason to assume that work can't be done).
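For flavor, the stub mechanism that all the families share boils down to signatures without bodies. A hedged sketch in mypy-style syntax -- the function names below are invented for illustration and don't come from any real stub file:

```python
# A stub gives a checker signatures for code it cannot read
# (C extensions, opaque 3rd-party modules). Bodies are empty;
# only the declared types matter. Names here are hypothetical.
def compress(data: bytes, level: int = 9) -> bytes: ...

def crc32(data: bytes, value: int = 0) -> int: ...
```

A checker type checks *callers* against these declarations without ever seeing the implementation.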
All support stubs so you can specify signatures for code you can't edit (whether C extension, stdlib or just opaque 3rd party code).

To me there is no doubt that (1) is the most Pythonic approach. When we discussed PEP 3107 (function annotations) it was always my goal that these would eventually be used for type annotations. There was no consensus at the time on what the rules for type checking should be, but their syntactic position was never in doubt. So we decided to introduce "annotations" in Python 3 in the hope that 3rd party experiments would eventually produce something satisfactory. Mypy is one such experiment. One of the important lessons I draw from mypy is that type annotations are most useful to linters, and should (normally) not be used to enforce types at run time. They are also not useful for code generation. None of that was obvious when we were discussing PEP 3107!

I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics. It promises compatibility, and I take that very seriously, but I think it is reasonable to eventually deprecate other uses of annotations -- there aren't enough significant other uses for them to warrant crippling type annotations forever. In the meantime, we won't be breaking existing use of annotations -- but they may confuse a type checker, whether a stand-alone linter like mypy or built into an IDE like PyCharm, and that may serve as an encouragement to look for a different solution.

Most of the thornier issues brought up against mypy wouldn't go away if we adopted another approach: whether to use concrete or abstract types, the use of type variables, how to define type equivalence, the relationship between a list of ints and a list of objects, how to spell "something that implements the buffer interface", what to do about JSON, binary vs. text I/O and the signature of open(), how to check code that uses isinstance(), how to shut up the type checker when you know better...
The list goes on. There will be methods whose type signature can't be spelled (yet). There will be code distributed with too narrowly defined types. Some programmers will uglify their code to please the type checker.

There are questions about what to do for older versions of Python. I find mypy's story here actually pretty good -- the mypy codec may be a hack, but so is any other approach. Only the __future__ approach really loses out here, because you can't add a new __future__ import to an old version.

So there you have it. I am picking the mypy family and I hope we can start focusing on specific improvements to mypy. I also hope that somebody will write converters from pytypedecl and PyCharm stubs into mypy stubs, so that we can reuse the work already put into stub definitions for those two systems. And of course I hope that PyCharm and pytypedecl will adopt mypy's syntax (initially in addition to their native syntax, eventually as their sole syntax).

PS. I realize I didn't discuss question (C) much. That's intentional -- we can now start discussing specific mypy features in separate threads (or in this one :-).

-- --Guido van Rossum (python.org/~guido)
On 15 August 2014 09:56, Guido van Rossum
I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):

* Add optional declarations for static typing [45] [10] [done]

[10] Guido's blog ("Python Optional Typechecking Redux") http://www.artima.com/weblogs/viewpost.jsp?thread=89161
[45] PEP 3107 (Function Annotations) http://www.python.org/dev/peps/pep-3107
It promises compatibility, and I take that very seriously, but I think it is reasonable to eventually deprecate other uses of annotations -- there aren't enough significant other uses for them to warrant crippling type annotations forever. In the meantime, we won't be breaking existing use of annotations -- but they may confuse a type checker, whether a stand-alone linter like mypy or built into an IDE like PyCharm, and that may serve as an encouragement to look for a different solution.
Linters/checkers may also want to provide a configurable way to say "the presence of decorator <X> means the annotations on that function aren't type markers". That ties in with the recommendation we added to PEP 8 a while back: "It is recommended that third party experiments with annotations use an associated decorator to indicate how the annotation should be interpreted."
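A minimal sketch of what such a decorator convention might look like -- the decorator name and attribute here are invented for illustration, not part of any existing standard:

```python
def non_type_annotations(func):
    """Hypothetical marker decorator: tells checkers that the
    annotations on *func* are not type declarations."""
    func.__non_type_annotations__ = True
    return func

@non_type_annotations
def greet(name: "the user-visible display name") -> "a greeting string":
    return "Hello, " + name

# A runtime tool could then test for the marker before interpreting
# the annotations as types:
print(getattr(greet, "__non_type_annotations__", False))  # True
```

A source-level linter would instead look for the decorator name in the AST and apply the configured interpretation.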
So there you have it. I am picking the mypy family and I hope we can start focusing on specific improvements to mypy. I also hope that somebody will write converters from pytypedecl and PyCharm stubs into mypy stubs, so that we can reuse the work already put into stub definitions for those two systems. And of course I hope that PyCharm and pytypedecl will adopt mypy's syntax (initially in addition to their native syntax, eventually as their sole syntax).
Having Argument Clinic generate appropriate annotations automatically could also be interesting. Regards, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Thu, Aug 14, 2014 at 9:40 PM, Nick Coghlan
On 15 August 2014 09:56, Guido van Rossum
wrote: I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):
* Add optional declarations for static typing [45] [10] [done]
[10] Guido's blog ("Python Optional Typechecking Redux") http://www.artima.com/weblogs/viewpost.jsp?thread=89161 [45] PEP 3107 (Function Annotations) http://www.python.org/dev/peps/pep-3107
Such youthful optimism. :-)
Having Argument Clinic generate appropriate annotations automatically could also be interesting.
How much of the 3.5 stdlib is currently covered by Argument Clinic? I thought there's still a lot left to do. Might it be possible to convert some of pytypedecl's stubs into AC stubs? Alternatively, the AC info could be turned into mypy stubs. I'm just really hoping that between AC, pytypedecl, PyCharm and mypy we have specs for most builtins and extension modules in machine-readable form already, and we could use this combined information to bootstrap mypy's collection of stubs. -- --Guido van Rossum (python.org/~guido)
On Thu Aug 14 2014 at 9:47:45 PM Guido van Rossum
On Thu, Aug 14, 2014 at 9:40 PM, Nick Coghlan
wrote: On 15 August 2014 09:56, Guido van Rossum
wrote: I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):
* Add optional declarations for static typing [45] [10] [done]
[10] Guido's blog ("Python Optional Typechecking Redux") http://www.artima.com/weblogs/viewpost.jsp?thread=89161 [45] PEP 3107 (Function Annotations) http://www.python.org/dev/peps/pep-3107
Such youthful optimism. :-)
Having Argument Clinic generate appropriate annotations automatically could also be interesting.
How much of the 3.5 stdlib is currently covered by Argument Clinic? I thought there's still a lot left to do. Might it be possible to convert some of pytypedecl's stubs into AC stubs? Alternatively, the AC info could be turned into mypy stubs. I'm just really hoping that between AC, pytypedecl, PyCharm and mypy we have specs for most builtins and extension modules in machine-readable form already, and we could use this combined information to bootstrap mypy's collection of stubs.
I believe we have partially intersecting subsets of builtins and stdlib coverage; union them all together and sanity-check them and it's a good start, but there will likely still be giant holes to be filled in. We've been concentrating on 2.7 with the code analysis that generates pytypedecl pytd's, but have always assumed that Argument Clinic would be useful in providing annotation details for 3.4 onwards. Should it generate annotation files itself? Possibly, but I'm not sure it is expressive enough to generate an ideal annotation. To start with, I'd leave generating annotations for Python builtins, extensions and internals out of CPython itself in 3.5. Such things can be pulled in with tools to generate them in a later release, once we're happy it is easy to maintain via the tools without much human tweaking being required. -gps
On 8/15/2014 12:40 AM, Nick Coghlan wrote:
On 15 August 2014 09:56, Guido van Rossum
wrote: I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):
* Add optional declarations for static typing [45] [10] [done] ... Linters/checkers may also want to provide a configurable way to say "the presence of decorator <X> means the annotations on that function aren't type markers". That ties in with the recommendation we added to PEP 8 a while back: "It is recommended that third party experiments with annotations use an associated decorator to indicate how the annotation should be interpreted."
Depending on the checker, this suggests that non-type-check annotations need not be deprecated. If a decorator wraps a function with an unannotated wrapper, then the checker should see the result as unannotated, rather than looking for a wrapped attribute. Also, a decorator can remove non-type annotations and act on them, store them in a closure variable, or store them on the function under a different name. For example:
def doodad(f):
    f.doodad = f.__annotations__
    f.__annotations__ = {}
    return f
@doodad
def f(x: 'arg doodad') -> 'return:doodad':
    pass
>>> f.__annotations__
{}
>>> f.doodad
{'x': 'arg doodad', 'return': 'return:doodad'}
Given these possibilities, all that needs to be said is "After a function is post-processed by decorators, any remaining annotations should be for type-checking or documentation." For checkers that do look at the source, or the AST before compiling, the rule could be to ignore string annotations. Decorators can always eval, or perhaps safe_eval, strings. -- Terry Jan Reedy
On 15 August 2014 19:38, Terry Reedy
On 8/15/2014 12:40 AM, Nick Coghlan wrote:
On 15 August 2014 09:56, Guido van Rossum
wrote: I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):
* Add optional declarations for static typing [45] [10] [done]
...
Linters/checkers may also want to provide a configurable way to say "the presence of decorator <X> means the annotations on that function aren't type markers". That ties in with the recommendation we added to PEP 8 a while back: "It is recommended that third party experiments with annotations use an associated decorator to indicate how the annotation should be interpreted."
Depending on the checker, this suggests that non-type-check annotations need not be deprecated. If a decorator wraps a function with an unannotated wrapper, then the checker should see the result as unannotated, rather than looking for a wrapped attribute. Also, a decorator can remove non-type annotations and act on them, store them in a closure variable, or store them on the function in a different name.
No, many (most?) linters and IDEs will run off the AST without actually executing the code, so they'll see the annotations, even if they get stripped by the decorator at runtime. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
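Nick's point is easy to demonstrate with the stdlib ast module -- a source-level tool sees the annotation expressions no matter what any decorator later does to __annotations__ (this is a sketch, not how any particular linter is actually implemented):

```python
import ast

source = """
@doodad  # a decorator that strips annotations at runtime
def f(x: 'arg doodad') -> 'return doodad':
    pass
"""

# The source is only parsed, never executed, so the undefined
# decorator name is irrelevant here.
tree = ast.parse(source)
func = tree.body[0]  # the FunctionDef node, decorator and all

# The annotation expressions sit right there in the AST, untouched:
arg_note = func.args.args[0].annotation.value   # 'arg doodad'
ret_note = func.returns.value                   # 'return doodad'
print(arg_note, "/", ret_note)
```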
As a lurker who barely reads this list at all, let me just add here that many of the typing-related questions that the BDFL presented as follow-up are, in my opinion, questions worth asking even outside of the annotation issue. (Mini wall of text incoming:)

What I mean is, one of my grievances with Python is that the type hierarchy is poorly defined and difficult to use, in that the "types" presented in collections.abc in no way whatsoever interact with the builtins, and considering how many library and user types inherit from those ABCs (as is the purpose of those ABCs), it seems to me like a rather serious issue with using the type system. Consider the following:
>>> from collections import abc as types
>>> isinstance(dict, types.Mapping)
False
>>> isinstance(types.Mapping, dict)
False
>>> isinstance(list, types.MutableSequence)
False
>>> isinstance(types.MutableSequence, list)
False
>>> isinstance(list, types.Sized)
False
Furthermore:
>>> class DictThing(types.MutableMapping):  # easier to subclass
...     pass
...
>>> isinstance(DictThing, dict)
False
>>> class DicterThing(dict):  # simpler
...     pass
...
>>> isinstance(DicterThing, types.MutableMapping)
False
And finally:
>>> from collections import defaultdict
>>> isinstance(defaultdict, dict)
False
>>> isinstance(defaultdict, types.MutableMapping)
False
My conclusion is that to make lint-style type-checkers worth their salt, the Python type hierarchy needs to be fixed and properly integrated. There is currently no obvious way to check if an object either is a list or behaves like a list (the duck typing philosophy equates the two). (The current shortest way is "isinstance(myvar, (list, types.MutableSequence))", but that's not very obvious or Pythonic IMO.) Before now I haven't bothered to put my thoughts to words, but this is pretty decent motivation.
---------------------------------------------------------------------------
If people agree with my conclusion, then the ideal solution would be to somehow merge the builtin types and the collections.abc types into one single hierarchy, one single isinstance check, but that's probably impossible. The next thing would be to make one hierarchy the appropriate subclasses of the other hierarchy, but either of those solutions has its own problems. There are probably better solutions out there.
--Bill
On Fri, Aug 15, 2014 at 4:48 AM, Nick Coghlan
On 8/15/2014 12:40 AM, Nick Coghlan wrote:
On 15 August 2014 09:56, Guido van Rossum
wrote: I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):
* Add optional declarations for static typing [45] [10] [done]
...
Linters/checkers may also want to provide a configurable way to say "the presence of decorator <X> means the annotations on that function aren't type markers". That ties in with the recommendation we added to PEP 8 a while back: "It is recommended that third party experiments with annotations use an associated decorator to indicate how the annotation should be interpreted."
On 15 August 2014 19:38, Terry Reedy wrote:

Depending on the checker, this suggests that non-type-check annotations need not be deprecated. If a decorator wraps a function with an unannotated wrapper, then the checker should see the result as unannotated, rather than looking for a wrapped attribute. Also, a decorator can remove non-type annotations and act on them, store them in a closure variable, or store them on the function in a different name.
No, many (most?) linters and IDEs will run off the AST without actually executing the code, so they'll see the annotations, even if they get stripped by the decorator at runtime.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On 15 August 2014 21:01, Bill Winslow
Consider the following:
>>> from collections import abc as types
>>> isinstance(dict, types.Mapping)
False
>>> isinstance(types.Mapping, dict)
False
>>> isinstance(list, types.MutableSequence)
False
>>> isinstance(types.MutableSequence, list)
False
>>> isinstance(list, types.Sized)
False
You're doing instance checks on subclasses - that's never going to work. Once you account for the type/instance distinction, you can see everything is correctly registered:
>>> from collections import abc as cabc
>>> issubclass(dict, cabc.Mapping)
True
>>> issubclass(list, cabc.Sequence)
True
>>> issubclass(list, cabc.Sized)
True
>>> isinstance(dict(), cabc.Mapping)
True
>>> isinstance(list(), cabc.Sequence)
True
>>> isinstance(list(), cabc.Sized)
True
(in Python 2, a couple of builtins claim ABCs they don't actually implement fully, but that's addressed in newer versions of Python 3) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
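For anyone curious how the builtins end up passing these checks without inheriting from the ABCs: the mechanism is ABC registration, i.e. virtual subclassing. A small sketch -- JsonLike is a made-up ABC for illustration, but register() is exactly how dict, list and friends are claimed by collections.abc:

```python
from abc import ABCMeta
from collections import abc as cabc

# The builtins really are wired up, via subclass/instance checks:
assert issubclass(dict, cabc.Mapping)
assert isinstance({}, cabc.MutableMapping)

# The wiring is ABC registration: a class becomes a "virtual"
# subclass without inheriting anything.
class JsonLike(metaclass=ABCMeta):
    pass

JsonLike.register(dict)
JsonLike.register(list)

print(issubclass(dict, JsonLike), isinstance([], JsonLike))  # True True
```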
Oh dear lord, it's like a nightmare where you're late to class without
clothing... except it actually happened... I knew something like this would
happen when I decided to actually do something.
On Fri, Aug 15, 2014 at 6:50 AM, Nick Coghlan
On 15 August 2014 21:01, Bill Winslow
wrote: Consider the following:
>>> from collections import abc as types
>>> isinstance(dict, types.Mapping)
False
>>> isinstance(types.Mapping, dict)
False
>>> isinstance(list, types.MutableSequence)
False
>>> isinstance(types.MutableSequence, list)
False
>>> isinstance(list, types.Sized)
False
You're doing instance checks on subclasses - that's never going to work. Once you account for the type/instance distinction, you can see everything is correctly registered:
>>> from collections import abc as cabc
>>> issubclass(dict, cabc.Mapping)
True
>>> issubclass(list, cabc.Sequence)
True
>>> issubclass(list, cabc.Sized)
True
>>> isinstance(dict(), cabc.Mapping)
True
>>> isinstance(list(), cabc.Sequence)
True
>>> isinstance(list(), cabc.Sized)
True
(in Python 2, a couple of builtins claim ABCs they don't actually implement fully, but that's addressed in newer versions of Python 3)
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 15 August 2014 21:53, Bill Winslow
Oh dear lord, it's like a nightmare where you're late to class without clothing... except it actually happened... I knew something like this would happen when I decided to actually do something.
If it helps any, I had to type it into the interactive interpreter and get very confused for a moment before I realised what had happened. And I *knew* they were integrated, because I'd helped fix some of the bugs with the ABC non-conformance :) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Fri, Aug 15, 2014 at 7:23 AM, Bill Winslow
Oh dear lord, it's like a nightmare where you're late to class without clothing... except it actually happened... I knew something like this would happen when I decided to actually do something.
Don't worry. It's a too common happening, in one of the few places in which Python has no Zen. -- Juancarlo *Añez*
On 8/15/2014 5:48 AM, Nick Coghlan wrote:
On 15 August 2014 19:38, Terry Reedy
wrote: On 8/15/2014 12:40 AM, Nick Coghlan wrote:
On 15 August 2014 09:56, Guido van Rossum
wrote: I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics.
It's also worth noting the corresponding bullet point in PEP 3100 (under http://www.python.org/dev/peps/pep-3100/#core-language):
* Add optional declarations for static typing [45] [10] [done]
...
Linters/checkers may also want to provide a configurable way to say "the presence of decorator <X> means the annotations on that function aren't type markers". That ties in with the recommendation we added to PEP 8 a while back: "It is recommended that third party experiments with annotations use an associated decorator to indicate how the annotation should be interpreted."
I claim that the mere presence of a decorator in the *source* is not enough. The decorator for non-type markers should do something with .__annotations__ -- in particular, remove the non-type markers. The presence of a decorator in the source does not necessarily leave a trace on the returned function object for runtime detection.
Depending on the checker, this suggests that non-type-check annotations need not be deprecated. If a decorator wraps a function with an unannotated wrapper, then the checker should see the result as unannotated, rather than looking for a wrapped attribute. Also, a decorator can remove non-type annotations and act on them, store them in a closure variable, or store them on the function in a different name.
"Depending on the checker" alludes to the fact that there are two types of annotation consumers: those that read the source or parsed source (ast) and the annotations therein, and those that look at the function .__annotations__ attribute after compilation. inspect.signature is an example of the latter. (If there were none, there would be no purpose to .__annotations__!)

Suppose an integer square root function were annotated with type and non-type information.
def nsqrt(n: (int, 'random non-type info')) -> (int, 'more non-type info'):
    pass
To me, inspect.signature already assumes that annotations are about type, which means that this horse has already left the barn. In 3.4.1:
>>> from inspect import signature as sig
>>> str(sig(nsqrt))
"(n:(<class 'int'>, 'random non-type info')) -> (<class 'int'>, 'more non-type info')"
Typing 'nsqrt(' in Idle and pausing a fraction of a second brings up a calltip with the same string (without the outer quotes). To me, having random non-type info in the signature and calltip is noise and therefore wrong. So I agree that the non-standard annotation should be signaled by a decorator *and* suggest that the decorator should remove the non-standard annotation, which is easily done, so that the signature string for the above would be "(n:<class 'int'>) -> <class 'int'>".
No, many (most?) linters and IDEs will run off the AST without actually executing the code, so they'll see the annotations, even if they get stripped by the decorator at runtime.

The future relative proportion of pre- and post-compile annotation consumers is not relevant to my argument. The stdlib already has an important post-compile consumer that is already somewhat broken by non-type info remaining in .__annotations__.
Being aware of this, I concluded the post with the following, which already said this: "For checkers that do look at the source, or the AST before compiling," *and* I went on to suggest a solution: "the rule could be to ignore string annotations. Decorators can always eval, or perhaps safe_eval, strings."

In other words, if type annotations were to be classes, as proposed, then non-type annotations should not be classes, so that pre-compile annotation consumers could easily ignore them. In particular, I suggested literal strings, which are easily recognized in source, as well as in asts.

To put this all another way -- the new-in-3.0 annotation feature has two components: a Python function source syntax, and a function compile behavior of adding a dict attribute as .__annotations__. If I understand correctly, Argument Clinic piggybacks on this by adding a mechanism to produce .__annotations__ from C source -- mainly for use by .signature, but also by any other .__annotations__ users. The two components -- source annotations and the .__annotations__ dict -- each have their consumers.

Currently, annotation values are untyped (or AnyTyped). Guido has proposed favoring type annotations. I support that and suggest that such favoritism is already needed for the annotation dict. So I would strengthen the PEP 8 recommendation. Guido has also suggested 'deprecating' non-type annotations. That would literally mean raising an error either when source is compiled or when def statements are executed. I think deprecation in this sense is both unwise, since non-type annotations were explicitly invited, and unnecessary for the purpose of favoring type annotations. The point of my previous post was to explore what restrictions *are* necessary.

In summary, I suggest:
1. use distinct syntax (this depends on what is adopted for type annotations);
2. decorate (as already suggested in PEP 8);
3. clean .__annotations__ (which should also be suggested in PEP 8).

-- Terry Jan Reedy
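Terry's "ignore string annotations" rule is straightforward to express at the AST level. A sketch -- purely illustrative, not any existing checker's behavior:

```python
import ast

# One type annotation (int) and one string annotation the rule
# says a checker should skip.
source = "def nsqrt(n: int, hint: 'random non-type info') -> int: pass"

func = ast.parse(source).body[0]
verdict = {}
for arg in func.args.args:
    note = arg.annotation
    # String literals are opaque to the checker; anything else
    # is treated as type information.
    is_string = isinstance(note, ast.Constant) and isinstance(note.value, str)
    verdict[arg.arg] = "ignore" if is_string else "type-check"

print(verdict)  # {'n': 'type-check', 'hint': 'ignore'}
```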
On 15 August 2014 23:18, Terry Reedy
I claim that the mere presence of a decorator in the *source* is not enough. The decorator for non-type markers should do something with .__annotations__ -- in particular, remove the non-type markers. The presence of a decorator in the source does not necessarily leave a trace on the returned function object for runtime detection.
The way I understand it, mypy, which is what Guido's proposal sees as its main potential user, operates at a stage similar to compilation in CPython. At which point anything after the colon in a parameter would be in its equivalent of __annotations__, regardless of what decorators add or remove. How would your other-annotation-removing decorator help mypy at all? By having it examine the decorator source and infer what kind of operation the decorator does? That doesn't sound very compile-stagey at all.
To me, inspect.signature already assumes that annotations are about type, which means that this horse has already left barn. In 3.4.1:
>>> from inspect import signature as sig
>>> str(sig(nsqrt))
"(n:(<class 'int'>, 'random non-type info')) -> (<class 'int'>, 'more non-type info')"
`inspect.signature` makes no such assumption; it only relays what it found on the function's __annotations__ attribute. I don't know where this nsqrt function comes from, but it is responsible for having set up those annotations, which seem to be mere documentation if the "random non-type info" is a string.
"the rule could be to ignore string annotations. Decorators can always eval, or perhaps safe_eval, strings."
In other words, if type annotations were to be classes, as proposed, then non-type annotations should not be classes, so that pre-compile annotation consumers could easily ignore them. In particular, I suggested literal strings, which are easily recognized in source, as well as in asts.
"Oh, everything that's not type-checking can be expressed in a string, maybe to be eval'ed, say, with sys._getframe(-1) locals." No, that's incredibly ugly and may not work on alternate implementations of Python.

If a PEP is made to standardize typing attributes for annotations, perhaps in the form of a typing type hierarchy or a type decorator (i.e. typing.Like(AClass)), or annotation namespacing of some sort (decorate the function with "hey! please typecheck me!"?), then couldn't tools that rely on those attributes pick out what's relevant to them on their own, thanks to the standardization?

I find the whole idea of having so-far equal uses of function annotations be bullied aside for the good of a concept yet foreign to Python very arrogant and unnecessary.
On Fri, Aug 15, 2014 at 05:18:11PM -0400, Terry Reedy wrote: [...]
"Depending on the checker" alludes to the fact that there are two types of annotation consumers: those that read the source or parsed source (ast) and the annotations therein and those that look at the function .__annotations__ attribute after compilation. Inspect.signature is an example of the latter. (If there were none, there would be no purpose to .__annotations__!)
Agreed.
Suppose an integer square root function were annotated with type and non-type information.
def nsqrt(n: (int, 'random non-type info')) -> (int, 'more non-type info'):
    pass
I don't think it is reasonable to expect arbitrary annotation tools to interoperate, unless they are specifically designed to interoperate. If tool A expects annotations to be a certain thing, and tool B expects them to be a different thing, they are going to confuse each other.

We need a convention so that tools which expect to work with annotations can identify which annotations are aimed at them. E.g. tool A might decorate the function with a marker that says "A", so that tool B knows to skip those functions. And vice versa. In the absence of any such marker, annotations can be assumed to be standard mypy-style type annotations. (The nature of this marker probably should be standardized. I suggest a key/item in __annotations__, where the key cannot clash with parameter names.)

The easiest way to apply that marker is with a decorator: perhaps the typing module could provide a standard decorator that all annotation tools can recognise at compile-time:

@register_annotations(A.marker)  # for example
def function(x: "something understandable by A"):
    ...

# inside module A
marker = object()

def introspect(func):
    if magic(func) is marker:
        ...  # okay to operate on func
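One possible concrete reading of the sketch above -- the key name and the helper standing in for magic() are invented for illustration -- stores the marker in __annotations__ under a key that can never clash with a parameter name:

```python
# Hypothetical convention: a tool stamps its marker into
# __annotations__ under a key no parameter could be named.
MARKER_KEY = "*annotation-owner*"   # not a valid identifier, so no clash

def register_annotations(marker):
    def decorate(func):
        func.__annotations__[MARKER_KEY] = marker
        return func
    return decorate

A_MARKER = object()   # stands in for "module A's marker"

@register_annotations(A_MARKER)
def function(x: "something understandable by A"):
    return x

def owned_by(func, marker):
    """The 'magic(func)' from the sketch: is func marked for us?"""
    return func.__annotations__.get(MARKER_KEY) is marker

print(owned_by(function, A_MARKER))  # True
```

Any tool that does not find its own marker would leave the function's annotations alone, which is the interoperability property being asked for.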
To me, inspect.signature already assumes that annotations are about type, which means that this horse has already left the barn. In 3.4.1:
I don't see how that follows. Your example demonstrates that inspect treats annotations as arbitrary Python expressions (which is what they are). You annotate the n parameter with the expression (int, "random non-type info"), which is a tuple of two objects. And signature dutifully reports that:
>>> from inspect import signature as sig
>>> str(sig(nsqrt))
"(n:(<class 'int'>, 'random non-type info')) -> (<class 'int'>, 'more non-type info')"

Typing 'nsqrt(' in Idle and pausing a fraction of a second brings up a calltip with the same string (without the outer quotes). To me, having random non-type info in the signature and calltip is noise and therefore wrong.
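Terry's observation is easy to reproduce: inspect.signature simply renders whatever objects the annotations evaluate to, type or not.

```python
import inspect

def nsqrt(n: (int, 'random non-type info')) -> (int, 'more non-type info'):
    pass

# signature() reports the annotation objects verbatim, including the
# non-type strings -- it has no notion of what a "type" annotation is.
rendered = str(inspect.signature(nsqrt))
```

The exact rendering varies slightly between Python versions, but the non-type strings always appear in it.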
Then don't put random info in the annotations :-) If Idle has documented that the calltip is *always* type information, then Idle is wrong. Currently, there is no standard interpretation of function annotations, and if Idle documentation says otherwise, it is the documentation which is wrong. In the future, I can see that Idle might want to only display calltips that it knows contain type information, or perhaps show them slightly differently if they are not type annotations. Or perhaps not... see below.
So I agree that the non-standard annotation should be signaled by a decorator
I agree whole-heartedly to this.
*and* suggest that the decorator should remove the non-standard annotation, which is easily done, so that the signature string for the above would be
But I disagree equally as strongly to this. Other uses of annotations are just as valid as static typing, and may equally want to be available for runtime introspection. All we need is some sort of standardised runtime marker whereby tools can decide whether or not they should use the annotations.
The future relative proportion of pre- and post-compile annotation consumers is not relevant to my argument. The stdlib already has an important post-compile consumer that is already somewhat broken by non-type info remaining in .__annotations__.
I think it is only broken if you treat Idle calltips as displaying *types*. If you treat Idle calltips as displaying *annotations* no matter what the nature of those annotations, then it is not broken in the least. You yourself call them *call* tips, not "type tips", so there could be useful information provided other than the types of arguments. Consider two (imaginary) annotations in a graphics library:

    def move(x: int, y: int): ...

    def move(x: "distance along the left-right axis",
             y: "distance along the up-down axis"): ...

I think that the second would be far more useful in a library aimed at beginners. [...]
Guido has also suggested 'deprecating' non-type annotations. That would literally mean raising an error either when source is compiled or when def statements are executed.
Not necessarily. It could mean just documenting that we shouldn't use function annotations for anything other than specifying types. The usual procedure for deprecations is: * for at least one release, tell people not to do this, but don't raise a warning or an exception; * for at least one release, raise a warning but not an exception; * for at least one release, raise an exception "For at least one release" might mean "until Python 5000".
I think deprecation in this sense is both unwise, since non-type annotations were explicitly invited, and unnecessary for the purpose of favoring type annotations.
I whole-heartedly agree with this part! -- Steven
Le 15/08/2014 00:40, Nick Coghlan a écrit :
Having Argument Clinic generate appropriate annotations automatically could also be interesting.
That would be great actually. That's one of the things AC should eventually be able to bring to the table (thank you Larry :-)). Being able to access the concrete implementation (without the boxing / unboxing wrapper) could also be quite useful for people who try to shave off some interpretation overhead :-) Regards Antoine.
On Aug 14, 2014, at 4:56 PM, Guido van Rossum
There's also a variant of (1) that Łukasz Langa would like to see -- use the syntactic position of function annotations but using a custom syntax (e.g. one similar to the pytypedecl syntax) that isn't evaluated at function-definition time. This would have to use "from __future__ import <something>" for backward compatibility. I'm skeptical about this though; it is only slightly more elegant than mypy, and it would open the floodgates of unconstrained language design.
I see the decision has been made. For the curious, the design would be as close as possible to PEP 3107. The biggest wins would be first-class annotations for variables (supported in Mypy as comments) and support for forward-references without the need to fall back to strings. I’m also not a fan of the square brackets for generics (those brackets mean lookup!) but a BDFL once said that “language evolution is the art of compromise” and one cannot disagree with that.
So there you have it. I am picking the mypy family and I hope we can start focusing on specific improvements to mypy.
Alright! That sounds good.

With the syntax mostly out of the way, the next issue to handle is the typing module. dict, MutableMapping, and now Dict… One step too far. We should be able to re-use ABCs for that, e.g. to add support for union types and generics. Lots of decisions ahead (covariance, casting, multiple dispatch, etc.) but we’ll get there.

The typing module as aliases for the updated ABCs sounds like a fair compromise, although I’m worried about the details of that approach (for starters: we need to bundle the new ABCMeta to support union types but having both implementations at runtime feels very wrong). Moreover, with dynamic type registration and duck-typing isinstance(), there will be challenges to cover. I’m happy to go and slay that dragon.

As a side note, I’m happy you’re willing to agree on str | None. This reads really well and is concise enough to not require aliasing to be usable.

While we’re at slaying dragons, I’ll also silently make str non-iterable so that we can use Sequence[str] meaningfully from now on… How about that?

-- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Fri, Aug 15, 2014 at 12:48 AM, Łukasz Langa
On Aug 14, 2014, at 4:56 PM, Guido van Rossum
wrote: There's also a variant of (1) that Łukasz Langa would like to see -- use the syntactic position of function annotations but using a custom syntax (e.g. one similar to the pytypedecl syntax) that isn't evaluated at function-definition time. This would have to use "from __future__ import <something>" for backward compatibility. I'm skeptical about this though; it is only slightly more elegant than mypy, and it would open the floodgates of unconstrained language design.
I see the decision has been made. For the curious, the design would be as close as possible to PEP 3107. The biggest wins would be first-class annotations for variables (supported in Mypy as comments) and support for forward-references without the need to fall back to strings.
You can probably come up with a notation for first-class variable annotations, e.g.

    x: Sequence[int] = []

The value might be optional. The question is though, would the type (Sequence[int]) be stored anyway? Also, in a class body, does it define a class var or an instance var (or doesn't it matter?). Does this need a 'var' keyword to be unambiguous? I propose to disallow declaring multiple variables in this style, since it's hard to decide whether the comma should bind tighter than the '=' sign (as in assignments) or less tight (as in function headings).
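As it happens, the notation sketched here is essentially what later shipped as PEP 526 variable annotations (Python 3.6+), so the open questions can be demonstrated directly; a minimal illustration:

```python
from typing import Sequence

class C:
    x: Sequence[int] = []   # annotated assignment: type and value both given
    y: Sequence[int]        # bare annotation: the value is indeed optional

# Both annotations are recorded on the class, whether or not a value
# was assigned...
recorded = C.__annotations__
# ...but only x is actually bound as a class attribute.
has_y = hasattr(C, 'y')
```

Note that PEP 526 deliberately left the class-var vs. instance-var question to type checkers (typing.ClassVar), rather than deciding it at runtime.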
I’m also not a fan of the square brackets for generics (those brackets mean lookup!) but a BDFL once said that “language evolution is the art of compromise” and one cannot disagree with that.
So there you have it. I am picking the mypy family and I hope we can start focusing on specific improvements to mypy.
Alright! That sounds good. With the syntax mostly out of the way, the next issue to handle is the typing module. dict, MutableMapping, and now Dict… One step too far. We should be able to re-use ABCs for that, e.g. to add support for union types and generics. Lots of decisions ahead (covariance, casting, multiple dispatch, etc.) but we’ll get there.
The typing module as aliases for the updated ABCs sounds like a fair compromise, although I’m worried about the details of that approach (for starters: we need to bundle the new ABCMeta to support union types but having both implementations at runtime feels very wrong). Moreover, with dynamic type registration and duck typing isinstance(), there will be challenges to cover. I’m happy to go and slay that dragon.
As a side note, I’m happy you’re willing to agree on str | None. This reads really well and is concise enough to not require aliasing to be usable.
While we’re at slaying dragons, I’ll also silently make str non-iterable so that we can use Sequence[str] meaningfully from now on… How about that?
I hope you meant that as a joke. We missed our chance for that one with Python 3.0. We must live with it. -- --Guido van Rossum (python.org/~guido)
On Aug 15, 2014, at 8:17 AM, Guido van Rossum
You can probably come up with a notation for first-class variable annotations, e.g.
x: Sequence[int] = []
Yes, that syntax is out of scope for now, though, right? If I understand your reasoning behind choosing Mypy’s function annotation syntax, we don’t want to create programs that require Python 3.5+ just to be parsed. If we were to introduce first-class variable typing, yes, the syntax you propose is what I also had in mind.
The value might be optional. The question is though, would the type (Sequence[int]) be stored anyway? Also, in a class body, does it define a class var or an instance var (or doesn't it matter?).
I wouldn’t change the current behaviour:

    class C:
        cls_member: str = 'on the class'

        def __init__(self):
            self.obj_member: str = 'on the instance'
            self.cls_member = 2  # that's the real question: type error or an instance member?

For that last case, even though it’s currently valid Python, my intuition tells me for Mypy to treat it as an error.
Does this need a 'var' keyword to be unambiguous?
I fail to see any additional value provided by such a keyword. What would stop people from doing

    var i = 1

I don’t think we want to end up with that.
I propose to disallow declaring multiple variables in this style, since it's hard to decide whether the comma should bind tighter than the '=' sign (as in assignments) or less tight (as in function headings).
Right. I wonder if we even need this. For lines that use multiple assignment just for brevity, they can switch to multiple lines for typing. For common cases like:

    host, port = origin.rsplit(':', 1)
    successful, errors = query_the_world(hostnames)

I think the types can be easily inferred (assuming rsplit and query_the_world are annotated).
While we’re at slaying dragons, I’ll also silently make str non-iterable so that we can use Sequence[str] meaningfully from now on… How about that?
I hope you meant that as a joke. We missed our chance for that one with Python 3.0. We must live with it.
Yes, that was obviously just a joke. By the way, is the PEP number 4000 free? Asking for a friend. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Fri, Aug 15, 2014 at 12:24 PM, Łukasz Langa
On Aug 15, 2014, at 8:17 AM, Guido van Rossum
wrote: You can probably come up with a notation for first-class variable annotations, e.g.
x: Sequence[int] = []
Yes, that syntax is out of scope for now, though, right? If I understand your reasoning behind choosing Mypy’s function annotation syntax, we don’t want to create programs that require Python 3.5+ just to be parsed.
That's stronger than I meant it. The List<T> proposal would completely prevent the new typing syntax from being backported. Having to refrain from adding types for variables (or being required to use the inferior "magic comment" syntax) is a much smaller burden.
If we were to introduce first-class variable typing, yes, the syntax you propose is what I also had in mind.
It might be a separate PEP.
The value might be optional. The question is though, would the type (Sequence[int]) be stored anyway? Also, in a class body, does it define a class var or an instance var (or doesn't it matter?).
I wouldn’t change the current behaviour:
    class C:
        cls_member: str = 'on the class'

        def __init__(self):
            self.obj_member: str = 'on the instance'
            self.cls_member = 2  # that's the real question: type error or an instance member?
For that last case, even though it’s currently valid Python, my intuition tells me for Mypy to treat it as an error.
I disagree -- it's a very common idiom to set (immutable) default values on the class for what is meant to be an instance variable. This is why I called it out as a question we need to answer.
Does this need a 'var' keyword to be unambiguous?
I fail to see any additional value provided by such keyword. What would stop people from doing
var i = 1
I don’t think we want to end up with that.
In a different world it could be used to address the issue of typos going unnoticed, but I think it would be too big a departure from current Python practice.
I propose to disallow declaring multiple variables in this style, since it's hard to decide whether the comma should bind tighter than the '=' sign (as in assignments) or less tight (as in function headings).
Right. I wonder if we even need this. For lines that use multiple assignment just for brevity, they can switch to multiple lines for typing. For common cases like:
    host, port = origin.rsplit(':', 1)
    successful, errors = query_the_world(hostnames)
I think the types can be easily inferred (assuming rsplit and query_the_world are annotated).
Sure. It's just that people would be expecting it to work based on generalizations from other forms -- it's just that if you generalize from argument lists you end up with something different than when you generalize from assignment. I think it's reasonable to disallow

    a, b: Tuple[int, float] = 42, 3.14

but to allow

    (a, b): Tuple[int, float] = (42, 3.14)
While we’re at slaying dragons, I’ll also silently make str non-iterable
so that we can use Sequence[str] meaningfully from now on… How about that?
I hope you meant that as a joke. We missed our chance for that one with Python 3.0. We must live with it.
Yes, that was obviously just a joke. By the way, is the PEP number 4000 free? Asking for a friend.
I totally missed the joke. :-( -- --Guido van Rossum (python.org/~guido)
Łukasz Langa wrote:
    class C:
        cls_member: str = 'on the class'

        def __init__(self):
            self.obj_member: str = 'on the instance'
            self.cls_member = 2  # that's the real question: type error or an instance member?
I think not treating it as an error would make it hard to reason about the type of x.cls_member for an instance x of C. Its type would depend on whether del x.cls_member had been performed on x. Code which relied on them being different types would be rather confusing to a human reader too, so it's probably fine to discourage that. -- Greg
On Aug 15, 2014, at 5:26 PM, Greg Ewing
Łukasz Langa wrote:
    class C:
        cls_member: str = 'on the class'

        def __init__(self):
            self.obj_member: str = 'on the instance'
            self.cls_member = 2  # that's the real question: type error or an instance member?
I think not treating it as an error would make it hard to reason about the type of x.cls_member for an instance x of C. Its type would depend on whether del x.cls_member had been performed on x.
Code which relied on them being different types would be rather confusing to a human reader too, so it's probably fine to discourage that.
That was my reasoning exactly. +1 -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Fri, Aug 15, 2014 at 9:48 AM, Łukasz Langa
As a side note, I’m happy you’re willing to agree on str | None. This reads really well and is concise enough to not require aliasing to be usable.
The common use is not all that concise:

    def foo(bar: int | None = None): pass

Or alternatively it could be:

    def foo(bar: int = None): pass

if the default was automatically allowed. Also... Does None magically mean NoneType in type definitions?
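The "default None implies optional" shorthand Petr suggests can be sketched with a small helper; implied_type is an invented name for illustration, not part of any proposal:

```python
from typing import Optional, Union

def implied_type(annotation, default):
    """If the default is None, widen the declared type to include None."""
    if default is None:
        # bar: int = None would be read as bar: int | None
        return Optional[annotation]
    return annotation

widened = implied_type(int, None)   # Optional[int], i.e. Union[int, None]
unchanged = implied_type(int, 0)    # default is not None: type stays int
```

This is roughly the "implicit optional" behaviour that some checkers have offered for parameters whose default is None.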
On Aug 15, 2014 8:43 AM, "Petr Viktorin"
On Fri, Aug 15, 2014 at 9:48 AM, Łukasz Langa
wrote: ... As a side note, I’m happy you’re willing to agree on str | None. This
reads
really well and is concise enough to not require aliasing to be usable.
The common use is not all that concise: def foo(bar: int | None=None): pass
Or alternatively it could be: def foo(bar: int=None): pass if the default was automatically allowed.
Good idea.
Also... Does None magically mean NoneType in type definitions?
Yes. --Guido
On Fri, Aug 15, 2014 at 5:55 PM, Guido van Rossum
Also... Does None magically mean NoneType in type definitions?
Yes.
This would mean either that `(None | None) is None`, or that (x | None) is not always "optional x". And if type objects grow any other common functionality, None will have to support that as well.
On Fri, Aug 15, 2014 at 9:48 AM, Petr Viktorin
On Fri, Aug 15, 2014 at 5:55 PM, Guido van Rossum
wrote: ... Also... Does None magically mean NoneType in type definitions?
Yes.
This would mean either that `(None | None) is None`, or that (x | None) is not always "optional x". And if type objects grow any other common functionality, None will have to support that as well.
Perhaps None itself should not implement any of this, and the __ror__ method on ABCs should implement it. That way, None|Mapping and Mapping|None would both work, yet None|None would still be the TypeError it is today. -- --Guido van Rossum (python.org/~guido)
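Guido's __or__/__ror__ idea can be sketched with a toy metaclass; the tuple-based union representation and the class names below are purely illustrative:

```python
class UnionMeta(type):
    """Toy metaclass: classes using it support the | operator for unions."""
    def __or__(cls, other):
        # Treat the value None as shorthand for NoneType.
        right = type(None) if other is None else other
        return ('Union', cls, right)

    def __ror__(cls, other):
        # Called when the *left* operand (e.g. None) doesn't implement |.
        left = type(None) if other is None else other
        return ('Union', left, cls)

class Mapping(metaclass=UnionMeta):
    pass
```

Mapping | None works via __or__, None | Mapping works via __ror__ (since NoneType defines neither hook, Python falls back to the right operand), and None | None remains the TypeError it is today.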
On Fri, Aug 15, 2014 at 7:00 PM, Guido van Rossum
On Fri, Aug 15, 2014 at 9:48 AM, Petr Viktorin
wrote: On Fri, Aug 15, 2014 at 5:55 PM, Guido van Rossum
wrote: ... Also... Does None magically mean NoneType in type definitions?
Yes.
This would mean either that `(None | None) is None`, or that (x | None) is not always "optional x". And if type objects grow any other common functionality, None will have to support that as well.
Perhaps None itself should not implement any of this, and the __ror__ method on ABCs should implement it. That way, None|Mapping and Mapping|None would both work, yet None|None would still be the TypeError it is today.
... and that (x|None) does not always mean "optional x". Is this case special enough?
On Fri, Aug 15, 2014 at 10:19 AM, Petr Viktorin
On Fri, Aug 15, 2014 at 7:00 PM, Guido van Rossum
wrote: On Fri, Aug 15, 2014 at 9:48 AM, Petr Viktorin
wrote: On Fri, Aug 15, 2014 at 5:55 PM, Guido van Rossum
wrote: ... Also... Does None magically mean NoneType in type definitions?
Yes.
This would mean either that `(None | None) is None`, or that (x | None) is not always "optional x". And if type objects grow any other common functionality, None will have to support that as well.
Perhaps None itself should not implement any of this, and the __ror__ method on ABCs should implement it. That way, None|Mapping and Mapping|None would both work, yet None|None would still be the TypeError it is today.
... and that (x|None) does not always mean "optional x". Is this case special enough?
I'm not following. The proposal seems to be to add __or__ and __ror__ methods to type itself requiring the other argument to be also a type, or the special case None (which is a value, not a type). -- --Guido van Rossum (python.org/~guido)
On Fri, Aug 15, 2014 at 7:36 PM, Guido van Rossum
On Fri, Aug 15, 2014 at 10:19 AM, Petr Viktorin
wrote: On Fri, Aug 15, 2014 at 7:00 PM, Guido van Rossum
wrote: On Fri, Aug 15, 2014 at 9:48 AM, Petr Viktorin
wrote: On Fri, Aug 15, 2014 at 5:55 PM, Guido van Rossum
wrote: ... Also... Does None magically mean NoneType in type definitions?
Yes.
This would mean either that `(None | None) is None`, or that (x | None) is not always "optional x". And if type objects grow any other common functionality, None will have to support that as well.
Perhaps None itself should not implement any of this, and the __ror__ method on ABCs should implement it. That way, None|Mapping and Mapping|None would both work, yet None|None would still be the TypeError it is today.
... and that (x|None) does not always mean "optional x". Is this case special enough?
I'm not following. The proposal seems to be to add __or__ and __ror__ methods to type itself requiring the other argument to be also a type, or the special case None (which is a value, not a type).
My concern is that if someone does programmatic type declaration manipulation/generation, there's now a special case to keep in mind. Instead of

    def optional(t):
        return t | None

it's now:

    def optional(t):
        if t is None:
            return t
        else:
            return t | None

because unlike other type declarations, None doesn't have __or__, or any other operation that types will gain in the future as this proposal matures.

But maybe this will never be a valid use case?
Doesn't seem a big deal. The only place where you'd see an implementation of optional() would be in typing.py, and optional(None) is redundant anyway.
On Fri, Aug 15, 2014 at 10:51 AM, Petr Viktorin
On Fri, Aug 15, 2014 at 7:36 PM, Guido van Rossum
wrote: On Fri, Aug 15, 2014 at 10:19 AM, Petr Viktorin
wrote: On Fri, Aug 15, 2014 at 7:00 PM, Guido van Rossum
wrote: On Fri, Aug 15, 2014 at 9:48 AM, Petr Viktorin
wrote: On Fri, Aug 15, 2014 at 5:55 PM, Guido van Rossum
wrote: ... Also... Does None magically mean NoneType in type definitions?
Yes.
This would mean either that `(None | None) is None`, or that (x | None) is not always "optional x". And if type objects grow any other common functionality, None will have to support that as well.
Perhaps None itself should not implement any of this, and the __ror__ method on ABCs should implement it. That way, None|Mapping and Mapping|None would both work, yet None|None would still be the TypeError it is today.
... and that (x|None) does not always mean "optional x". Is this case special enough?
I'm not following. The proposal seems to be to add __or__ and __ror__ methods to type itself requiring the other argument to be also a type, or the special case None (which is a value, not a type).
My concern is that if someone does programmatic type declaration manipulation/generation, there's now a special case to keep in mind. Instead of

    def optional(t):
        return t | None

it's now:

    def optional(t):
        if t is None:
            return t
        else:
            return t | None

because unlike other type declarations, None doesn't have __or__, or any other operation that types will gain in the future as this proposal matures.

But maybe this will never be a valid use case?
-- --Guido van Rossum (python.org/~guido)
Hi all,

Has the syntax for specifying type been fully decided on already?

Using brackets may confuse new Python programmers. Since specifying type in Python is fairly new anyway, what do you all think of introducing angle brackets into Python instead? Other languages use angle brackets to specify types. It provides a good separation between type specification and list indexing.

I'm also worried that using square brackets will cause confusion as that notation is generally associated with array declarations in other languages. Even in Python, MyClass[int] may be confused with getting a key called int from some MyClass. dict<str, int> seems to tell me more explicitly that I'm dealing with a declaration of an expected type. dict[str, int] looks very much like I'm getting an item (str, int) from some class.
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Aug 15, 2014, at 11:43 AM, Sunjay Varma
Hi all, Has the syntax for specifying type been fully decided on already?
Using brackets may confuse new Python programmers. (…) Other languages use angle brackets to specify types.
I also agree that angle brackets would be nicer. Guido decided against it for pragmatic reasons:

1. angle brackets would create Python source code incompatible with any version lower than 3.5
2. angle brackets would complicate the lexer (normally you expect < and > to be spaced; in this case they wouldn’t be)
3. angle brackets would require a new mechanism in Python to store this kind of expression within the type; this is still true for generics expressed with square brackets, but at least you can use the existing nuts and bolts of Python classes

All in all, this is more trouble than it’s worth. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
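Point 2 is worth spelling out: < and > already parse as chainable comparison operators, so a dict<str, int>-style spelling would be ambiguous with expressions that are legal today. A quick demonstration:

```python
a, b, c = 1, 2, 3

# a<b>c already has a meaning: it is the chained comparison
# (a < b) and (b > c), not a parameterized type.
result = a<b>c
```

Here 1 < 2 is true but 2 > 3 is false, so the whole chain is False; a lexer would have to disambiguate this from a generic, which is exactly the complication Guido wanted to avoid.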
On Aug 15, 2014 2:55 PM, "Łukasz Langa"
On Aug 15, 2014, at 11:43 AM, Sunjay Varma
wrote: Hi all, Has the syntax for specifying type been fully decided on already?
Using brackets may confuse new Python programmers. (…) Other languages
use angle brackets to specify types.
I also agree that angle brackets would be nicer. Guido decided against it
for pragmatic reasons:
1. angle brackets would create Python source code incompatible with any version lower than 3.5
I'm all for compatibility, but Python 3 already breaks compatibility with Python 2. Why not add this feature in Python 4 (or whatever the next breaking release is) and do it "right" the first time. I don't think it makes sense to start muddling up the different semantic meanings of Python's operations just because we think it will break in an older version. This is such a big and important change. It deserves its own syntax (and if necessary a new version number as well).
2. angle brackets would complicate the lexer (normally you expect < and > to be spaced, in this case it wouldn’t) 3. angle brackets would require a new mechanism in Python to store this kind of expression within the type; this is still true for generics expressed with square brackets but at least you can use the existing nuts and bolts of Python classes
Angle brackets were just a suggestion as they are used frequently by other languages. Even braces would be more appropriate as they're already built into the lexer and dict{int, str} clearly means something different than dict[int, str].
All in all, this is more trouble than it’s worth.
I can understand that it's easier to use what's already there, but I don't agree with doing something just because it's easier. Especially when the side effects are not at all appealing. Sunjay
On Fri, Aug 15, 2014 at 12:11 PM, Sunjay Varma
On Aug 15, 2014 2:55 PM, "Łukasz Langa"
wrote: On Aug 15, 2014, at 11:43 AM, Sunjay Varma
wrote:
Hi all, Has the syntax for specifying type been fully decided on already?
Using brackets may confuse new Python programmers. (…) Other languages
use angle brackets to specify types.
I also agree that angle brackets would be nicer. Guido decided against it for pragmatic reasons: 1. angle brackets would create Python source code incompatible with any version lower than 3.5
I'm all for compatibility, but Python 3 already breaks compatibility with Python 2. Why not add this feature in Python 4 (or whatever the next breaking release is) and do it "right" the first time. I don't think it makes sense to start muddling up the different semantic meanings of Python's operations just because we think it will break in an older version.
There won't *be* a "next breaking release".
This is such a big and important change. It deserves its own syntax (and if necessary a new version number as well).
2. angle brackets would complicate the lexer (normally you expect < and > to be spaced, in this case it wouldn’t) 3. angle brackets would require a new mechanism in Python to store this kind of expression within the type; this is still true for generics expressed with square brackets but at least you can use the existing nuts and bolts of Python classes
Angle brackets were just a suggestion as they are used frequently by other languages. Even braces would be more appropriate as they're already built into the lexer and dict{int, str} clearly means something different than dict[int, str].
All in all, this is more trouble than it’s worth.
I can understand that it's easier to use what's already there, but I don't agree with doing something just because it's easier. Especially when the side effects are not at all appealing.
I don't think you quite appreciate the art of language evolution. -- --Guido van Rossum (python.org/~guido)
On Aug 15, 2014, at 1:43 PM, Sunjay Varma
wrote: Using brackets may confuse new Python programmers. Since specifying type in Python is fairly new anyway, what do you all think of introducing angle brackets into Python instead? Other languages use angle brackets to specify types. It provides a good separation between type specification and list indexing.
Angle brackets already have meaning in Python, as comparison operators. The current surrounding operators ([], (), {}) require a matched pair in all cases. Breaking that rule would be confusing, though I know there are languages that do that.
I'm also worried that using square brackets will cause confusion as that notation is generally associated with array declarations in other languages. Even in Python, MyClass[int] may be confused with getting a key called int from some MyClass.
dict<str, int> seems to tell me more explicitly that I'm dealing with a declaration of an expected type. dict[str, int] looks very much like I'm getting an item (str, int) from some class.
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that. I think that the parallel between item access and item declaration is a great argument in favor of using the dict[str, int] (or perhaps dict[str: int]) syntax as a type declaration. Ryan
On 08/15/2014 11:56 AM, Ryan Hiebert wrote:
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that.
--> class Foo(Enum):
...     spam = 'meat flavored'
...     eggs = 'chicken by-product'
...
--> Foo
<enum 'Foo'>
--> Foo['spam']
<Foo.spam: 'meat flavored'>
On Aug 15, 2014, at 12:06 PM, Ethan Furman
On 08/15/2014 11:56 AM, Ryan Hiebert wrote:
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that.
--> class Foo(Enum):
...     spam = 'meat flavored'
...     eggs = 'chicken by-product'
...
--> Foo
<enum 'Foo'>
--> Foo['spam']
<Foo.spam: 'meat flavored'>
I also thought of enums. Looks fairly innocent to me, though. Do you see any cases where the two would conflict? -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
Another such example, since names are just names:
dict = {"a": 2}
print(dict["a"]) # 2
Overwriting a built-in type name is bad, but entirely possible. dict["a"] here is then easy to confuse with dict[str], and this kind of use could also throw off a type linter.
These probably aren't the best examples out there, but I can definitely see
this operator's meaning becoming very confused as more people start to
apply it in different ways.
We should not be just using something because it's there. Especially if it
causes other problems.
list[str] may be valid syntax in old Python 3 versions, but it's still not
going to be correct if used in those versions. You're going to get some
breakage no matter what.
This feature is very new to Python as a whole, why not give it a syntax
that provides a proper separation from what already was?
Sunjay
On Aug 15, 2014 3:14 PM, "Łukasz Langa"
On Aug 15, 2014, at 12:06 PM, Ethan Furman
wrote: On 08/15/2014 11:56 AM, Ryan Hiebert wrote:
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that.
--> class Foo(Enum): ... spam = 'meat flavored' ... eggs = 'chicken by-product' ... --> Foo
--> Foo['spam']
I also thought of enums. Looks fairly innocent to me, though. Do you see any cases where the two would conflict?
-- Best regards, Łukasz Langa
WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
This feature is very new to Python as a whole, why not give it a syntax that provides a proper separation from what already was?
Mainly because angle brackets are one of C++'s worst mistakes. Well, worst among many other equally-worst things, but it's pretty bad. Say goodbye to your simple LL(1) parser!

http://stackoverflow.com/questions/7304699/c-templates-angle-brackets-proble...
http://blog.aaronballman.com/2011/05/semantic-whitespace/
http://stackoverflow.com/questions/15785496/c-templates-angle-brackets-pitfa....
On Fri, Aug 15, 2014 at 12:25 PM, Sunjay Varma
Another such example, since names are just names:
dict = {"a": 2} print(dict["a"]) # 2
Overwriting a built in type name is bad, but entirely possible.
dict["a"] here is also confusing with dict[str]. This kind of use could also potentially throw off a type linter too.
These probably aren't the best examples out there, but I can definitely see this operator's meaning becoming very confused as more people start to apply it in different ways.
We should not be just using something because it's there. Especially if it causes other problems. list[str] may be valid syntax in old Python 3 versions, but it's still not going to be correct if used in those versions. You're going to get some breakage no matter what.
This feature is very new to Python as a whole, why not give it a syntax that provides a proper separation from what already was?
Sunjay On Aug 15, 2014 3:14 PM, "Łukasz Langa"
wrote: On Aug 15, 2014, at 12:06 PM, Ethan Furman
wrote: On 08/15/2014 11:56 AM, Ryan Hiebert wrote:
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that.
--> class Foo(Enum): ... spam = 'meat flavored' ... eggs = 'chicken by-product' ... --> Foo
--> Foo['spam']
I also thought of enums. Looks fairly innocent to me, though. Do you see any cases where the two would conflict?
-- Best regards, Łukasz Langa
WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Aug 15, 2014, at 2:06 PM, Ethan Furman
wrote: On 08/15/2014 11:56 AM, Ryan Hiebert wrote:
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that.
--> class Foo(Enum): ... spam = 'meat flavored' ... eggs = 'chicken by-product' ... --> Foo
--> Foo['spam']
Well thanks for that ;-) I don’t think it would conflict in this case, since I don’t think there’d be a reason to mix Enum item access with Type item access. Especially because the Enum itself might be used in the type signature (though not item access on the Enum, I’d think). I think it’s enough of an edge-case that I still like the parallel it provides, but I appreciate the reality check.
Ryan Hiebert wrote:
I don’t think it would conflict in this case, since I don’t think there’d be a reason to mix Enum item access with Type item access.
It would conflict if you somehow needed to define an Enum subclass with type parameters. I'm having trouble thinking of a use case for such a thing, though. -- Greg
On Fri, Aug 15, 2014 at 12:06 PM, Ethan Furman
On 08/15/2014 11:56 AM, Ryan Hiebert wrote:
Getting an item from a class has no meaning for any classes that I’ve ever used, and I haven’t come up with any hypothetical one that would want to do that.
--> class Foo(Enum): ... spam = 'meat flavored' ... eggs = 'chicken by-product' ... --> Foo
--> Foo['spam']
That's a little unfortunate, but I don't think it's harmful, as I don't expect a use case for parametrized enums. :-) -- --Guido van Rossum (python.org/~guido)
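For reference, the behavior Ethan demonstrates is just Enum's ordinary name lookup on the class itself; a minimal self-contained version of his example:

```python
from enum import Enum

class Foo(Enum):
    spam = 'meat flavored'
    eggs = 'chicken by-product'

# Item access on the *class* is already meaningful for enums:
# it looks a member up by name.
print(Foo['spam'].name)    # spam
print(Foo['spam'].value)   # meat flavored
```

This is exactly the existing class-level `[]` use that a parametrized-type syntax would have to coexist with.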
On Fri, Aug 15, 2014 at 8:43 PM, Sunjay Varma
Hi all, Has the syntax for specifying type been fully decided on already?
Using brackets may confuse new Python programmers. Since specifying type in Python is fairly new anyway, what do you all think of introducing angle brackets into Python instead? Other languages use angle brackets to specify types. It provides a good separation between type specification and list indexing.
I'm also worried that using square brackets will cause confusion as that notation is generally associated with array declarations in other languages. Even in Python, MyClass[int] may be confused with getting a key called int from some MyClass.
dict<str, int> seems to tell me more explicitly that I'm dealing with a declaration of an expected type. dict[str, int] looks very much like I'm getting an item (str, int) from some class. The angle bracket (or any other suggestions you have in mind) provides a more concrete separation between when we are performing item indexing and when we're specifying a type to validate.
Square brackets have the advantage of being valid Python now, so typed code would be backwards compatible. If the syntax was to change, what about a new operator? def sum(seq: Iterable of Number, start: Number): def print_grades(p: Mapping of (Student, Grade)): Just an idea.
On Fri, Aug 15, 2014 at 1:43 PM, Sunjay Varma
Using brackets may confuse new Python programmers. Since specifying type in Python is fairly new anyway, what do you all think of introducing angle brackets into Python instead?
Is this a facility which new programmers are likely to encounter right off the bat or is it going to mostly be buried from casual view? The use of paired angle brackets has been suggested over the years for other purposes in Python. I no longer recall the arguments against them, but ISTR issues with grammar complexity and syntax highlighting in editors. Still, since C++ somehow managed to use them that way, perhaps all the various tools which might be exposed to them have been fixed by now. Skip
Sunjay Varma wrote:
dict<str, int> seems to tell me more explicitly that I'm dealing with a declaration of an expected type.
In addition to my earlier objections to angle brackets,
there would be a big problem with parsing this notation
in Python.
In languages that use syntax like this, there is a clear
division between type descriptions and expressions --
they belong to completely separate areas of the grammar.
However, we need to be able to parse our type descriptions
as expressions, because they *are* expressions, just like
any other.
Now consider:

dict<str, int>

Is that a type description, or a pair of comparison expressions? As an expression it already has a meaning today, so the parser has no way to tell which was intended.
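The ambiguity Greg describes can be checked mechanically: the angle-bracket form is already valid Python today, and the parser reads it as chained comparisons, not as a type application. A small demonstration with the stdlib ast module:

```python
import ast

# "dict<str, int>(pairs)" is already a legal Python expression: a
# tuple of two comparisons, (dict < str) and (int > (pairs)).
tree = ast.parse("dict<str, int>(pairs)")
expr = tree.body[0].value

print(type(expr).__name__)                    # Tuple
print([type(e).__name__ for e in expr.elts])  # ['Compare', 'Compare']
```

So adopting angle brackets would change the meaning of code that parses fine today, exactly the grammar problem raised elsewhere in the thread.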
Sunjay Varma wrote:
dict[str, int] looks very much like I'm getting an item (str, int) from some class.
If you consider that 'dict' on its own represents a set of possible types, then it's not unreasonable that 'dict[str, int]' selects one of the types from that set. In other words, dict is a lazy collection of types. -- Greg
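Greg's "lazy collection of types" reading can even be prototyped with a plain metaclass. Everything below (Parametrizable, TypedDict) is a made-up illustration, not part of any proposal:

```python
# Hypothetical sketch: item access on a class selects a parametrized
# variant, treating the class as a lazy collection of types.
class Parametrizable(type):
    def __getitem__(cls, params):
        if not isinstance(params, tuple):
            params = (params,)
        name = '%s[%s]' % (cls.__name__,
                           ', '.join(p.__name__ for p in params))
        # Build a subclass that remembers its parameters.
        return type(cls)(name, (cls,), {'__parameters__': params})

class TypedDict(dict, metaclass=Parametrizable):
    pass

D = TypedDict[str, int]
print(D.__name__)         # TypedDict[str, int]
print(D.__parameters__)   # (<class 'str'>, <class 'int'>)
```

The point is only that the square-bracket spelling needs no new syntax: `__getitem__` on the metaclass already supports it.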
Petr Viktorin wrote:
Instead of

def optional(t):
    return t | None

it's now:

def optional(t):
    if t is None:
        return t
    else:
        return t | None
I don't think so. Code that's manipulating types shouldn't be using None as a stand-in for NoneType in the first place. Think of it this way: None is *not* a type, just a special-case value of x in "<type> | x". So optional(None) is an error, just as much as optional(42) would be. -- Greg
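A toy rendering of the distinction Greg draws; the Union class here is a made-up stand-in for whatever "<type> | x" would build, not anything proposed in the thread:

```python
NoneType = type(None)

class Union:
    # Minimal stand-in for "<type> | <type>": operands must be types.
    def __init__(self, *types):
        for t in types:
            if not isinstance(t, type):
                raise TypeError('not a type: %r' % (t,))
        self.types = types

def optional(t):
    # NoneType is the type; None is just a value of that type.
    return Union(t, NoneType)

print(optional(int).types)   # (<class 'int'>, <class 'NoneType'>)
try:
    optional(None)           # an error, just like optional(42)
except TypeError as e:
    print(e)                 # not a type: None
```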
On 15.08.2014 17:42, Petr Viktorin wrote:
The common use is not all that concise: def foo(bar: int | None=None): pass
Or alternatively it could be: def foo(bar: int=None): pass if the default was automatically allowed.
(Assuming you mean "the type of the default") While I like the second form a bit more, it kinda goes against "explicit is better than implicit". Also, if I change the default value from None to 42, I've either changed the allowable types, or need to remember to turn "bar: int=None" into "bar: int|None = 42". Furthermore, what should happen in the following case: # no annotations here def foo(): ... def bar(evil: int = foo()): ... Should this be disallowed, as the type checker will not be able to know what type foo is? Should it just assume int? And in the latter case, what should happen if foo now gains a "-> float" annotation?
On Fri, Aug 15, 2014 at 6:43 PM, Dennis Brakhane
On 15.08.2014 17:42, Petr Viktorin wrote:
The common use is not all that concise: def foo(bar: int | None=None): pass
Or alternatively it could be: def foo(bar: int=None): pass if the default was automatically allowed.
(Assuming you mean "the type of the default")
I really meant *only* the default. This really only works for None, but that's a good thing, since something like: def foo(bar:int=''): pass looks very suspicious. I'd be fine with the linter complaining about foo('hello'). Of course you can always do: def foo(bar: (int | str)=''): pass
While I like the second form a bit more, it kinda goes against "explicit is better than implicit".
Also, if I change the default value from None to 42, I've either changed the allowable types, or need to remember to turn "bar: int=None" into "bar: int|None = 42".
Furthermore, what should happen in the following case:
# no annotations here def foo(): ...
def bar(evil: int = foo()): ...
Should this be disallowed, as the type checker will not be able to know what type foo is? Should it just assume int? And in the latter case, what should happen if foo now gains a "-> float" annotation?
If your linter can't figure it out, just specify the default's type explicitly. Always a good thing to do when something's not immediately obvious.
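The rule being debated -- accept the annotated type, plus None as a special-cased default -- is easy to sketch as a lint check. check_defaults below is hypothetical, not a feature of mypy or any real linter:

```python
import inspect

def check_defaults(func):
    """Flag parameters whose default matches neither the annotation
    nor the implicit-None exception discussed above."""
    problems = []
    for name, param in inspect.signature(func).parameters.items():
        ann, default = param.annotation, param.default
        if ann is param.empty or default is param.empty:
            continue
        if default is None:          # the proposed implicit exception
            continue
        if isinstance(ann, type) and not isinstance(default, ann):
            problems.append(name)
    return problems

def foo(bar: int = ''):   # the "very suspicious" case
    pass

def ok(bar: int = None):  # allowed under the proposed rule
    pass

print(check_defaults(foo))   # ['bar']
print(check_defaults(ok))    # []
```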
So, if how many things you must annotate depends on which checker you will use, how does putting those typing-information stub objects into the standard library help anyone?
On 15 August 2014 19:11, Petr Viktorin
On Fri, Aug 15, 2014 at 6:43 PM, Dennis Brakhane
wrote: On 15.08.2014 17:42, Petr Viktorin wrote:
The common use is not all that concise: def foo(bar: int | None=None): pass
Or alternatively it could be: def foo(bar: int=None): pass if the default was automatically allowed.
(Assuming you mean "the type of the default")
I really meant *only* the default. This really only works for None, but that's a good thing, since something like: def foo(bar:int=''): pass looks very suspicious. I'd be fine with the linter complaining about foo('hello').
Of course you can always do: def foo(bar: (int | str)=''): pass
While I like the second form a bit more, it kinda goes against "explicit is better than implicit".
Also, if I change the default value from None to 42, I've either changed the allowable types, or need to remember to turn "bar: int=None" into "bar: int|None = 42".
Furthermore, what should happen in the following case:
# no annotations here def foo(): ...
def bar(evil: int = foo()): ...
Should this be disallowed, as the type checker will not be able to know what type foo is? Should it just assume int? And in the latter case, what should happen if foo now gains a "-> float" annotation?
If your linter can't figure it out, just specify the default's type explicitly. Always a good thing to do when something's not immediately obvious.
Łukasz Langa wrote:
I’m also not a fan of the square brackets for generics (those brackets mean lookup!) but a BDFL once said that “language evolution is the art of compromise” and one cannot disagree with that.
I think there's sufficient precedent in other languages to justify using square brackets that way. I much prefer them over the angle brackets of C++ and Java, which are ugly to my eyes and make the code hard to read. They also make parsing tricky if you have "<<" and ">>" operators. -- Greg
Am 15.08.2014 01:56, schrieb Guido van Rossum:
PS. I realize I didn't discuss question (C) much. That's intentional -- we can now start discussing specific mypy features in separate threads (or in this one :-).
So is this the place to discuss "the thornier issues brought up against mypy"? Because I think it's important that we get an idea of what we want mypy to be able to do and what not. After all, mypy will probably end up as a kind of reference implementation for static type checkers. And I'm worried there might be real damage to Python as a language if they aren't thought through:
Many people have shown either support for the idea, or pointed to some other system that addresses the same issue. On the other hand, several people have claimed that they don't need it, or that they worry it will make Python less useful for them. (However, many of the detractors seem to have their own alternative proposal. :-) In the end I don't think we can ever know for sure -- but my intuition tells me that as long as we keep it optional, there is a real demand.
It won't be optional for programmers who work in a corporate environment where mypy happens to be required. Those teams will probably also use third party libraries, and they might want them to be "type safe" as well and file RFEs for it; in a few years, we might end up with a situation where "serious" code is expected to provide static type info.

Even if mypy is "just a linter", a linter is supposed to find bugs and promote best practices. And if the Python reference doc somehow named mypy as an example for a static type checker, it will probably be seen as enforcing Python best practices. I really think there's a good chance/risk that mypy will change how Python programs are written in the future. For example, people probably wouldn't call the word_count method with a file object and write a test case; instead, they will read the file into a list and call it with that. After all, the linter would complain otherwise.

IMO, the question is how much "staticness" we want to encourage. If we want a really useful and flexible static type checking system, we would need a very complex type system. If we'd go that route, I fear that one of Python's main features, its dynamic nature, will be seen by new programmers as a kind of "deprecated legacy", turning Python into some poor man's Scala. If we take the current mypy approach, the type system would probably end up a lot like Java's: useful for simple cases, and useless/a PITA otherwise. And if we make the system minimal by design, there's a good chance that it will be completely useless for static type checking for all but the most trivial cases, helping no one.

I'm not sure which one I prefer, although I'm leaning in the minimal-to-Java-level direction; that way, the type system might be "just bad enough" for people to see when dynamic typing is an advantage and use that instead. One thing where I do have a clear opinion is that it should/must be possible to override (bad) type annotations.
Providing a separate file seems ok to me. But the open problems with this approach are 1) how to tell IDEs and linters which overrides are in effect when and where (there might be different overrides for different modules), and 2) how we should handle different versions of libraries.

As an example for 2), let's say I'm the author of the frobnicate library, which depends on spammatron 0.3 from pip. Let's also say that -- because it's a good idea -- mypy will check override modules for consistency: an override cannot declare a completely different signature than the original. Spammatron's author declared "def foo(x: float) -> float", but actually it should have been "Number" instead of float, as I'm using Decimal. So I define an override module to fix it.

Now spammatron 0.3.1 is released, and foo has gained an additional optional parameter: "def foo(x: float, y: float = 0) -> float". So I have to update my override module, but then my library isn't compatible with 0.3 anymore; or at least will give linter errors. Providing both overrides would be very difficult; the static analyser would have to know which version of the library it is using. I suppose it could look at the __version__ attribute, but what if it's missing?
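To make the versioning problem concrete, here is a tiny simulation of picking an override stub by library version; spammatron, the version numbers, and the signatures are all invented for the example:

```python
import types

# Stand-in for an installed third-party module.
spammatron = types.ModuleType('spammatron')
spammatron.__version__ = '0.3.1'

# One override (stub) signature per supported release.
OVERRIDES = {
    '0.3':   'def foo(x: Number) -> float',
    '0.3.1': 'def foo(x: Number, y: float = 0) -> float',
}

version = getattr(spammatron, '__version__', None)
if version is None:
    # Exactly the failure mode raised above: without __version__ the
    # analyser cannot choose between the overrides.
    raise RuntimeError('cannot select an override stub')
print(OVERRIDES[version])   # def foo(x: Number, y: float = 0) -> float
```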
On Fri, Aug 15, 2014 at 02:37:45PM +0200, Dennis Brakhane wrote:
I really think there's a good chance/risk that mypy will change how Python programs are written in the future.
I certainly hope so. We're wasting our time if it doesn't. Why go through all the time and effort if nobody uses it?
For example, people wouldn't probably call the word_count method with a file object and write a test case, instead, they will read the file into a list and call it instead. After all, the linter would complain otherwise.
How do you know the linter will complain? Many of the arguments against this proposal are based on the assumption that, given a type checker for Python, developers will suddenly abandon all the proven advantages of duck-typing and dynamic typing and rush to turn Python into a third-rate Java. I don't think this is a realistic fear. The mere fact we are having this argument proves that many Python developers will fight tooth and nail to keep using duck-typing and dynamic typing. If they do static type checks, they aren't going to give up those advantages. They'll check for Iterable, not list. And those who don't? Do the same thing you would do *right now* when those authors write code like this: if type(argument) is not list: raise TypeError("list expected") Report it as a bug, or request a feature enhancement. Patch the library. Use a different library. Or, *just don't use the linter*.
IMO, the question is how much "staticness" we want to encourage. If we want a really useful and flexible static type checking system, we would need a very complex type system. If we'd go that route, I fear that one of Pythons main features, its dynamic nature will be seen by new programmers as kind of "deprecated legacy", and turning Python into some poor-man's-Scala.
Static typing and dynamic typing are *not* opposites. The names are unfortunate, because they imply an opposition that doesn't necessarily exist. Both static and dynamic typing are attempts to solve certain problems in programming, and it is possible to do both at the same time. [...]
One thing where I do have a clear opinion is that it should/must be possible to override (bad) type annotations.
Why? Do you consider it a "must" to override functions that raise TypeError at run time? py> len(None) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: object of type 'NoneType' has no len() So why do you consider it a "must" to be able to override functions that report type errors at compile time? But for what it's worth, as this proposal has repeatedly said, the linter is optional. If you don't believe the linter, *just run the code* in Python like you have always done before. -- Steven
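The contrast Steven draws -- an exact-type check versus annotating with a duck-typed ABC -- side by side. word_count here is a sketch of the hypothetical function discussed earlier in the thread, not code from any library:

```python
from collections.abc import Iterable

def word_count_strict(lines):
    # The brittle style Steven quotes: rejects files, generators, tuples.
    if type(lines) is not list:
        raise TypeError("list expected")
    return word_count_duck(lines)

def word_count_duck(lines: Iterable):
    # Annotating with Iterable keeps duck typing intact: any iterable
    # of splittable lines is accepted, including an open file.
    counts = {}
    for line in lines:
        for word in line.split():
            counts[word] = counts.get(word, 0) + 1
    return counts

print(word_count_duck(iter(["spam spam", "eggs"])))  # {'spam': 2, 'eggs': 1}
```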
On 15.08.2014 16:38, Steven D'Aprano wrote:
How do you know the linter will complain?
Because a file object is not a list of strings, which is what word_count incorrectly declares it needs. (It actually requires an iterable of things that can be split()ed, and doesn't return a Dict[str, int] but a Dict[return_type_of_split(), int].)
So why do you consider it a "must" to be able to override functions that report type errors at compile time?
The difference in the latter case is that the code runs perfectly fine and correctly. It's just that the linter implies you've done something bad. Using isinstance checks is kinda frowned upon; using type annotations will probably be considered totally acceptable practice (if not, what's the point of this proposal). It's much harder to argue with the original author that it's a bug in the latter case. Furthermore, the author might not want to lose the type requirement, because he doesn't want to guarantee those semantics, for example. If I'm willing to take that risk (and have a test case in my code), why shouldn't I be allowed to silence those errors in a simple way that doesn't require casting at every method call?

To me, one of the things that sets Python apart from other languages is the fact that code will and can be used in a way the original author might not have thought of. Giving the original author of methods the means to dictate what types are acceptable, with no clean and simple way of overriding it, just feels Java-esque to me. (Again, isinstance checks are considered a bad practice, but I doubt declaring too-restrictive types will be.) I do not want to be forced to litter my code with casts, making it ugly, and feel bad about what seems to me a reasonable method call.
But for what it's worth, as this proposal has repeatedly said, the linter is optional. If you don't believe the linter, *just run the code* in Python like you have always done before. As already said, I might be forced to run it because of company policy. I might contribute to a library that uses mypy.
I don't have a big problem with complying with strange code style requirements enforced by a linter ("in this project, all variables must begin with foobar and end with a number"), that's just naming. I do have a problem when a linter will de facto enforce rules like "all code must only call methods in the way the original author thought of", as this might lead to more ugly code because of workarounds/casts. Cheers, Dennis
On Fri, Aug 15, 2014 at 9:25 AM, Dennis Brakhane
On 15.08.2014 16:38, Steven D'Aprano wrote:
How do you know the linter will complain?
Because a file object is not a list of strings, which is what word_count incorrectly declares it needs. (It actually requires an iterable of things that can be split()ed, and doesn't return a Dict[str, int] but a Dict[return_type_of_split(), int].)
So why do you consider it a "must" to be able to override functions that report type errors at compile time?
The difference in that latter case is that the code runs perfectly fine and correctly. It's just that the linter implies you've done something bad.
Using isinstance checks is kinda frowned upon; using type annotations will probably be considered totally acceptable practice (if not, what's the point of this proposal). It's much harder to argue with the original author that it's a bug in the latter case. Furthermore, the author might not want to lose the type requirement, because he doesn't want to guarantee those semantics, for example. If I'm willing to take that risk (and have a test case in my code), why shouldn't I be allowed to silence those errors in a simple way that doesn't require casting at every method call?
To me, one of the things that sets Python apart from other languages is the fact that code will and can be used in a way the original author might not have thought of.
Giving the original author of methods the means to dictate what types are acceptable, with no clean and simple way of overriding it, just feels Java-esque to me. (Again, isinstance checks are considered a bad practice, but I doubt declaring too-restrictive types will be.)
I do not want to be forced to litter my code with casts, making it ugly and feel bad about what seems to me a reasonable method call.
But for what it's worth, as this proposal has repeatedly said, the linter is optional. If you don't believe the linter, *just run the code* in Python like you have always done before. As already said, I might be forced to run it because of company policy. I might contribute to a library that uses mypy.
I don't have a big problem with complying with strange code style requirements enforced by a linter ("in this project, all variables must begin with foobar and end with a number"), that's just naming.
I do have a problem when a linter will de facto enforce rules like "all code must only call methods in the way the original author thought of", as this might lead to more ugly code because of workarounds/casts.
This attitude will not help you when interviewing at such a company. Did it occur to you that there might actually be a good reason for the lint rule, and that you, as a new hire, might not yet be aware of that reason? -- --Guido van Rossum (python.org/~guido)
On Fri Aug 15 2014 at 5:46:12 AM Dennis Brakhane
Am 15.08.2014 01:56, schrieb Guido van Rossum:
PS. I realize I didn't discuss question (C) much. That's intentional -- we can now start discussing specific mypy features in separate threads (or in this one :-).
So is this the place to discuss "the thornier issues brought up against mypy"? Because I think it's important that we get an idea of what we want mypy to be able to do and what not. After all, mypy will probably end up as a kind of reference implementation for static type checkers. And I'm worried there might be real damage to Python as a language if they aren't thought through:
I'm not concerned about that myself. If this syntax doesn't work out, the existing status quo prevails and the syntax is ignored by other tools that need more than it can provide. That still leads to the same feedback cycle we already have today, such that the language syntax for type annotations can evolve and improve again in the future.

[Caution: I'm commenting about mypy below after having spent less than 10 minutes looking at its website to pretend I know what it can and can't do already. Assume I'm wrong.] ie: there are things I don't know about mypy. Does it have the ability to specify that the return type of "def foo(A, B)" is the same type as whatever the caller passed in for parameter B? That is a pretty common thing in Python. Even if it doesn't have it today, I suspect it can be added in the future. There are other things mypy didn't appear to deal with at first glance either: specific sets of possible inputs -> outputs, rather than always listing inputs as a union of all possible types for that parameter and outputs as a union of all possible types to be output.

I may well be wrong about the above. But even if I'm not, I'm not worried. Deeper analysis and annotation tools will simply do what they are already doing: plowing on ahead with their own extended annotation format. I understand your concerns (in the rest of your message that I've elided) but I think a "try it and see" approach will actually work here. Libraries are already released where people have gone overboard with incorrect, overly strict isinstance or issubclass checks. Those are _worse_ than something that merely lists overly strict types, as you literally cannot use them without modifying the code or complying. Something any code analyzer implementation needs is an ability to be told "ignore this module, it's full of crap." as part of its analysis process. :) -gps
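The capability Gregory asks about -- tying a function's return type to the type of one of its arguments -- is what mypy calls a type variable. The sketch below uses the typing-module spelling that was eventually standardized; mypy's original syntax differed slightly:

```python
from typing import TypeVar

T = TypeVar('T')

def first_or_default(items: list, default: T) -> T:
    # The checker can conclude the return type is whatever type the
    # caller passed in for `default`.
    return items[0] if items else default

print(first_or_default([], 0))        # 0
print(first_or_default(['a'], 'z'))   # a
```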
On 15.08.2014 18:35, Gregory P. Smith wrote:
Libraries are already released where people have gone overboard with incorrect overly strict isinstance or issubclass checks. Those are _worse_ than something that merely lists overly strict types as you literally cannot use them without modifying the code or complying.
I agree.
That still leads to the same feedback cycle we already have today such that the language syntax for type annotations can evolve and improve again in the future.
Does it have the ability to specify that the return type of "def foo(A, B)" is the same type as whatever the caller passed in for parameter B? That is a pretty common thing in Python. Even if it doesn't have it today, I suspect it can be added in the future.
That's a good example of what I'm worried about: to be really useful, the type declarations have to be really flexible. My (maybe irrational) fear is that we end up with a Turing-complete type system.
If a programmer needs to spend 10 minutes thinking about how exactly he has to declare his method parameters or "class interfaces" (does my container type behave covariantly or contravariantly?), and all that just so his code passes a linter, something went wrong ;)
We shouldn't try to make the signatures carry all the information needed for type checkers; if some things can only be found out by code analysis, that's fine by me.
For example, Jedi can handle the following silly example:

def foo():
    return [os, sys]

x = foo()

Jedi knows that x[1] is sys and will only propose members of sys, and only os members for x[0]. And the linter of the current development version will actually barf on x[1].walk():

/tmp/a.py:12:5: E1 AttributeError:
I'm late, but it's also quite hard to read 200 mails in two days
(BlaBlaOverflow) :) I've read most of it. I'm also top-posting. Sorry :)
This mail is going to be way shorter this way.
*EuroPython*
I'm a core-dev of Jedi https://github.com/davidhalter/jedi (the
autocompletion library). We were having discussions
https://github.com/davidhalter/jedi/issues/170 about using annotations
for a longer time, but we couldn't decide on a good approach. This has
changed quite drastically at EuroPython, where I realized that Jedi is not
alone in wanting annotations to become a way of specifying types. Most
prominently, some of the pylint guys and Andrey of PyCharm seemed
interested in standardizing type annotations. This would have led me to
eventually write a mail here on Python ideas (probably together with those
mentioned), describing our needs. Furthermore there were other interested
parties like Cython and Numba devs, because they might eventually use
annotations to improve their type knowledge.
Conclusion of the conference is that Python's static analysis community
wants type annotations. We want them standardized to be actually able to
use them. A third party library would be useful only if it had the same
opportunity as tulip: being a third party library with the clear goal
of inclusion in the standard library.
*The current proposal*
While I don't fully support mypy's type system, I think it's a step in
the right direction, but I don't think we need it now (BTW: Mypy's
``Protocol`` class is genius and should be adapted into the stdlib anyway).
In the Appendix below, I've described what I came up with. I think my
solution is inferior in capabilities, but way easier to understand and without
clutter. Something like Mypy's `typing` should probably be adopted and
standardized in a separate PEP, possibly for 3.6, once we have a few first
experiences. (Maybe if we start early it could still make 3.5).
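Such a ``Protocol`` could likely be approximated today with stdlib ABCs and ``__subclasshook__``; a rough sketch (the class names here are made up for illustration, this is not Mypy's actual API):

```python
from abc import ABCMeta


class SupportsClose(metaclass=ABCMeta):
    """Hypothetical structural type: anything with a close() method."""

    @classmethod
    def __subclasshook__(cls, C):
        # Structural check: accept any class that defines close(),
        # the way collections.abc.Hashable and friends work.
        if cls is SupportsClose:
            return any("close" in B.__dict__ for B in C.__mro__)
        return NotImplemented


class Resource:
    def close(self):
        pass


print(isinstance(Resource(), SupportsClose))  # -> True
print(isinstance(object(), SupportsClose))    # -> False
```

Note that ``Resource`` never inherits from or registers with ``SupportsClose``; the check is purely structural, which is the duck-typing-friendly part of the idea.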
I think a resulting PEP of this discussion should contain a deprecation
note for all usages other than type checking in Python 3.5. The current
ambiguous nature of annotations is the reason why no static analysis tool
ever checked them.
One thing you have to remember at this point: *the two biggest issues
for static analysis are builtins and functions that are never called.
Function annotations would partially solve both.*
*Argument Clinic*
One notable side effect of Guido's proposal would be that Argument Clinic
could use annotations in its signatures. This would be a big benefit for
static analysis, because it would finally reveal the types of builtin
functions (input and output!). I tried to convince Larry Hastings to
implement that, but he refused, citing PEP 3107 which states that "This
work will be left to third-party libraries.".
Argument Clinic in combination with type annotations would be a huge win
for the static analysis community.
*Issues*
By far my biggest concern is the fact that type annotations in CPython
don't have any effect at run time or compile time. This is really an issue,
because people from other languages will actually think that this is a type
checker. This could be fixed partially by checking annotations at run-time
(and raising warnings [or exceptions with a command line switch]). This way
annotations wouldn't be without semantic meaning, they would actually be
some kind of pre/post conditions. However, this should also not be the goal
of a current PEP. I just wanted to mention it, so that we can keep it in
mind.
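As a sketch of what such run-time checking could look like, here is a hypothetical decorator (its name and behaviour are illustrative only, not part of any proposal) that warns when an argument doesn't match a plain-class annotation:

```python
import functools
import inspect
import warnings


def check_annotations(func):
    """Hypothetical decorator: warn when an argument's value does not
    match a plain-class annotation (a rough pre-condition check)."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            ann = func.__annotations__.get(name)
            # Only handle annotations that are plain classes, e.g. int.
            if isinstance(ann, type) and not isinstance(value, ann):
                warnings.warn("%s: expected %s for %r, got %s"
                              % (func.__name__, ann.__name__, name,
                                 type(value).__name__))
        return func(*args, **kwargs)
    return wrapper


@check_annotations
def double(x: int) -> int:
    return x * 2

double(3)      # fine, no warning
double("ab")   # warns, but still runs and returns 'abab'
```

The point is exactly the one above: the program keeps running even when it fails the check, so the annotations act as soft pre-conditions rather than a type system.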
*Appendix (My Proposal)*
My proposal (as discussed and evolved with a few good people at EuroPython)
for containers would look something like this:
def foo(index: int, x: [float], y: {int: str}) -> (float, str):
    return x[index], y[index]
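Since these annotations are ordinary Python expressions, they evaluate fine at definition time and any tool can read them back through ``__annotations__``:

```python
def foo(index: int, x: [float], y: {int: str}) -> (float, str):
    return x[index], y[index]

# The container literals evaluate normally, and a tool can inspect them:
print(foo.__annotations__['x'])       # -> [<class 'float'>]
print(foo.__annotations__['return'])  # -> (<class 'float'>, <class 'str'>)

# The function itself runs as usual; the annotations have no effect:
print(foo(0, [1.5, 2.5], {0: 'a', 1: 'b'}))  # -> (1.5, 'a')
```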
The "default" containers (set, list, dict and tuple) would just serve as a
way of specifying containers. This makes a lot of things less complicated:
- It's easier to understand (everybody knows builtin types). It also feels
natural to me. The example above also covers almost all variations. A tuple
could be expanded by using the ellipsis: `(int, object, ..., float)`.
- No imports needed. People are more likely to use it if they don't have
to import typing all the time. This is important for static analysis:
people are only likely to use it if it's easier than writing docstrings
with type information.
- Argument Clinic could use the same syntax. The standard library quite often
doesn't accept abstract data types, btw.
- It's what people sometimes use in docstrings: ``@rtype: (float, str)`` or
``:rtype: (float, str)``.
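To make the shorthand concrete, here is a hypothetical helper (illustration only; it ignores the ellipsis form) showing how a tool might match values against these builtin-container annotations:

```python
def matches(value, spec):
    """Hypothetical helper: does `value` match a builtin-container
    annotation in the proposed shorthand?  Illustration only."""
    if isinstance(spec, type):                  # plain class, e.g. int
        return isinstance(value, spec)
    if isinstance(spec, list):                  # [float] -> list of float
        return (isinstance(value, list) and
                all(matches(v, spec[0]) for v in value))
    if isinstance(spec, dict):                  # {int: str}
        (k_spec, v_spec), = spec.items()
        return (isinstance(value, dict) and
                all(matches(k, k_spec) and matches(v, v_spec)
                    for k, v in value.items()))
    if isinstance(spec, tuple):                 # (float, str) fixed shape
        return (isinstance(value, tuple) and len(value) == len(spec) and
                all(matches(v, s) for v, s in zip(value, spec)))
    return False

print(matches([1.0, 2.0], [float]))       # -> True
print(matches({1: 'a'}, {int: str}))      # -> True
print(matches((1.0, 'x'), (float, str)))  # -> True
print(matches([1, 'x'], [float]))         # -> False
```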
I know this doesn't solve the duck typing issue, but if you look at
real-life Python code bases, there are very few instances of actually
implementing a ``Mapping``, etc. For all of that we should write a separate
proposal (with abstract type classes). My proposal is also missing a Union
type and a few other things, but I'd rather start type annotations really
simple and soon than wait for good libraries to emerge.
I think the typing module (without my additions) would be
counterproductive, because it's just too complicated to understand. Most
people that are using Python don't know ABCs and would have a hard time
dealing with such optional typing. In the end most people would just ignore
it entirely.
Godspeed!
~ Dave
2014-08-15 1:56 GMT+02:00 Guido van Rossum
I have read pretty much the entire thread up and down, and I don't think I can keep up with responding to every individual piece of feedback. (Also, a lot of responses cancel each other out. :-)
I think there are three broad categories of questions to think about next.
(A) Do we even need this?
(B) What syntax to use?
(C) Does/should it support <feature X>?
Taking these in turn:
(A) Do we even need a standard for optional static typing?
Many people have shown either support for the idea, or pointed to some other system that addresses the same issue. On the other hand, several people have claimed that they don't need it, or that they worry it will make Python less useful for them. (However, many of the detractors seem to have their own alternative proposal. :-)
In the end I don't think we can ever know for sure -- but my intuition tells me that as long as we keep it optional, there is a real demand. In any case, if we don't start building something we'll never know whether it'll be useful, so I am going to take a leap of faith and continue to promote this idea.
I am going to make one additional assumption: the main use cases will be linting, IDEs, and doc generation. These all have one thing in common: it should be possible to run a program even though it fails to type check. Also, adding types to a program should not hinder its performance (nor will it help :-).
(B) What syntax should a standard system for optional static typing use?
There are many interesting questions here, but at the highest level there are a few choices that constrain the rest of the discussion, and I'd like to start with these. I see three or four "families" of approaches, and I think the first order is to pick a family.
(1) The mypy family. (http://mypy-lang.org/) This is characterized by its use of PEP 3107 function annotations and the constraint that its syntax must be valid (current) Python syntax that can be evaluated without errors at function definition time. However, mypy also supports collecting annotations in separate "stub" files; this is how it handles annotations for the stdlib and C extensions. When mypy annotations occur inline (not in a stub file) they are used to type check the body of the annotated function as well as input for type checking its callers.
(2) The pytypedecl family. (https://github.com/google/pytypedecl) This is a custom syntax that can only be used in separate stub files. Because it is not constrained by Python's current syntax, its syntax is slightly more elegant than mypy.
(3) The PyCharm family. ( http://www.jetbrains.com/pycharm/webhelp/using-docstrings-to-specify-types.h...) This is a custom syntax that lives entirely in docstrings. There is also a way to use stub files with this. (In fact, every viable approach has to support some form of stub files, if only to describe signatures for C extensions.)
(I suppose we could add a 4th family that puts everything in comments, but I don't think anyone is seriously working on such a thing, and I don't see any benefits.)
There's also a variant of (1) that Łukasz Langa would like to see -- use the syntactic position of function annotations but using a custom syntax (e.g. one similar to the pytypedecl syntax) that isn't evaluated at function-definition time. This would have to use "from __future__ import <something>" for backward compatibility. I'm skeptical about this though; it is only slightly more elegant than mypy, and it would open the floodgates of unconstrained language design.
So how to choose? I've read passionate attacks and defenses of each approach. I've got a feeling that the three projects aren't all that different in maturity (all are well beyond the toy stage, none are quite ready for prime time). In terms of specific type system features (e.g. forward references, generic types, duck typing) I expect they are all acceptable, and all probably need some work (and there's no reason to assume that work can't be done). All support stubs so you can specify signatures for code you can't edit (whether C extension, stdlib or just opaque 3rd party code).
To me there is no doubt that (1) is the most Pythonic approach. When we discussed PEP 3107 (function annotations) it was always my goal that these would eventually be used for type annotations. There was no consensus at the time on what the rules for type checking should be, but their syntactic position was never in doubt. So we decided to introduce "annotations" in Python 3 in the hope that 3rd party experiments would eventually produce something satisfactory. Mypy is one such experiment. One of the important lessons I draw from mypy is that type annotations are most useful to linters, and should (normally) not be used to enforce types at run time. They are also not useful for code generation. None of that was obvious when we were discussing PEP 3107!
I don't buy the argument that PEP 3107 promises that annotations are completely free of inherent semantics. It promises compatibility, and I take that very seriously, but I think it is reasonable to eventually deprecate other uses of annotations -- there aren't enough significant other uses for them to warrant crippling type annotations forever. In the meantime, we won't be breaking existing use of annotations -- but they may confuse a type checker, whether a stand-alone linter like mypy or built into an IDE like PyCharm, and that may serve as an encouragement to look for a different solution.
Most of the thornier issues brought up against mypy wouldn't go away if we adopted another approach: whether to use concrete or abstract types, the use of type variables, how to define type equivalence, the relationship between a list of ints and a list of objects, how to spell "something that implements the buffer interface", what to do about JSON, binary vs. text I/O and the signature of open(), how to check code that uses isinstance(), how to shut up the type checker when you know better... The list goes on. There will be methods whose type signature can't be spelled (yet). There will be code distributed with too narrowly defined types. Some programmers will uglify their code to please the type checker.
There are questions about what to do for older versions of Python. I find mypy's story here actually pretty good -- the mypy codec may be a hack, but so is any other approach. Only the __future__ approach really loses out here, because you can't add a new __future__ import to an old version.
So there you have it. I am picking the mypy family and I hope we can start focusing on specific improvements to mypy. I also hope that somebody will write converters from pytypedecl and PyCharm stubs into mypy stubs, so that we can reuse the work already put into stub definitions for those two systems. And of course I hope that PyCharm and pytypedecl will adopt mypy's syntax (initially in addition to their native syntax, eventually as their sole syntax).
PS. I realize I didn't discuss question (C) much. That's intentional -- we can now start discussing specific mypy features in separate threads (or in this one :-).
-- --Guido van Rossum (python.org/~guido)
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Aug 16, 2014, at 6:00, Dave Halter
Appendix (My Proposal)
My proposal (as discussed and evolved with a few good people at EuroPython) for containers would look something like this:
def foo(index: int, x: [float], y: {int: str}) -> (float, str):
    return x[index], y[index]
The "default" containers (set, list, dict and tuple) would just serve as a way of specifying containers. This makes a lot of things less complicated:
- It's easier to understand (everybody knows builtin types). It also feels natural to me. The example above covers also almost all variations. A tuple could be expanded by using the ellipsis: `(int, object, ..., float)`.
- No imports needed. People are more likely to use it, if they don't have to import typing all the time. This is important for static analysis, people are only likely to use it if it's easier than writing docstrings with type information.
- Argument clinic could use the same. The standard library quite often doesn't accept abstract data types, btw.
- It's what people sometimes use in docstrings: ``@rtype: (float, str)`` or ``:rtype: (float, str)``.
I know this doesn't solve the duck typing issue, but if you look at real-life Python code bases, there are very few instances of actually implementing a ``Mapping``, etc.
For "Mapping", maybe (although anyone who uses some form of tree-based mapping, either to avoid hash-collision attacks or because he needs sorting, may disagree).

But for "etc.", people implement them all the time. Especially "Iterable" and "Callable". And, even some of the ones that people don't implement often, they use often, implicitly or otherwise, like TextIOBase.

Anything that encourages people to restrict their code to only working on lists instead of iterables would be a huge step backward to Python 2.2. And your argument for it ("everybody knows builtin types") implies that's exactly what you're expecting with this proposal.

However, there's an easy way around this: just let [spam] in your syntax mean what Iterable[spam] means in MyPy's. If someone really needs to declare that they will only accept a list or whatever, that's the uncommon case, and can be written more verbosely. And I don't see any reason why this can't be added on top of genericizing the ABCs a la MyPy.

Meanwhile, your tuple doesn't fit the same pattern as the others, because it's explicitly fixed-size and heterogeneous. And I think this is a good thing. And I think it's pretty close to what MyPy does with an expression list of types already; if not, it seems like what MyPy _should_ do. If I loop over a zip of a [str] and an [int], the loop variable is a (str, int), not a Tuple[str or int].

So, making it easier to specify generic builtins is a bad idea, but using builtins to make it easier to specify common types is a great idea.
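A quick illustration of the distinction, using the ABCs that already exist (and assuming the "[spam] means Iterable[spam]" reading):

```python
from collections.abc import Iterable, MutableSequence

# Under "[spam] means Iterable[spam]", all of these would be acceptable
# arguments; under "[spam] means list of spam", only the first would be:
print(isinstance(['a'], Iterable))               # -> True
print(isinstance(('a',), Iterable))              # -> True
print(isinstance((c for c in 'abc'), Iterable))  # -> True
print(isinstance(('a',), MutableSequence))       # -> False

# And zipping a [str] with an [int] does yield (str, int) pairs:
pair = next(zip(['a', 'b'], [1, 2]))
print(pair)  # -> ('a', 1)
```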
On Sat Aug 16 2014 at 10:38:34 AM Andrew Barnert
For "Mapping", maybe (although anyone who uses some form of tree-based mapping, either to avoid hash-collision attacks or because he needs sorting, may disagree).
But for "etc.", people implement them all the time. Especially "Iterable" and "Callable". And, even some of the ones that people don't implement often, they use often, implicitly or otherwise, like TextIOBase.
Anything that encourages people to restrict their code to only working on lists instead of iterables would be a huge step backward to Python 2.2. And your argument for it ("everybody knows builtin types") implies that's exactly what you're expecting with this proposal.
However, there's an easy way around this: just let [spam] in your syntax mean what Iterable[spam] means in MyPy's. If someone really needs to declare that they will only accept a list or whatever, that's the uncommon case, and can be written more verbosely. And I don't see any reason why this can't be added on top of genericizing the ABCs a la MyPy.
Meanwhile, your tuple doesn't fit the same pattern as the others, because it's explicitly fixed-size and heterogeneous. And I think this is a good thing. And I think it's pretty close to what MyPy does with an expression list of types already; if not, it seems like what MyPy _should_ do. If I loop over a zip of a [str] and an [int], the loop variable is a (str, int), not a Tuple[str or int].
So, making it easier to specify generic builtins is a bad idea, but using builtins to make it easier to specify common types is a great idea.
The trick in all of this is making sure people instinctively know what the builtin types represent in terms of an interface w/o necessarily over-specifying. For instance, a list could be viewed as MutableSequence when all that is really necessary is Sequence or Iterable. Just think of those situations where a list or tuple both work as an argument; how do you specify that without assuming mutability? It's tricky to figure out what a proper assumption of what the built-ins represent should be.
On Aug 16, 2014, at 8:02, Brett Cannon
The trick in all of this is making sure people instinctively know what the builtin types represent in terms of an interface w/o necessarily over-specifying. For instance, a list could be viewed as MutableSequence when all that is really necessary is Sequence or Iterable. Just think of those situations where a list or tuple both work as an argument; how do you specify that without assuming mutability? It's tricky to figure out what a proper assumption of what the built-ins represent should be.
Honestly, I think [str] is the only case where this is an important question, so let's think about that rather than trying to think about a more general and abstract problem.

I'm not sure whether, when people say they need a list of strings, they more often mean Iterable[str] rather than Sequence[str] or MutableSequence[str]. I suspect it's the former, but I can't prove it, and I also suspect it's more of a 70/10/20 case than a 98/1/1 case. So maybe that argues that [str] just shouldn't be allowed.

But if it meant Iterable[str], someone who used it when they needed a Sequence or MutableSequence would get an error when they tried to MyPy their library, app, whatever. And it wouldn't be that hard to make that error explain the problem to them--the same way clang tries to explain template errors in C++ and suggest fixes, except that it would be orders of magnitude easier. "spam is an iterable, so you can't assign to its indexes. Did you mean to declare it as MutableSequence[str]?"

On the other hand, if [str] meant MutableSequence[str] (or list[str]), the author who used it on a function that did nothing but loop over spam would not get an error; he'd have to wait until he published his code and someone filed a bug report saying "You declared spam as a MutableSequence, even though all you do is loop over it, and now my code that worked correctly with your spam-1.7, and that still works correctly with spam-1.8 if I don't use static checking, fails the linter. Did you mean to declare it as Iterable[str]?"

This obviously isn't a slam-dunk argument. If MutableSequence were used far more often than Iterable, or were more pythonic in some way, then it would make sense for [str] to mean MutableSequence despite the fact that it puts the errors in the less convenient place. But if they're both reasonably common, I think this argues for making it mean Iterable.

For the last part:
Just think of those situations where a list or tuple both work as an argument; how do you specify that without assuming mutability?
That one's easy: you write Sequence. A lot of similar questions were already answered when abc, Number, collections.abc, and io were designed, and they did a great job answering some tricky questions--which is exactly why I think static typing should use those already-worked-out cases rather than trying to answer all those questions again with a parallel type hierarchy. (And if use of static typing leads people to realize that one of those ABCs got something wrong, better to fix that bug in the ABC than to leave it incorrect and make the typing type different.)
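For example, the existing ABCs already capture exactly this distinction:

```python
from collections.abc import Iterable, MutableSequence, Sequence

# Both a list and a tuple satisfy Sequence, so a parameter that should
# accept either can be declared Sequence without assuming mutability:
assert isinstance([1, 2], Sequence)
assert isinstance((1, 2), Sequence)

# The finer distinctions are already worked out in collections.abc:
assert isinstance([1, 2], MutableSequence)       # lists are mutable
assert not isinstance((1, 2), MutableSequence)   # tuples are not
assert isinstance(iter([1, 2]), Iterable)        # iterators still iterate
```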
2014-08-16 16:30 GMT+02:00 Andrew Barnert
Very interesting ideas. I'm not sure if this is the right way to go, but it's definitely an option. It could confuse people if they cannot use `isinstance(x, list)` anymore. But then again, it would just be a really neat way of expressing abstract types. One other confusion might come from the fact that `-> list` would not be an abstract data type, but `[str]` would. It's OK, but also not very obvious. I've also thought about adding something like `-> abstract([abstract(str)])`, which would return the abstract data types for builtins.
On Sat, Aug 16, 2014 at 03:00:15PM +0200, Dave Halter wrote:
I think a resulting PEP of this discussion should contain a deprecation note for all usages other than type checking in Python 3.5.
Can you explain your reasoning for this?

I understand that unless they are deliberately built to cooperate, two users of annotations are likely to interfere with each other. The jedi tool wants annotations to be type information; the sith tool wants annotations to be X-Face pictures of kittens. They can't both get what they want.

You are asking for alternative uses of annotations, such as pictures of kittens, to be deprecated. But it seems to me that all you really need is some standard way for jedi to look at the function and cheaply see that it shouldn't try interpreting the annotations as types. (And, mutatis mutandis, the same applies to sith.) That allows jedi and sith to co-exist, although we can't use both on the same function at the same time.

I think it is fair for Python to standardise on type checking as the default semantics of annotations, but I would like to see a way to opt-out and still use annotations for other purposes.
The current ambiguous nature of annotations is the fact why no static analysis tool ever checked them.
mypy does. Hence this proposal. So does PyCharm: http://www.jetbrains.com/pycharm/webhelp/type-hinting-in-pycharm.html

On the other hand, in this thread we've heard from two others who use function annotations for purposes other than types. So it seems to me that usage of annotations is split right down the middle between typing and non-typing.

-- Steven
As a test case for what code may soon look like, here's a bit from one of my code bases:

--------------------------------------------------------------------
class ACHPayment(object):
    """A single payment from company to a vendor."""

    def __init__(self, description, sec_code,
                 vendor_name, vendor_inv_num, vendor_rtng, vendor_acct,
                 transaction_code, vendor_acct_type,
                 amount, payment_date):
        """
        description: 10 chars
        sec_code: 'CCD' or 'CTX'
        vendor_name: 22 chars
        vendor_inv_num: 15 chars
        vendor_rtng: 9 chars
        vendor_acct: 17 chars
        transaction_code: ACH_ETC code (enum)
        vendor_acct_type: 'domestic' or 'foreign'
        amount: 10 digits (pennies)
        payment_date: date payment should occur on (datetime.date type class)
        """
--------------------------------------------------------------------

The question: what would this look like with type annotations? As a point of interest, the last parameter, payment_date, can be /anything/ that quacks like a datetime.date -- I tend to use my own dbf.Date class, which subclasses object, not datetime.date itself.

-- ~Ethan~
Maybe that point of interest could be solved by using some kind of type
class/interface (in the Obj-C/Java sense). That way, external types that the
user has no control over can be added to the interface.
On Sat, Aug 16, 2014 at 3:22 PM, Ethan Furman
The question: what would this look like with type annotations? As a point of interest, the last parameter, payment_date, can be /anything/ that quacks like a datetime.date -- I tend to use my own dbf.Date class, which subclasses object, not datetime.date itself.
-- Ryan If anybody ever asks me why I prefer C++ to C, my answer will be simple: "It's becauseslejfp23(@#Q*(E*EIdc-SEGFAULT. Wait, I don't think that was nul-terminated."
On Aug 16, 2014, at 13:48, Ryan Gonzalez
Maybe that point of interest could be solved by using some kind of type class/interface(in the Obj C/Java sense). That way, external types that the user has no control of can be added to the interface.
We already have that. ABCs are enough like Java interfaces, ObjC mandatory protocols, C++ (non-auto) concepts, etc. to do everything we need here (except genericity, which it seems like everyone agrees with adding) if you want to do it nominatively. You just need to write a Date ABC. Or argue that there should be a datetime.abc library in the stdlib that does it for you. And that's simple.

And if you want to do it structurally, like Go protocols, C++ auto concepts, ObjC optional protocols, etc., ABCs can also do that. It's not _quite_ as simple today, but it's not hard, and there are a half dozen libraries that make it easy (I wrote one in a couple hours, and didn't bother publishing it because a quick search turned up so many pre-existing alternatives, and at least three classes in the stdlib that just did it manually without help...).

Of course the existing implementations don't give you a way to statically declare the types of method arguments, attribute/properties, etc., but MyPy.Protocol does, and that can easily be adopted into the stdlib as part of this proposal. (In fact, I think it's already on the list.)

So, for Ethan's case, the last argument is just "payment_date: datetime.abc.Date", except that nobody has added that to the stdlib yet, so instead he has to write it himself and use it.
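A rough sketch of what such a hypothetical ``datetime.abc.Date`` could look like as a structural ABC (the name and the attribute list checked here are just an example, nothing standardized):

```python
from abc import ABCMeta
import datetime


class Date(metaclass=ABCMeta):
    """Hypothetical 'datetime.abc.Date': accepts anything that quacks
    like a date (structural check on a few representative attributes)."""

    @classmethod
    def __subclasshook__(cls, C):
        if cls is Date:
            needed = ('year', 'month', 'day', 'toordinal')
            if all(any(attr in B.__dict__ for B in C.__mro__)
                   for attr in needed):
                return True
        return NotImplemented


class MyDate:
    """Stand-in for something like dbf.Date: subclasses object,
    not datetime.date, but quacks like a date."""
    year, month, day = 2014, 8, 16

    def toordinal(self):
        return datetime.date(self.year, self.month, self.day).toordinal()


print(isinstance(datetime.date.today(), Date))  # -> True
print(isinstance(MyDate(), Date))               # -> True
print(isinstance('2014-08-16', Date))           # -> False
```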
On Sat, Aug 16, 2014 at 01:22:48PM -0700, Ethan Furman wrote:
As a test case for what code may soon look like, here's a bit from one of my code bases:
--------------------------------------------------------------------
class ACHPayment(object):
    """A single payment from company to a vendor."""

    def __init__(self, description, sec_code,
                 vendor_name, vendor_inv_num,
                 vendor_rtng, vendor_acct,
                 transaction_code, vendor_acct_type,
                 amount, payment_date):
        """
        description: 10 chars
        sec_code: 'CCD' or 'CTX'
        vendor_name: 22 chars
        vendor_inv_num: 15 chars
        vendor_rtng: 9 chars
        vendor_acct: 17 chars
        transaction_code: ACH_ETC code (enum)
        vendor_acct_type: 'domestic' or 'foreign'
        amount: 10 digits (pennies)
        payment_date: date payment should occur on (datetime.date type class)
        """
--------------------------------------------------------------------
The question: what would this look like with type annotations? As a point of interest, the last parameter, payment_date, can be /anything/ that quacks like a datetime.date -- I tend to use my own dbf.Date class, which subclasses object, not datetime.date itself.
I don't think this is a shining example of the value of static typing, at least not by default. As I see it, you would get something like this:

    def __init__(self, description:str, sec_code:str,
                 vendor_name:str, vendor_inv_num:str,
                 vendor_rtng:str, vendor_acct:str,
                 transaction_code:str, vendor_acct_type:str,
                 amount:int, payment_date:Any) -> None:

which may not give you much additional value. In this case, I think that the static checks will add nothing except (perhaps) allow you to forgo writing a few isinstance checks. You still have to check that the strings are the right length, and so on.

But if you're willing to invest some time creating individual str subclasses, you can push the length checks into the subclass constructor, and write something like this:

    def __init__(self, description:Str10, sec_code:SecurityCode,
                 vendor_name:Str22, vendor_inv_num:Str15,
                 vendor_rtng:Str9, vendor_acct:Str17,
                 transaction_code:ACH_ETC, vendor_acct_type:VendorAcctType,
                 amount:Pennies, payment_date:DateABC) -> None:

Without knowing your application in detail, it is difficult to know how much work you should hand over to the type system, and how much you should continue to do in Python. If all you're doing is pushing strings from one place to another, you might not care exactly how long the string is, say because they're truncated when you print them.

-- Steven
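Steven's Str10/Str22 idea can be sketched in a few lines. The class names here mirror his hypothetical examples; the BoundedStr helper is an illustrative assumption, not part of any proposal.

```python
# Sketch of a validating str subclass: the length check moves into
# the constructor, so an annotation like "description: Str10"
# carries real meaning while the value remains an ordinary str.
class BoundedStr(str):
    maxlen = None  # subclasses set this

    def __new__(cls, value):
        if cls.maxlen is not None and len(value) > cls.maxlen:
            raise ValueError("%s: %r is longer than %d chars"
                             % (cls.__name__, value, cls.maxlen))
        return super().__new__(cls, value)

class Str10(BoundedStr):
    maxlen = 10

class Str22(BoundedStr):
    maxlen = 22

desc = Str10("ACME CORP")      # fine: 9 chars
assert isinstance(desc, str)   # still usable anywhere a str is
try:
    Str10("this is far too long for the field")
except ValueError:
    pass                       # constructor rejects over-long input
```

A static checker would then flag a plain str passed where a Str10 is declared, while the runtime check catches over-long values from dynamic sources.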
I'd like to summarize the main issues that have come up. As an experiment, I'm not changing the subject, but I am still not quoting anything in particular. Only two issues (or issue clusters) really seem contentious:

(1) Should the function annotation syntax (eventually) be reserved for type annotations in a standard syntax? Or can multiple different uses of annotations coexist? And if they can, how should a specific use be indicated? (Also, some questions about compile-time vs. run-time use.)

(2) For type annotations, should we adopt (roughly) the mypy syntax or the alternative proposed by Dave Halter? This uses built-in container notations as a shorthand, e.g. {str: int} instead of Dict[str, int]. This also touches on the issue of abstract vs. concrete types (e.g. iterable vs. list).

Regarding (1), I continue to believe that we should eventually reserve annotations for types, to avoid confusing both humans and tools, but I think there's nothing we have to do in 3.5 -- 3.5 must preserve backward compatibility, and we're not proposing to give annotations any new semantics anyway -- the actual changes to CPython are limited to a new stdlib module (typing) and some documentation.

Perhaps a thornier issue is how mypy should handle decorators that manipulate the signature or annotations of the function they wrap. But I think the only reasonable answer here can be that mypy must understand what decorators do if it wants to have any chance at type-checking decorated functions. I don't actually know how sophisticated mypy's understanding of decorators is, currently, but I don't think there's anything fundamentally more difficult than all the other things it must understand.

Moving on to (2), the proposal is elegant enough by itself, and indeed has the advantage of being clear and concise: [T] instead of List[T], {T: U} instead of Dict[T, U], and so on. However, there are a few concerns.
My first concern is that these expressions are only unambiguous in the context of function annotations. I want to promote the use of type aliases, and I think in general a type alias should behave similarly to an ABC. In particular, I think that any object used to represent a type in an annotation should itself be a type object (though you may not be able to instantiate it), and e.g. [int] doesn't satisfy that requirement. Without this, it would be difficult to implement isinstance() and issubclass() for type aliases -- and while we could special-case lists, sets and dicts, using a tuple *already* has a meaning!

The second concern is that the proposal seems to steer users in the direction of using concrete types. A lot of Python's power stems from concepts like iterable and mapping and their variants (e.g. iterable, container, sequence, mutable sequence). There are justified concerns that users will constrain the argument types more than necessary (e.g. specifying a sequence where any iterable would do), and this proposal lacks the subtlety to express the difference.

A third (minor) concern reflects issue (1): until we have agreement that annotations should only be used as type annotations, a type checker cannot assume that the presence of annotations means that types should be checked. Using e.g. Iterable[int] is pretty unambiguous (especially when Iterable is imported from typing.py), whereas just [int] is somewhat ambiguous. I call this only a minor issue because it still occurs for simple types like int or str, so if we can live with it for those we could presumably live with [int] and {str: float}.

All in all I prefer the mypy syntax, despite being somewhat more verbose and requiring an import, with one caveat: I agree that it would be nicer if the mypy abstract collection types were the same objects as the ABCs exported by collections.abc.
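The first concern above is easy to check at the interpreter: isinstance() accepts types (or tuples of types) but not list literals, so a [int]-style alias could never participate in runtime checks the way an ABC does.

```python
# Guido's point about [int] not being a type object, made concrete:
# an ABC works with isinstance(), a list literal raises TypeError.
from collections.abc import Iterable

assert isinstance([1, 2, 3], Iterable)   # an ABC is a real type object
try:
    isinstance([1, 2, 3], [int])         # a list literal is not
except TypeError:
    pass                                 # isinstance() rejects it
else:
    raise AssertionError("expected TypeError")
```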
I'm not quite sure whether we should also change the concrete collection types from List, Dict, Set, Tuple to list, dict, set, tuple; the concrete types are so ubiquitous that I worry that there may be working code out there that somehow relies on the type objects themselves not being subscriptable.

A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float]. But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type. Perhaps we should call this type frozenlist, analogous to frozenset (and it seems there's a proposal for frozendict making the rounds as well).

-- --Guido van Rossum (python.org/~guido)
On Aug 16, 2014, at 22:03, Guido van Rossum
Moving on to (2), the proposal is elegant enough by itself, and indeed has the advantage of being clear and concise: [T] instead of List[T], {T: U} instead of Dict[T, U], and so on. However, there are a few concerns.
My first concern is that these expressions are only unambiguous in the context of function annotations.
Good point. Together with your third point (that [str] could be meaningful as a different type of annotation, while Iterable[str] is incredibly unlikely to mean anything other than a static type check--except maybe a runtime type check, but I think it's reasonable to assume they can share annotations), I think this kills the idea. Pity, because I like the way it looked.
All in all I prefer the mypy syntax, despite being somewhat more verbose and requiring an import, with one caveat: I agree that it would be nicer if the mypy abstract collection types were the same objects as the ABCs exported by collections.abc. I'm not quite sure whether we should also change the concrete collection types from List, Dict, Set, Tuple to list, dict, set, tuple; the concrete types are so ubiquitous that I worry that there may be working code out there that somehow relies on the type objects themselves not being subscriptable.
I won't belabor the point, but again: I don't think we need a generic list type object, and without it, this entire problem--your only remaining problem that isn't a mere stylistic choice--vanishes.
A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float]. But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type. Perhaps we should call this type frozenlist, analogous to frozenset (and it seems there's a proposal for frozendict making the rounds as well).
Even if you drop the idea for [str] and {int: str}, which I agree seems unavoidable, I think it may still make sense for (int, str) to mean a heterogeneous iterable. Python already has target lists, argument lists, parameter lists, and expression lists that all have the same syntax as tuples or a superset thereof, but don't define tuples. In (a, b) = zip(c, d), neither (a, b) nor (c, d) is a tuple, and I don't think anyone is confused by that. So, why can't def foo(spam: (int, str)) mean that spam is an iterable of an int and a str, in exactly the same way that the assignment statement means that a and b are assigned the result of unpacking the iterable returned by zip when called with c and d? And this leaves Tuple[str] or tuple[str] free to mean a homogenous tuple (although, again, I don't think we even want or need that...).
On Sat, Aug 16, 2014 at 11:02:01PM -0700, Andrew Barnert wrote:
On Aug 16, 2014, at 22:03, Guido van Rossum
wrote: Moving on to (2), the proposal is elegant enough by itself, and indeed has the advantage of being clear and concise: [T] instead of List[T], {T: U} instead of Dict[T, U], and so on. However, there are a few concerns.
My first concern is that these expressions are only unambiguous in the context of function annotations.
Good point. Together with your third point (that [str] could be meaningful as a different type of annotation, while Iterable[str] is incredibly unlikely to mean anything other than a static type check--except maybe a runtime type check, but I think it's reasonable to assume they can share annotations), I think this kills the idea. Pity, because I like the way it looked.
[str] looks nice, but it looks like a list of str, or possibly an optional str, e.g. from the docstring of int:

    int(x[, base]) -> integer

What the [str] syntax doesn't look like is an Iterable of str. Or should that be Sequence of str? MutableSequence perhaps? If [str] means something other than list of str, it is going to be some arbitrary special case to be memorized. Have pity on people teaching Python. I don't want to have to try to explain to beginners why [str] sometimes means a list and sometimes an arbitrary Iterable (or whatever). This is just downright confusing:

    def func(arg:[str]):
        x = [str]
        assert isinstance(x, list)    # Always passes.
        assert isinstance(arg, list)  # Sometimes fails.

    func(iter("abc"))  # Fails.

[Guido]
All in all I prefer the mypy syntax, despite being somewhat more verbose and requiring an import, with one caveat: I agree that it would be nicer if the mypy abstract collection types were the same objects as the ABCs exported by collections.abc. I'm not quite sure whether we should also change the concrete collection types from List, Dict, Set, Tuple to list, dict, set, tuple;
We can start with typing.List, Dict, etc., and later on consider using builtins. I worry that if we use builtins, people will declare

    x:list[int]

not because they *need* a list of int, but because it saves typing over

    from typing import Sequence, Integer

    def func(x:Sequence[Integer]):
        ...

So even though I suggested earlier that the builtins grow appropriate __getitem__ methods, on second thoughts I would be very cautious about introducing that.
the concrete types are so ubiquitous that I worry that there may be working code out there that somehow relies on the type objects themselves not being subscriptable.
[Andrew]
I won't belabor the point, but again: I don't think we need a generic list type object, and without it, this entire problem--your only remaining problem that isn't a mere stylistic choice--vanishes.
I don't understand. If there's no list typing object, how do you declare a variable must be a list and nothing but a list? Or that it returns a list? [Guido]
A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float]. But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type.
Throwing this idea out to be shot down: use some sort of slice notation.

    Tuple[int, float, str]    # Like (23, 1.5, "spam")
    Tuple[::int, float, str]  # Like (1, 2, 3, 4) or (1.5,) or ("x", "y")

That is, if the argument to __getitem__ is a slice (None, None, T), T is either a type or a tuple of types. Any other kind of slice is reserved for the future, or an error.
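Steven's notation works at the language level because ::T inside subscript brackets already produces a slice object with T as its step. A toy sketch of what an implementation would see (the _TupleType class and its return values are purely illustrative):

```python
# Demonstration of the slice mechanics behind the Tuple[::T] idea:
# __getitem__ receives a slice(None, None, T) for the homogeneous
# form, and a plain type or tuple of types for the fixed form.
class _TupleType:
    def __getitem__(self, item):
        if isinstance(item, slice):
            # Tuple[::T] -> variable-length, homogeneous elements;
            # T may be a type or a tuple of types.
            assert item.start is None and item.stop is None
            return ("homogeneous", item.step)
        if isinstance(item, tuple):
            # Tuple[T1, T2, ...] -> fixed-length, heterogeneous
            return ("fixed",) + item
        return ("fixed", item)

Tuple = _TupleType()
assert Tuple[::int] == ("homogeneous", int)
assert Tuple[::(int, str)] == ("homogeneous", (int, str))
assert Tuple[int, float, str] == ("fixed", int, float, str)
```

So no new syntax would be needed; the question is only whether readers would find the slice spelling natural.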
Perhaps we should call this type frozenlist, analogous to frozenset (and it seems there's a proposal for frozendict making the rounds as well).
[Andrew]
Even if you drop the idea for [str] and {int: str}, which I agree seems unavoidable, I think it may still make sense for (int, str) to mean a heterogeneous iterable.
That makes no sense to me. It looks like a tuple, not a generic iterable object. Your interpretation has the same problems I discussed above for [str] notation: it is an arbitrary choice whether (...) means Iterable, Sequence or ImmutableSequence, and it doesn't fit nicely with other common uses of parens. See below.
Python already has target lists, argument lists, parameter lists, and expression lists that all have the same syntax as tuples or a superset thereof, but don't define tuples. In (a, b) = zip(c, d), neither (a, b) nor (c, d) is a tuple, and I don't think anyone is confused by that.
Ha, you've obviously stopped reading the "Multi-line with statement" thread on Python-Dev :-)
So, why can't def foo(spam: (int, str)) mean that spam is an iterable of an int and a str, in exactly the same way that the assignment statement means that a and b are assigned the result of unpacking the iterable returned by zip when called with c and d?
But a, b = zip(c, d) requires that there be exactly two elements, not some unspecified number. To me, spam:(int, str) has a natural interpretation that spam can be either an int or a str, not an Iterable or Sequence or even a tuple.
And this leaves Tuple[str] or tuple[str] free to mean a homogenous tuple (although, again, I don't think we even want or need that...).
We do. Consider the isinstance() function. Here's the signature according to its docstring:

    isinstance(object, class-or-type-or-tuple) -> bool

The second argument can be a single type, or a tuple of an arbitrary number of types. I'd write it with annotations like:

    def isinstance(object:Any,
                   class_or_type_or_tuple:(Type, Tuple[::Type])
                   ) -> Bool:

assuming (a,b) means "either a or b" and Tuple[::a] means a homogeneous tuple of a. (With shorter argument names, it even fits on one line.) And, here's issubclass:

    def issubclass(C:Type, D:(Type, Tuple[::Type])) -> Bool:

-- Steven
On Aug 17, 2014, at 0:26, Steven D'Aprano
On Sat, Aug 16, 2014 at 11:02:01PM -0700, Andrew Barnert wrote:
I won't belabor the point, but again: I don't think we need a generic list type object, and without it, this entire problem--your only remaining problem that isn't a mere stylistic choice--vanishes.
I don't understand. If there's no list typing object, how do you declare a variable must be a list and nothing but a list? Or that it returns a list?
You think about it and make sure you really do need a list and nothing but a list. Most of the time (as in all three of the examples given in this thread) this is a mistake. If it's not, then you use List. (Or, if the stdlib doesn't provide that, you have to write one line of code: List = TypeAlias(list), and then you can use it.) If having list[T] is going to be more of an attractive nuisance than a useful feature, and it will be especially attractive and nuisanceful for exactly the same novices who are unlikely to know how to TypeAlias it themselves, why is it a problem to leave it out?
Even if you drop the idea for [str] and {int: str}, which I agree seems unavoidable, I think it may still make sense for (int, str) to mean a heterogeneous iterable.
That makes no sense to me. It looks like a tuple, not a generic iterable object. Your interpretation has the same problems I discussed above for [str] notation: it is an arbitrary choice whether (...) means Iterable, Sequence or ImmutableSequence, and it doesn't fit nicely with other common uses of parens. See below.
Python already has target lists, argument lists, parameter lists, and expression lists that all have the same syntax as tuples or a superset thereof, but don't define tuples. In (a, b) = zip(c, d), neither (a, b) nor (c, d) is a tuple, and I don't think anyone is confused by that.
Ha, you've obviously stopped reading the "Multi-line with statement" thread on Python-Dev :-)
OK, granted, only 4 of the 5 attempts to reuse the comma-separated lists in Python have been 100% successful. Still not a bad batting average.
So, why can't def foo(spam: (int, str)) mean that spam is an iterable of an int and a str, in exactly the same way that the assignment statement means that a and b are assigned the result of unpacking the iterable returned by zip when called with c and d?
But a, b = zip(c, d) requires that there be exactly two elements, not some unspecified number.
And spam:(int, str) requires that there be exactly two elements (and that the first be an int and the second a str), not some unspecified number. How is that any different?
To me, spam:(int, str) has a natural interpretation that spam can be either an int or a str, not an Iterable or Sequence or even a tuple.
OK, I see the parallel there with exception statements now that you mention it. But almost anywhere else in Python, a comma-separated list is a sequence of values, targets, parameters, etc., not a disjunction. The obvious way to spell what you want here is "int | str" (and the fact that it was independently suggested three times on this thread and no other alternatives have been suggested until now makes me feel pretty confident that it really is the obvious way). Of course there is a _different_ alternative that could be borrowed from some of the typed functional languages: int * str. But I don't think that's at all obvious to a Python reader.
On Sun, Aug 17, 2014 at 01:52:21AM -0700, Andrew Barnert wrote:
On Aug 17, 2014, at 0:26, Steven D'Aprano
wrote: On Sat, Aug 16, 2014 at 11:02:01PM -0700, Andrew Barnert wrote:
I won't belabor the point, but again: I don't think we need a generic list type object, and without it, this entire problem--your only remaining problem that isn't a mere stylistic choice--vanishes.
I don't understand. If there's no list typing object, how do you declare a variable must be a list and nothing but a list? Or that it returns a list?
You think about it and make sure you really do need a list and nothing but a list. Most of the time (as in all three of the examples given in this thread) this is a mistake. If it's not, then you use List. (Or, if the stdlib doesn't provide that, you have to write one line of code: List = TypeAlias(list), and then you can use it.)
Ah, that is the point I missed. You think that the stdlib shouldn't provide a standard typing object for lists, but that people should just create their own if they need it. Okay, but I don't understand why you're singling out lists. If you want to propose providing only abstract classes (Sequence, Mapping, etc.) and not concrete classes (list, dict, etc.) by default, that makes some sense to me. But I don't understand including typing.Dict as an alias for dict (say) but not List.
If having list[T] is going to be more of an attractive nuisance than a useful feature, and it will be especially attractive and nuisanceful for exactly the same novices who are unlikely to know how to TypeAlias it themselves, why is it a problem to leave it out?
There are at least three scenarios:

(1) Built-ins can be used directly in static type annotations:

    x:list[dict]

This has the advantage of not needing special names, but the disadvantage of encouraging lazy programmers to specify concrete types when they should be using abstract Sequence[Mapping].

(2) Built-ins *cannot* be used, you have to import them from typing:

    from typing import List, Dict
    x:List[Dict]

The advantage is that since you have to do an import anyway, it is not much more effort to Do The Right Thing:

    from typing import Sequence, Mapping
    x:Sequence[Mapping]

(3) And the final scenario, the one which confuses me, but seems to be what you are suggesting: you can use the built-ins, *but not list*, and there is no List to import either:

    from typing import Sequence
    x:Sequence[dict]

I don't understand the advantage of this.

[...]
So, why can't def foo(spam: (int, str)) mean that spam is an iterable of an int and a str, in exactly the same way that the assignment statement means that a and b are assigned the result of unpacking the iterable returned by zip when called with c and d?
But a, b = zip(c, d) requires that there be exactly two elements, not some unspecified number.
And spam:(int, str) requires that there be exactly two elements (and that the first be an int and the second a str), not some unspecified number. How is that any different?
Ah, that's what I didn't understand. I thought you meant an iterable of either ints or strs, without requiring a fixed number of them. I must admit, I just assumed that (based on the example of isinstance, and general Python practice), unions of types would be represented as a tuple, but I see that mypy uses Union[int, str]. In other words, I was thinking:

    Iterable[Union[int, str]] == Iterable[(int, str)]

and thought you wanted to drop the Iterable[ ] and just be left with the (int, str).
To me, spam:(int, str) has a natural interpretation that spam can be either an int or a str, not an Iterable or Sequence or even a tuple.
OK, I see the parallel there with exception statements now that you mention it.
But almost anywhere else in Python, a comma-separated list is a sequence of values, targets, parameters, etc., not a disjunction. The obvious way to spell what you want here is "int | str"
Which mypy spells as Union[ ]. http://mypy-lang.org/tutorial.html -- Steven
On Aug 17, 2014, at 2:41, Steven D'Aprano
On Sun, Aug 17, 2014 at 01:52:21AM -0700, Andrew Barnert wrote:
On Aug 17, 2014, at 0:26, Steven D'Aprano
wrote: On Sat, Aug 16, 2014 at 11:02:01PM -0700, Andrew Barnert wrote:
I won't belabor the point, but again: I don't think we need a generic list type object, and without it, this entire problem--your only remaining problem that isn't a mere stylistic choice--vanishes.
I don't understand. If there's no list typing object, how do you declare a variable must be a list and nothing but a list? Or that it returns a list?
You think about it and make sure you really do need a list and nothing but a list. Most of the time (as in all three of the examples given in this thread) this is a mistake. If it's not, then you use List. (Or, if the stdlib doesn't provide that, you have to write one line of code: List = TypeAlias(list), and then you can use it.)
Ah, that is the point I missed. You think that the stdlib shouldn't provide a standard typing object for lists, but that people should just create their own if they need it.
Okay, but I don't understand why you're singling out lists. If you want to propose providing only abstract classes (Sequence, Mapping, etc.) and not concrete classes (list, dict, etc.) by default, that makes some sense to me. But I don't understand including typing.Dict as an alias for dict (say) but not List.
I'm not singling out lists. I referred to tuple in exactly the same way in the same message. I didn't mention dict, frozenset, etc. because I'd already given the full argument a few hundred messages ago and I didn't think anyone wanted to read it again. But the "if" sentence in your paragraph is a good summary of the whole thing.
If having list[T] is going to be more of an attractive nuisance than a useful feature, and it will be especially attractive and nuisanceful for exactly the same novices who are unlikely to know how to TypeAlias it themselves, why is it a problem to leave it out?
There are at least three scenarios:
(1) Built-ins can be used directly in static type annotations:
x:list[dict]
This has the advantage of not needing special names, but the disadvantage of encouraging lazy programmers to specify concrete types when they should be using abstract Sequence[Mapping].
(2) Built-ins *cannot* be used, you have to import them from typing:
    from typing import List, Dict
    x:List[Dict]
The advantage is that since you have to do an import anyway, it is not much more effort to Do The Right Thing:
    from typing import Sequence, Mapping
    x:Sequence[Mapping]
(3) And the final scenario, the one which confuses me, but seems to be what you are suggesting: you can use the built-ins, *but not list*, and there is no List to import either:
    from typing import Sequence
    x:Sequence[dict]
I don't understand the advantage of this.
As explained above, I'm suggesting (2), not (3).
[...]
So, why can't def foo(spam: (int, str)) mean that spam is an iterable of an int and a str, in exactly the same way that the assignment statement means that a and b are assigned the result of unpacking the iterable returned by zip when called with c and d?
But a, b = zip(c, d) requires that there be exactly two elements, not some unspecified number.
And spam:(int, str) requires that there be exactly two elements (and that the first be an int and the second a str), not some unspecified number. How is that any different?
Ah, that's what I didn't understand. I thought you meant an iterable of either ints or strs, without requiring a fixed number of them.
Right. The "paradigm case" for tuples is as fixed-length, heterogeneous collections, where each index has a specific semantic meaning (and therefore type). The problem is that, at least in Python, tuples are _also_ used as general-purpose immutable sequences -- and, in a few cases, that's specifically enshrined in syntax (e.g., the exception types in an except statement) or builtins (e.g., the arguments to str.__mod__), so we can't just ignore that.

My suggestion is that (int, str) means the first (what you'd call int * str if you wanted your language to look more like type theory than something a normal human would write), so Tuple[int] works exactly like every other generic type: it's a homogeneous collection of ints.
To me, spam:(int, str) has a natural interpretation that spam can be either an int or a str, not an Iterable or Sequence or even a tuple.
OK, I see the parallel there with exception statements now that you mention it.
But almost anywhere else in Python, a comma-separated list is a sequence of values, targets, parameters, etc., not a disjunction. The obvious way to spell what you want here is "int | str"
Which mypy spells as Union[ ].
Yes, but multiple people in this thread have suggested spelling it as int | str, nobody's objected, and both Jukka and Guido have given at least tentative assent. (I realize there's a whole lot of messages to read through and remember. And I could easily have missed an argument against this syntax just as you missed the suggestions for it.)
Guido van Rossum schrieb am 17.08.2014 um 07:03:
I'd like to summarize the main issues that have come up. As an experiment, I'm not changing the subject, but I am still not quoting anything in particular. Only two issues (or issue clusters) really seem contentious:
(1) Should the function annotation syntax (eventually) be reserved for type annotations in a standard syntax? Or can multiple different uses of annotations coexist? And if they can, how should a specific use be indicated? (Also, some questions about compile-time vs. run-time use.)
Regarding (1), I continue to believe that we should eventually reserve annotations for types, to avoid confusing both humans and tools, but I think there's nothing we have to do in 3.5 -- 3.5 must preserve backward compatibility, and we're not proposing to give annotations any new semantics anyway -- the actual changes to CPython are limited to a new stdlib module (typing) and some documentation.
As I mentioned before, there is more than one kind of type, even if we stick to reserving annotations for type declarations. That's why Cython currently supports these four ways of type annotations (in addition to its own non-Python way with "cdef"):

    x: dict
    x: {"type": dict}
    x: {"type": "dict"}
    x: {"ctype": "long double"}

The latter three can also be combined, so you could declare a C type for Cython compilation and a Python type for your IDE and other static Python analysis tools, e.g.

    x: {"type": int, "ctype": "size_t"}

Note that this also helps at a documentation level. The expected input is a Python int, but in fact it's restricted to a C size_t by the native implementation.

I'd still vote for allowing the simpler "x: dict" as well for cases where it's the only annotation. It's easy enough to switch to the explicit notation when you want to add a second (potentially non-type) annotation.

So, rather than "reserving" annotations for type declarations, I vote for making type annotations the default, but allowing other annotations by putting them into a dict that gives each annotation a string name. That name could be a module name in the stdlib or on PyPI, for example.
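Since annotations are ordinary expressions, Stefan's dict-keyed scheme needs no language changes; a tool just inspects __annotations__. A sketch of how a consumer might read it (the python_type helper name is illustrative, not part of Cython or any proposal):

```python
# Reading dict-based annotations of the kind Stefan describes:
# a bare type is taken as-is, a dict is keyed by annotation kind
# ("type", "ctype", ...).
def python_type(annotation):
    """Return the Python-level type from an annotation, if any."""
    if isinstance(annotation, dict):
        return annotation.get("type")
    return annotation

def f(x: {"type": int, "ctype": "size_t"}, y: dict):
    pass

anns = f.__annotations__
assert python_type(anns["x"]) is int    # Python view of x
assert anns["x"]["ctype"] == "size_t"   # C view of x, for Cython
assert python_type(anns["y"]) is dict   # plain annotation unchanged
```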
(2) For type annotations, should we adopt (roughly) the mypy syntax or the alternative proposed by Dave Halter? This uses built-in container notations as a shorthand, e.g. {str: int} instead of Dict[str, int]. This also touches on the issue of abstract vs. concrete types (e.g. iterable vs. list).
I talked to him at EP14 and we agreed that the simpler syntax looks tempting. However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way. I always thought that the ABCs were made for that, but so far everyone seemed to think that we need something different again. I'm happy to see that your preference also goes in that direction now. Having yet another typing module seems like an unnecessary duplication of the type system. Stefan
Stefan Behnel wrote:
However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way.
Just had a thought -- does mypy provide a way to express a type that supports more than one protocol? E.g. can you say that something must be both Iterable and Hashable? -- Greg
On Aug 17, 2014, at 2:33 AM, Greg Ewing
Stefan Behnel wrote:
However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way.
Just had a thought -- does mypy provide a way to express a type that supports more than one protocol? E.g. can you say that something must be both Iterable and Hashable?
That would be union types.

Current syntax: Union[Iterable[int], Hashable]
Proposed more concise syntax: Iterable[int] | Hashable

-- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Aug 17, 2014, at 2:44 AM, Łukasz Langa
On Aug 17, 2014, at 2:33 AM, Greg Ewing
wrote: Stefan Behnel wrote:
However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way.
Just had a thought -- does mypy provide a way to express a type that supports more than one protocol? E.g. can you say that something must be both Iterable and Hashable?
That would be union types. Current syntax: Union[Iterable[int], Hashable]
Proposed more concise syntax: Iterable[int] | Hashable
Ah, scratch that. What Greg asked about would be Iterable[int] & Hashable ;-) Don't know if this is supported but makes sense. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Sun, Aug 17, 2014 at 02:44:32AM -0700, Łukasz Langa wrote:
On Aug 17, 2014, at 2:33 AM, Greg Ewing
wrote: Stefan Behnel wrote:
However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way.
Just had a thought -- does mypy provide a way to express a type that supports more than one protocol? E.g. can you say that something must be both Iterable and Hashable?
That would be union types. Current syntax: Union[Iterable[int], Hashable]
I don't think so. The mypy tutorial says:

    Use the Union[...] type constructor to construct a union type. For example, the type Union[int, str] is compatible with both integers and strings. You can use an isinstance check to narrow down the type to a specific type ...

which implies that x: Union[a, b] means:

    isinstance(x, (a, b))

rather than:

    isinstance(x, a) and isinstance(x, b)

which I think is what Greg is asking for. -- Steven
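[Editor's sketch] Steven's reading can be checked directly at runtime with the collections ABCs; the helper names here are illustrative, not from mypy:

```python
from collections.abc import Hashable, Iterable

def satisfies_union(x):
    # Union semantics: the value matches *either* ABC ("or").
    return isinstance(x, (Iterable, Hashable))

def satisfies_both(x):
    # What Greg is asking for: the value matches *both* ("and").
    return isinstance(x, Iterable) and isinstance(x, Hashable)

print(satisfies_union([1, 2]))  # True: a list is Iterable
print(satisfies_both([1, 2]))   # False: lists are not Hashable
print(satisfies_both("abc"))    # True: str is both
```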
On Sun, Aug 17, 2014 at 09:33:13PM +1200, Greg Ewing wrote:
Stefan Behnel wrote:
However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way.
Just had a thought -- does mypy provide a way to express a type that supports more than one protocol? E.g. can you say that something must be both Iterable and Hashable?
mypy has Union[Iterable, Hashable], but that would mean anything iterable, or anything hashable, but not necessarily both at the same time. I don't see anything that says it must support both, but I've only gone through the tutorial: http://mypy-lang.org/tutorial.html -- Steven
On Aug 17, 2014, at 2:33, Greg Ewing
Stefan Behnel wrote:
However, it does not support protocols, so it still needs something that allows us to say Iterable(int) in some way.
Just had a thought -- does mypy provide a way to express a type that supports more than one protocol? E.g. can you say that something must be both Iterable and Hashable?
The obvious way that already works (with MyPy, and also with ABCs for isinstance checking):

    class HashableIterable(Iterable, Hashable):
        pass

    def spam(a: HashableIterable):
        pass

But there's no reason typing.py couldn't add a wrapper that does this automatically:

    class Multiple:
        @staticmethod
        def __getitem__(*types):
            return type(types[0])(
                '_'.join(t.__name__ for t in types), types, {})

    def spam(a: Multiple[Iterable, Hashable]):
        pass

And, on analogy with the proposal for | as a shortcut for Union, the base class could add:

    def __and__(self, other):
        return Multiple[self, other]

    def spam(a: Iterable & Hashable):
        pass
On Aug 17, 2014, at 3:53 AM, Andrew Barnert
The obvious way that already works (with MyPy, and also with ABCs for isinstance checking):
class HashableIterable(Iterable, Hashable): pass
def spam(a: HashableIterable): pass
No. Specifying a subclass of Iterable and Hashable cannot possibly mean that *any* type that is both Iterable and Hashable is behaving like your subclass.
    >>> from collections.abc import Iterable, Hashable
    >>> class IterableHashable(Iterable, Hashable):
    ...     pass
    ...
    >>> isinstance("str", Iterable)
    True
    >>> isinstance("str", Hashable)
    True
    >>> isinstance("str", IterableHashable)
    False
We would need a new construct like Union, maybe named AnySubclass, that checks True when all its base classes are in a given type's MRO. Some ABCs don't explicitly appear in a type's MRO but I fixed this problem with singledispatch. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
Andrew Barnert wrote:
The obvious way that already works (with MyPy, and also with ABCs for isinstance checking):
class HashableIterable(Iterable, Hashable): pass
def spam(a: HashableIterable): pass
That way might be obvious, but it's wrong. It says that spam() only takes an instance of the specific class HashableIterable or a subclass thereof. It won't accept anything else, even if it implements Iterable and Hashable perfectly well.
But, there's no reason typing.py couldn't add a wrapper that does this automatically:
    class Multiple:
        @staticmethod
        def __getitem__(*types):
            return type(types[0])(
                '_'.join(t.__name__ for t in types), types, {})
Automating it won't make it any more right. There needs to be a distinct concept such as Intersection() as a counterpart to Union(). -- Greg
On Sunday, August 17, 2014 5:04 PM, Greg Ewing
wrote:
Andrew Barnert wrote: The obvious way that already works (with MyPy, and also with ABCs for isinstance checking):
class HashableIterable(Iterable, Hashable): pass
def spam(a: HashableIterable): pass
That way might be obvious, but it's wrong. It says that spam() only takes an instance of the specific class HashableIterable or a subclass thereof. It won't accept anything else, even if it implements Iterable and Hashable perfectly well.
typing.Iterable and typing.Hashable are both typing.Protocols, so HashableIterable is also a typing.Protocol, so, if I understand Protocol correctly, isinstance will return True iff every method and attribute in HashableIterable (that is, every method and attribute in Iterable, plus every method and attribute in Hashable) is implemented by the type. Which is what we want here, right? That obviously doesn't work for nominal rather than structural ABCs, or for ad-hoc structural ABCs like the ones in collections.abc that don't automatically compose, but isn't that the whole point of Protocol, to provide structural subtyping that follows the rules of structural rather than nominative subtyping?
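[Editor's sketch] If Protocol behaves the way Andrew describes, composing two structural protocols by inheritance does give an "and" check. A minimal sketch using typing.Protocol as it was eventually standardized in PEP 544 (the 2014 prototype's Protocol differed in details; `HashableIterable` is an illustrative name):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class HashableIterable(Protocol):
    # Structural check: an object matches iff it has *both* methods.
    def __iter__(self): ...
    def __hash__(self) -> int: ...

print(isinstance("str", HashableIterable))  # True: str has __iter__ and __hash__
print(isinstance(5, HashableIterable))      # False: int has no __iter__
```

Unlike the ABC-subclass approach shown earlier, this follows structural rather than nominative subtyping: any object with the right methods matches, with no registration or inheritance required.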
But, there's no reason typing.py couldn't add a wrapper that does
this automatically:
    class Multiple:
        @staticmethod
        def __getitem__(*types):
            return type(types[0])(
                '_'.join(t.__name__ for t in types), types, {})
Automating it won't make it any more right. There needs to be a distinct concept such as Intersection() as a counterpart to Union().
-- Greg _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
Andrew Barnert wrote:
typing.Iterable and typing.Hashable are both typing.Protocols, so HashableIterable is also a typing.Protocol, so, if I understand Protocol correctly, isinstance will return True iff every method and attribute in HashableIterable (that is, every method and attribute in Iterable, plus every method and attribute in Hashable) is implemented by the type.
Urg. In that case, how do you spell a type that *is* a concrete class that implements Hashable and Iterable, rather than a new protocol? There's too much magic going on here for my liking. -- Greg
Guido van Rossum
A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float].
That's the meaning of a tuple data structure, to me.
But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type.
Why? What real-world uses are there, where a list won't do the job adequately? I have encountered many uses of “homogeneous, variable-length sequence” and every time a Python tuple is used for that, I perceive a Python list would be better precisely *because* it better indicates that semantic meaning. I'd like to know how you think that's not true, and what real-world code makes you think so. -- \ “Contentment is a pearl of great price, and whosoever procures | `\ it at the expense of ten thousand desires makes a wise and | _o__) happy purchase.” —J. Balguy | Ben Finney
On 8/17/2014 4:23 AM, Ben Finney wrote:
Guido van Rossum
writes: A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float].
That's the meaning of a tuple data structure, to me.
But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type.
There are also fixed-length homogeneous structures, like points.
Why? What real-world uses are there, where a list won't do the job adequately?
Variable-length homogeneous tuples are part of Python syntax in multiple places. Tuples can be hashed, put in sets, and used as dict keys; lists cannot. Tuple constants are calculated just once when the code is compiled (and typically saved as .pyc). -- Terry Jan Reedy
Terry Reedy
On 8/17/2014 4:23 AM, Ben Finney wrote:
Guido van Rossum
writes: A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float].
That's the meaning of a tuple data structure, to me.
But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type.
There are also fixed-length homogeneous structures, like points.
I assume you mean where the sequence is something like ‘(x, y, z)’ where each position has a meaning specific to that position. That's not homogeneous as I understood this usage, because the positions are not homogeneous in meaning. The number 7.03 has a very different meaning in the first, second, or third position. A homogeneous sequence would imply there are deliberately *no* specific meanings to each position. The value 7.03 would have the same semantic value at any position in the sequence. So a fixed-length sequence where each position implies a special meaning is a heterogeneous sequence, and I agree that's an excellent use for a tuple.
Why? What real-world uses are there, where a list won't do the job adequately?
Variable-length homogeneous tuples are part of Python syntax in multiple places.
Tuples can be hashed, put in sets, and used as dict keys; lists cannot. Tuple constants are calculated just once when the code is compiled (and typically saved as .pyc).
Okay, so these are not for the semantic purpose a tuple is for. I think a putative “frozenlist” type is best for that, to keep the heterogeneous implication of a tuple separate from the homogeneous implication of a list. -- \ “I prayed for twenty years but received no answer until I | `\ prayed with my legs.” —Frederick Douglass, escaped slave | _o__) | Ben Finney
On Sun, Aug 17, 2014 at 3:53 AM, Ben Finney
A homogeneous sequence would imply there are deliberately *no* specific meanings to each position.
That's a set. A sequence implies order matters in some way.
The value 7.03 would have the same semantic value at any position in the sequence.
Not at all. The 7 implies a units value, the 0 is a tenths value and the 3 is the hundredths. Those are different semantics. In Roman numerals before they invented subtraction, XVI = VIX so order did not matter. Roman numerals were basically sets of numbers to be added. Here's another: (KentuckyDerby, PreaknessStakes, BelmontStakes) -- these three objects are all instances of the class StakesRace and the order is significant. --- Bruce Learn how hackers think: http://j.mp/gruyere-security
On Sun, Aug 17, 2014 at 3:23 AM, Ben Finney
I have encountered many uses of “homogeneous, variable-length sequence” and every time a Python tuple is used for that, I perceive a Python list would be better precisely *because* it better indicates that semantic meaning.
Agreed. While "variable-length" doesn't imply mutability, they often seem to go hand-in-hand. Skip
On Sun, Aug 17, 2014 at 1:23 AM, Ben Finney
I have encountered many uses of “homogeneous, variable-length sequence” and every time a Python tuple is used for that, I perceive a Python list would be better precisely *because* it better indicates that semantic meaning.
I'd like to know how you think that's not true, and what real-world code makes you think so.
isinstance is real world code that for the second parameter accepts types and (recursively) tuples of any length of things it accepts. It doesn't accept lists, because then it would need to check for cycles. -- Devin
On Sun, Aug 17, 2014 at 4:44 AM, Devin Jeanpierre
On Sun, Aug 17, 2014 at 1:23 AM, Ben Finney
wrote: I have encountered many uses of “homogeneous, variable-length sequence” and every time a Python tuple is used for that, I perceive a Python list would be better precisely *because* it better indicates that semantic meaning.
I'd like to know how you think that's not true, and what real-world code makes you think so.
isinstance is real world code that for the second parameter accepts types and (recursively) tuples of any length of things it accepts.
It doesn't accept lists, because then it would need to check for cycles.
I was going to leave it at that -- the fact that a builtin works this way is an argument in favor of any standardized typing syntax supporting it -- but maybe it deserves justification, too. The key question is, given two values A and B that are valid second parameters to isinstance, how do you combine them into one valid second parameter, which forms the union of A and B? If A and B were always tuples, you could do A + B (isinstance predates sets). But we want to be able to do isinstance(x, SomeType), so A and B can't always be tuples. So Python adopts the convention that you can do (A, B) and, no matter what they are, isinstance(x, (A, B)) holds if and only if (isinstance(x, A) or isinstance(x, B)). As it happens we could use frozensets instead, but, oops. Given that frozensets are out of the question (in this case because they postdate the API; in other cases because of unhashability or whatever), using sequences seems reasonable, and specifically using tuples so that we can avoid cycle checks (a messy algorithm) also seems entirely reasonable. I've also seen people use tuples to emphasize that the sequence is not mutated, but there's no reason to require that from a type signature in that case. -- Devin
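[Editor's sketch] The convention Devin describes can be mimicked in pure Python (a hypothetical helper, not the real C implementation):

```python
def matches(obj, spec):
    """Like isinstance(): spec is a type or a nested tuple of specs."""
    if isinstance(spec, tuple):
        # Recurse over the tuple; a tuple of types means "any of these".
        # Per the discussion above, tuples (unlike lists) let the real
        # implementation skip cycle checking.
        return any(matches(obj, s) for s in spec)
    return isinstance(obj, spec)

print(matches(3, (str, (bytes, int))))    # True: int found in the nested tuple
print(matches(3.0, (str, (bytes, int))))  # False: no match anywhere
```

The real isinstance() accepts exactly this shape of nested tuple for its second argument.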
On 17 August 2014 21:44, Devin Jeanpierre
On Sun, Aug 17, 2014 at 1:23 AM, Ben Finney
wrote: I have encountered many uses of “homogeneous, variable-length sequence” and every time a Python tuple is used for that, I perceive a Python list would be better precisely *because* it better indicates that semantic meaning.
I'd like to know how you think that's not true, and what real-world code makes you think so.
isinstance is real world code that for the second parameter accepts types and (recursively) tuples of any length of things it accepts.
There are a few other cases where tuples are special cased as arguments: - str.__mod__ - str.startswith (ditto for binary sequences) - str.endswith (ditto for binary sequences)
    >>> "aa".endswith(['a', 'b', 'c'])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: endswith first arg must be str or a tuple of str, not list
Searching the C files for "tuple of" turned up a couple more: * N-dimensional indexing also relies specifically on tuples-of-integers, rather than arbitrary iterators. * dynamic type creation expects to receive the bases as a tuple * the decimal module uses tuples of digits for internal data representations And that inspired recollection of several other cases where mutability would be wrong, because the tuple represents cached information rather than dynamic state: * various "*args" related APIs use "tuple of object" or "tuple of thing" (e.g. attributes of partial objects, internal storage in contextlib.ExitStack when used with arbitrary callbacks) * other introspection related APIs use tuples to report information about inspected objects * namedtuple _fields attributes are a tuple of strings * BaseException.args publishes the full args tuple passed to the constructor str.startswith, str.endswith, isinstance and issubclass use the "implied or" interpretation, everything else does not. In most cases, the immutability conveys relevant semantic information (usually indicating that it's a read-only API). Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
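[Editor's sketch] The "implied or" cases Nick lists are easy to check interactively; the exact TypeError wording is CPython's and may vary across versions:

```python
# A tuple of suffixes means "any of these"...
print("spam.txt".endswith((".txt", ".md")))   # True
print("spam.txt".endswith(".md"))             # False

# ...but a list is rejected outright.
try:
    "spam.txt".endswith([".txt", ".md"])
except TypeError:
    print("TypeError: lists are not accepted")
```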
I think we're conflating multiple problems here.
Sometimes we use tuple to mean a homogeneous, arbitrary-length, immutable sequence, and other times we use it to mean a heterogeneous, fixed-length sequence. Nick's list demonstrates that the former is (a) common enough to worry about, and (b) not always a mistake.
But even if such uses were always a mistake, Python is obviously not going to add a frozenlist, with new display syntax, and change over all existing misuses of tuple (which would clearly require a full deprecation cycle) before adding static typing. So, we'd still need a way to statically type both uses.
And even if the homogeneous uses had never existed in the first place, a heterogeneous tuple would still not be a parametric type in the same sense as all of the other collections, the io classes, etc. So, we'd still need to distinguish it syntactically from all of the other generics.
So, arguing about whether we need to handle heterogeneous tuples specially is pointless; the only question is how we do it.
I can think of four possibilities:
1. Use a tuple of types to mean a tuple of types: (int, str).
This is exactly what we already do in isinstance, the except statement, etc. And it's how Swift, C++, D, and other languages specify a tuple of types (although Swift can also use a product).
This definitely would be potentially confusing if we went with the obiwan-inspired syntax of [str] for lists (or iterables or mutable sequences) of str, but since I think Guido has pretty conclusively argued against that syntax on other grounds, this isn't a problem.
This probably implies that function types should be written as Function[(str, int), int] instead of Function[[str, int], int], but I think MyPy already handles that, and I think it makes more sense anyway.
2. Use a product of types to mean a tuple: int*str.
Tuples really are just product types. This is how you specify them in type theory (and relational theory, and elsewhere). This is exactly how ML, Haskell, and other languages that designed their type systems carefully instead of haphazardly represent tuples of types. It also nicely parallels the suggestions for str|int for union types and str&int for multi-inherited subtypes.
The first big problem here is that there's no way to specify a tuple of one type. That's not a problem for theory-inspired languages, because in those languages, a tuple of one value is the same thing as that value, but that's obviously not true for Python. The fact that Python doesn't have an appropriate unit type (None is a value that people frequently want to use in tuples) is also at least a theoretical problem.
Also, unless we were going to change isinstance, except, etc. to accept (or require) this syntax instead of a tuple of types, I think it would be confusing to remember that you use a tuple of types in some places, a product of types in others. Unlike subscripting, this looks too similar, and too syntactic, to avoid confusion.
3. Use subscripting, but a different form of subscripting, as Steven suggested: Tuple[::int, str].
This isn't actually right; it doesn't mean Tuple[::(int, str)], but Tuple[(::int), str]. But let's assume it's possible to come up with a readable syntax that doesn't require parentheses and ignore that problem.
This implies a more general use of slicing in type subscription: the start is the type parameter, and the step is some other stuff to be used in a special way that's specific to that type. If we have other such uses, that would be a door worth leaving open, but I suspect we don't. (If we had dynamic/implicit named tuples, as people have suggested a few times, would that be relevant here?)
Also notice that this is still passing a tuple of types, it's just wrapping it up in a slice and then passing that to Tuple just so it can mark the tuple of types as actually meaning a tuple of types. Is that adding enough additional information to be worth all that additional verbosity and complexity?
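[Editor's sketch] Andrew's parsing point about option 3 is easy to verify: in `t[::int, str]` the slice binds only to the first item. A tiny probe class (hypothetical, for illustration) shows what `__getitem__` actually receives:

```python
class Subscript:
    def __getitem__(self, key):
        # Echo back whatever subscription key Python constructed.
        return key

s = Subscript()
# The slice covers only the first element, not the whole pair:
print(s[::int, str])  # (slice(None, None, <class 'int'>), <class 'str'>)
```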
4. Just use Tuple[int, str] and note that this is a special case that doesn't mean the same thing as other generic types.
It seems to me potentially very confusing to have Tuple[int] mean a homogeneous arbitrary-length immutable sequence of ints, while Tuple[int, str] means exactly one int and one str (and Tuple[int,] means exactly one int). You could argue that this confusion is inherent in Python's use of tuples for those two different cases in the first place, but we're still spreading that confusion further.
If there were no better alternatives, I think this might be better than Steven's suggestion (practicality beats purity, and his suggestion really doesn't remove that much confusion), but I think there are better alternatives—namely, the first one.
Where is it said that Tuple[int] is a homogeneous variable size list?
On Sunday, August 17, 2014 1:34 PM, Guido van Rossum
Where is it said that Tuple[int] is a homogeneous variable size list?
(I'm assuming you're referring to the homogeneity and arbitrary length here, not the fact that someone presumably said "list" when they meant "tuple", because otherwise the answer is trivial…)

First, that's how the current typing.py interprets it: Tuple[str] is a homogeneous, arbitrary-length (although of course unchanging, because it's immutable) tuple of strings.

Second, what else _would_ it mean? If List[str] and Set[str] mean homogeneous arbitrary-length lists and sets of strs, and the same goes for Iterable[str] and MutableSequence[str] and IO[str] and AnyStr[str] and every other example in typing.py, it would be pretty surprising if it weren't the same for Tuple[str].

Third, if it didn't mean that, how would you define the argument types of any of Nick's examples? For example:

    def isinstance(obj: object, types: type | Tuple[type]) -> bool: ...

That had better mean a homogeneous arbitrary-length tuple of types; if not, there doesn't seem to be any other way to declare its type.
I still think you are mistaken. I don't think mypy has a way to spell a homogeneous arbitrary-length tuple. All uses of Tuple[...] refer to "anonymous struct" tuples. I tried this:

    from typing import Tuple

    def f(a: Tuple[int]) -> None:
        pass

    def main() -> None:
        f((1,))
        f((1, 2))

and I get an error for the second call, f((1, 2)):

    a.py: In function "main":
    a.py, line 8: Argument 1 to "f" has incompatible type "Tuple[int, int]"; expected "Tuple[int]"
-- --Guido van Rossum (python.org/~guido)
I sent that a little too soon; I should add that I think this is the right
way; and that's why I keep suggesting a different way to spell a
homogeneous variable-length tuple. 1-tuples should not be special.
-- --Guido van Rossum (python.org/~guido)
On Aug 17, 2014, at 3:46 PM, Andrew Barnert
Second, what else _would_ it mean? If List[str] and Set[str] mean homogeneous arbitrary-length lists and sets of strs, and the same goes for Iterable[str] and MutableSequence[str] and IO[str] and AnyStr[str] and every other example in typing.py, it would be pretty surprising if it weren't the same for Tuple[str].
You're playing the uniformity card and usually I would agree with you. In this case, there is no uniformity between different *data structures*. Nobody expects to be able to say: int[int], dict[str], set[str: int]. Tuples were meant to be heterogeneous; Raymond draws that distinction in many classes and talks he gives. On the other hand, some types are hard to type heterogeneously, e.g. you can't reasonably specify an Iterable that yields strings except for the third yield, which is an int. All in all, I think it's not at all confusing to say that tuple[int, int] means a point, for instance (1, 1). That being said, we will need support for homogeneous tuples, too, simply because they are already in the wild. I proposed tuple[int, ...], which is explicit and obvious (if you're Polish, that is). -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Mon, Aug 18, 2014 at 9:35 AM, Łukasz Langa
That being said, we will need support for homogeneous tuples, too, simply because they are already in the wild. I proposed tuple[int, ...], which is explicit and obvious (if you're Polish, that is).
Conceptually, these kinds of constructs are sometimes called frozenlists. Why not actually create that type?

    frozenlist = tuple

Et voila. Now just define that frozenlist[int] is like list[int] rather than like tuple[int], and there you are, out of your difficulty at once! ChrisA
On Aug 17, 2014, at 4:40 PM, Chris Angelico
Et voila. Now just define that frozenlist[int] is like list[int] rather than like tuple[int], and there you are, out of your difficulty at once!
That's a neat trick and I think Guido suggested this naming before. It feels a little icky though, consider: - issubclass(tuple[int, int, int], frozenlist[int]) ? I think True. - issubclass(frozenlist[int], tuple[int, int, int]) ? I would think False. But because that's technically the same tuple underneath, *sometimes* instances of frozenlist[int] will respond True to isinstance(tuple[int, int, int]). Saying explicitly tuple[int, ...] takes that riddle away. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Mon, Aug 18, 2014 at 9:47 AM, Łukasz Langa
- issubclass(tuple[int, int, int], frozenlist[int]) ? I think True. - issubclass(frozenlist[int], tuple[int, int, int]) ? I would think False. But because that's technically the same tuple underneath, *sometimes* instances of frozenlist[int] will respond True to isinstance(tuple[int, int, int]).
I would say False to both of those, because they have different implications of intent. But it is arguable, so I'd be happy with the first one being True if it's easier to code that way. ChrisA
On Sun, Aug 17, 2014 at 4:47 PM, Łukasz Langa
On Aug 17, 2014, at 4:40 PM, Chris Angelico
wrote: Et voila. Now just define that frozenlist[int] is like list[int] rather than like tuple[int], and there you are, out of your difficulty at once!
That's a neat trick and I think Guido suggested this naming before. It feels a little icky though, consider:
- issubclass(tuple[int, int, int], frozenlist[int]) ? I think True. - issubclass(frozenlist[int], tuple[int, int, int]) ? I would think False. But because that's technically the same tuple underneath, *sometimes* instances of frozenlist[int] will respond True to isinstance(tuple[int, int, int]).
Saying explicitly tuple[int, ...] takes that riddle away.
Hm. That looks just as [tr]icky. Plus, I expect both pedants and ignoramuses may wonder about the empty tuple. :-) I think it deserves a proper name, not magic syntax. I do agree that frozentuple sounds odd, but perhaps we just need to get used to it. -- --Guido van Rossum (python.org/~guido)
On Sun, Aug 17, 2014 at 5:30 PM, Guido van Rossum
On Sun, Aug 17, 2014 at 4:47 PM, Łukasz Langa
wrote: That's a neat trick and I think Guido suggested this naming before. It feels a little icky though, consider:
- issubclass(tuple[int, int, int], frozenlist[int]) ? I think True. - issubclass(frozenlist[int], tuple[int, int, int]) ? I would think False. But because that's technically the same tuple underneath, *sometimes* instances of frozenlist[int] will respond True to isinstance(tuple[int, int, int]).
Saying explicitly tuple[int, ...] takes that riddle away.
Hm. That looks just as [tr]icky. Plus, I expect both pedants and ignoramuses may wonder about the empty tuple. :-) I think it deserves a proper name, not magic syntax.
I do agree that frozentuple sounds odd, but perhaps we just need to get used to it.
I've considered adding a mypy type for arbitrary-length tuples [1]. My original idea was to call it TupleSequence[T], since it's basically a concrete tuple that is used like a sequence. Fixed-length tuples can already be used as abstract iterables and sequences:

def f(x: Iterable[int]) -> None: pass

f((1, 2, 3))  # Ok

[1] https://github.com/JukkaL/mypy/issues/184 Jukka
-- --Guido van Rossum (python.org/~guido)
_______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/
On Aug 17, 2014, at 7:30 PM, Guido van Rossum
wrote: I do agree that frozentuple sounds odd, but perhaps we just need to get used to it.
I’d expect that most uses of Tuple[int] to mean an arbitrary length tuple of integers would be better served with Sequence[int] anyway, so I’d definitely find it odd for Tuple[int] to mean anything other than a tuple with exactly one integer in it. I could imagine somebody wanting to say that it could be a Sequence but NOT a MutableSequence. Perhaps that could be spelled: Sequence[int] - MutableSequence[int] and/or (Sequence - MutableSequence)[int]
On Sunday, August 17, 2014 6:26 PM, Ryan Hiebert
I’d expect that most uses of Tuple[int] to mean an arbitrary length tuple of integers would be better served with Sequence[int] anyway
It would be nice if that were true, but unfortunately, Python has a long history of using tuple, and only tuple, both in the core language and in the stdlib, specifically to mean an arbitrary-length homogeneous tuple in APIs or syntax that take a single value or tuple of values. I don't want to repeat the whole list that Nick Coghlan provided, but I would like to add that any third-party code that interacts with those parts of the syntax or stdlib has to do the same thing (e.g., an exception-logging decorator has to take an exception type or tuple of exception types to use in an except statement), and there's probably some third-party code that's done the same thing completely independently, just because it's endorsed by the stdlib.
On Aug 17, 2014, at 8:58 PM, Andrew Barnert
wrote: On Sunday, August 17, 2014 6:26 PM, Ryan Hiebert
wrote: I’d expect that most uses of Tuple[int] to mean an arbitrary length tuple of integers would be better served with Sequence[int] anyway
It would be nice if that were true, but unfortunately, Python has a long history of using tuple, and only tuple, both in the core language and in the stdlib, specifically to mean an arbitrary-length homogeneous tuple in APIs or syntax that take a single value or tuple of values.
I don't want to repeat the whole list that Nick Coghlan provided, but I would like to add that any third-party code that interacts with those parts of the syntax or stdlib has to do the same thing (e.g., an exception-logging decorator has to take an exception type or tuple of exception types to use in an except statement), and there's probably some third-party code that's done the same thing completely independently, just because it's endorsed by the stdlib.
Thanks. As I read more, I’m recalling some APIs that use tuples as special markers for iterables rather than, say, strings, so that there can be polymorphism based on the arguments, allowing both a string and a tuple. Iterable strings bite again ;-)
So we need a name for a type that's a tuple used as a sequence. How about Tuppence? -- Greg
Greg Ewing
So we need a name for a type that's a tuple used as a sequence.
Um. Isn't every tuple used as a sequence? So, the name “tuple” fits. Maybe you mean “used as a *homogeneous* sequence”. Or maybe you mean “used as a *variable-length* sequence”. Or something else? -- Ben Finney
On Aug 17, 2014, at 8:30 PM, Guido van Rossum
wrote: Plus, I expect both pedants and ignoramuses may wonder about the empty tuple. :-) I think it deserves a proper name, not magic syntax.
I agree, even though covering all cases in a consistent syntax does not seem hard:

- Tuple[int] - variable-length homogeneous tuple
- Tuple[int, float] - length 2
- Tuple[int,] - length 1
- Tuple[()] - length 0

Since length 1 and more so length 0 cases don't seem to get much use, a somewhat non-obvious syntax should be fine.
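One wrinkle with the Tuple[int,] spelling: whether a type can tell it apart from Tuple[int] depends entirely on how its __getitem__ treats a lone key versus a one-tuple. Python's subscription syntax does pass different objects for the two spellings, as this toy sketch shows (Demo is illustrative, not mypy's implementation):

```python
class Demo:
    # Subscription hands __class_getitem__ exactly what appears in brackets.
    @classmethod
    def __class_getitem__(cls, key):
        return key

assert Demo[int] is int        # bare key: just the class object
assert Demo[int,] == (int,)    # trailing comma: a one-tuple -- distinguishable
assert Demo[()] == ()          # the Tuple[()] spelling: the empty tuple
```

So the length-1 and length-0 spellings are mechanically distinguishable; the question is only whether readers will find them obvious.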
On Sunday, August 17, 2014 4:36 PM, Łukasz Langa
On Aug 17, 2014, at 3:46 PM, Andrew Barnert
wrote: Second, what else _would_ it mean? If List[str] and Set[str] mean homogeneous arbitrary-length lists and sets of strs, and the same goes for Iterable[str] and MutableSequence[str] and IO[str] and AnyStr[str] and every other example in typing.py, it would be pretty surprising if it weren't the same for Tuple[str].
You're playing the uniformity card and usually I would agree with you. In this case, there is no uniformity between different *data structures*.
There is uniformity between every single generic data structure defined by MyPy except Tuple. That makes Tuple an exceedingly special special case. And its specialness is almost invisible. I went through typing.py in more detail, and I'm pretty sure there is nothing there to indicate that the Tuple = TypeAlias(tuple) line is going to do anything different than all of the other TypeAlias calls. Whatever difference there is must be in the private, implementation-specific code.
Nobody expects to be able to say: int[int], dict[str], set[str: int].
But int and dict are not sequences; tuple is.
Tuples were meant to be heterogeneous; Raymond draws that distinction in many classes and talks he gives.
Sure. But as Nick pointed out, there are places all over even the core language and the stdlib where they're used homogeneously. You can argue that was a mistake, but you can't just wish it away.
All in all, I think it's not at all confusing to say that tuple[int, int] means a point, for instance (1, 1).
I think it's far less confusing to say that (int, int) is the type of (1, 1). And other languages agree:

$ ghci
Prelude> :set +t
Prelude> (1, 1)
(1,1)
it :: (Integer, Integer)
Prelude> ^D
Leaving GHCi.

$ xcrun swift
1> (1, 1)
$R0: (Int, Int) = {
  0 = 1
  1 = 1
}
2> ^D

Those languages don't try to make heterogeneous tuples look like parametric collection types. Why should Python?
Andrew Barnert wrote:
First, that's how the current typing.py interprets it: Tuple[str] is a homogeneous, arbitrary-length (although of course unchanging, because it's immutable) tuple of strings.
So how do you spell a heterogeneous tuple of length 1 containing a string? -- Greg
On Sun, Aug 17, 2014 at 11:19 PM, Greg Ewing
Andrew Barnert wrote:
First, that's how the current typing.py interprets it: Tuple[str] is a homogeneous, arbitrary-length (although of course unchanging, because it's immutable) tuple of strings.
So how do you spell a heterogeneous tuple of length 1 containing a string?
Tuple[str] :-) The only arbitrary-length tuple that mypy knows about is just 'tuple' and it isn't generic or necessarily homogeneous (it's dynamically typed, basically a catch-all for a completely arbitrary tuple). However, arbitrary-length, homogeneous tuples would be a nice addition, once we figure out how the type should be named. Jukka
-- Greg
On Sun, Aug 17, 2014 at 01:34:36PM -0700, Guido van Rossum wrote:
Where is it said that Tuple[int] is a homogeneous variable size list?
Did you mean variable-sized tuple rather than list? The mypy tutorial says the opposite: http://www.mypy-lang.org/tutorial.html#tuples which implies that Tuple[int] would accept (23,) but not (23, 42).

There are two, or three, cases to consider:

Variable-sized tuple of some homogeneous type
- e.g. (1,), (1, 2)
- Tuple[int] for consistency with other types
- mypy doesn't appear to support this

Fixed-size tuple of given heterogeneous types
- e.g. (23, "abc"), (42, "xyz")
- mypy uses Tuple[int, str] which is a special case

Some way to specify two or more types, e.g.
- (str, int) for consistency with isinstance, issubclass
- Union[str, int] as used by mypy
- str|int obvious short-cut for Union
- str*int will make type theorists happy

Making Tuple[] a special case troubles me; I strongly prefer Tuple[int] to mean a homogeneous tuple of ints. mypy already has Union[str, int] for union types (giving str|int as the obvious short-cut), which leaves (str, int) for the heterogeneous 2-tuple case. Perhaps in 3.6 we can consider allowing isinstance and issubclass to accept Union types as well as tuples of types. -- Steven
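For reference, the tuple-of-types convention that (str, int) would mirror is already how isinstance and issubclass behave today:

```python
# A tuple of types already means "any of these" to isinstance/issubclass --
# exactly the behaviour a Union type would need to parallel.
assert isinstance(23, (str, int))
assert isinstance("abc", (str, int))
assert not isinstance(2.5, (str, int))
assert issubclass(bool, (str, int))  # bool is a subclass of int
```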
On Aug 17, 2014, at 4:16 PM, Steven D'Aprano
Perhaps in 3.6 we can consider allowing isinstance and issubclass to accept Union types as well as tuples of types.
Why not 3.5? -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Sun, Aug 17, 2014 at 04:19:16PM -0700, Łukasz Langa wrote:
On Aug 17, 2014, at 4:16 PM, Steven D'Aprano
wrote: Perhaps in 3.6 we can consider allowing isinstance and issubclass to accept Union types as well as tuples of types.
Why not 3.5?
In case Guido changes his mind :-) -- Steven
On Sunday, August 17, 2014 4:17 PM, Steven D'Aprano
There's two, or three, cases to consider:
Variable-sized tuple of some homogeneous type - e.g. (1,), (1, 2) - Tuple[int] for consistency with other types - mypy doesn't appear to support this
Are you sure? In [typing.py](https://github.com/JukkaL/mypy/blob/master/lib-typing/3.2/typing.py#L17), Tuple is defined as TypeAlias(tuple), exactly the same way List is defined as TypeAlias(list). And I don't see any code anywhere else in that module that adds the special-casing to it. So, isn't this what MyPy is already doing? Or is there some hidden functionality outside of typing.py that changes it? Anyway, I agree with you that, whatever it _currently_ means in MyPy, it _should_ mean a tuple of an arbitrary number of int values.
Fixed-size tuple of given heterogeneous types - e.g. (23, "abc"), (42, "xyz") - mypy uses Tuple[int, str] which is a special case
This is the one I'd like to write as (int, str).
Some way to specify two or more types, e.g. - (str, int) for consistency with isinstance, issubclass - Union[str, int] as used by mypy - str|int obvious short-cut for Union - str*int will make type theorists happy
That last one will not make type theorists happy. A product of two types is a tuple (your case #2, not #3). What you want is a sum or union of two types, which you'd write str|int, str U int, or maybe str+int.
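The distinction, sketched in plain Python: a product type carries both components at once, while a sum (union) type carries exactly one of its alternatives, and code must branch to find out which:

```python
# Product type: a (str, int) value holds BOTH a str and an int.
record = ("spam", 42)

# Sum/union type: a str|int value holds EITHER a str or an int.
def describe(value):  # conceptually value: Union[str, int]
    if isinstance(value, str):
        return "str of length %d" % len(value)
    return "int %d" % value

assert describe("spam") == "str of length 4"
assert describe(42) == "int 42"
```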
Making Tuple[] a special case troubles me, I strongly prefer Tuple[int]
to mean a homogeneous tuple of ints.
mypy already has Union[str, int] for union types (giving str|int as the obvious short-cut), which leaves (str, int) for the hetrogenous 2-tuple case.
Perhaps in 3.6 we can consider allowing isinstance and issubclass to accept Union types as well as tuples of types.
-- Steven
On Sun, Aug 17, 2014 at 6:05 PM, Andrew Barnert < abarnert@yahoo.com.dmarc.invalid> wrote:
On Sunday, August 17, 2014 4:17 PM, Steven D'Aprano
wrote: There's two, or three, cases to consider:
Variable-sized tuple of some homogeneous type - e.g. (1,), (1, 2) - Tuple[int] for consistency with other types - mypy doesn't appear to support this
Are you sure?
In [typing.py]( https://github.com/JukkaL/mypy/blob/master/lib-typing/3.2/typing.py#L17), Tuple is defined as TypeAlias(tuple), exactly the same way List is defined as TypeAlias(list). And I don't see any code anywhere else in that module that adds the special-casing to it.
So, isn't this what MyPy is already doing? Or is there some hidden functionality outside of typing.py that changes it?
There is -- typing.py is not the typechecker, it's just a bunch of dummies crafted so that your code can also be executed by vanilla Python 3.2. (If you look at the definition of TypeAlias a few lines earlier, it has almost no semantics, and you can see that it is also used for wildly different types, e.g. Function and Union.)
Anyway, I agree with you that, whatever it _currently_ means in MyPy, it _should_ mean a tuple of an arbitrary number of int values.
I disagree. (I've said that before, but there seems to be a large delay between our exchanges.) The Foo[bar, ...] notation has no inherent semantics -- compare for example Dict[str, int], Tuple[str, int], and Union[str, int]. The most useful way to define Tuple[T1, T2, ..., Tn] is to use it for an n-tuple whose element types are T1 etc.
Fixed-size tuple of given heterogeneous types - e.g. (23, "abc"), (42, "xyz") - mypy uses Tuple[int, str] which is a special case
This is the one I'd like to write as (int, str).
I know that's your proposal, but it seems to blind you for the other position. -- --Guido van Rossum (python.org/~guido)
: On Sun, Aug 17, 2014 at 12:31:19PM -0700, Andrew Barnert wrote:
I think we're conflating multiple problems here.
Sometimes we use tuple to mean a homogeneous, arbitrary-length, immutable sequence, and other times we use it to mean a heterogeneous, fixed-length sequence. Nick's list demonstrates that the former is (a) common enough to worry about, and (b) not always a mistake.
[...]
I can think of four possibilities:
Since we're already repurposing a number of operators - TypeA[TypeB], TypeA|TypeB and possibly TypeA&TypeB - here's a fifth possibility:

Leave Tuple[int] with the same semantic meaning as List[int], Set[int] etc., and use e.g. Tuple%(int, str, float) to declare the signature of a fixed-length, heterogeneous sequence (borrowing from the use of % in string interpolation).

This has the advantages that it a) doesn't make tuple a special case, and so allows other container types to use it (including user-defined types which aren't tuples, but are intended to behave similarly to them), and b) is both reasonably compact, and easily distinguished from Tuple[int].

The specific operator used doesn't really matter - when this idea came to me I thought of @ (since it's shiny and new in 3.5), but % seems a better conceptual fit. -[]z. -- Zero Piraeus: unus multorum http://etiol.net/pubkey.asc
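A minimal sketch of how the % spelling could hang off a type alias (TypeAlias here is a toy stand-in for illustration, not mypy's class of the same name):

```python
class TypeAlias:
    """Toy alias supporting both Tuple[int] and Tuple % (int, str, float)."""
    def __init__(self, target, homogeneous=None, fixed=None):
        self.target = target
        self.homogeneous = homogeneous  # element type for Tuple[int]
        self.fixed = fixed              # per-slot types for Tuple % (...)

    def __getitem__(self, item):
        # Tuple[int]: arbitrary length, same semantics as List[int]
        return TypeAlias(self.target, homogeneous=item)

    def __mod__(self, items):
        # Tuple % (int, str, float): fixed-length heterogeneous signature
        return TypeAlias(self.target, fixed=tuple(items))

Tuple = TypeAlias(tuple)
assert Tuple[int].homogeneous is int
assert (Tuple % (int, str, float)).fixed == (int, str, float)
```

Any user-defined container could reuse the same protocol, which is the uniformity argument the proposal makes.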
Guido van Rossum wrote:
Perhaps a thornier issue is how mypy should handle decorators that manipulate the signature or annotations of the function they wrap. But I think the only reasonable answer here can be that mypy must understand what decorators do if it wants to have any chance at type-checking decorated functions.
Seems to me the only way to do that in general is to execute the decorators. That means importing everything the decorators depend on and probably running at least the top-level module code. Is executing arbitrary code at type-checking time really desirable? -- Greg
On Sun, Aug 17, 2014 at 2:22 AM, Greg Ewing
Guido van Rossum wrote:
Perhaps a thornier issue is how mypy should handle decorators that
manipulate the signature or annotations of the function they wrap. But I think the only reasonable answer here can be that mypy must understand what decorators do if it wants to have any chance at type-checking decorated functions.
Seems to me the only way to do that in general is to execute the decorators. That means importing everything the decorators depend on and probably running at least the top-level module code. Is executing arbitrary code at type-checking time really desirable?
Nah, you just have to type-check the decorators. :-) -- --Guido van Rossum (python.org/~guido)
On Aug 16, 2014, at 10:03 PM, Guido van Rossum
I'd like to summarize the main issues that have come up. As an experiment, I'm not changing the subject, but I am still not quoting anything in particular. Only two issues (or issue clusters) really seem contentious:
(1) Should the function annotation syntax (eventually) be reserved for type annotations in a standard syntax? Or can multiple different uses of annotations coexist? And if they can, how should a specific use be indicated? (Also, some questions about compile-time vs. run-time use.)
Consider what Stefan Behnel is proposing: using function annotations for types by default, but, in the presence of a dictionary, searching for the type in the 'type' key. This is very nice: it provides a way to be concise where possible, and generic where needed. My suggestion: we should support that.
All in all I prefer the mypy syntax, despite being somewhat more verbose and requiring an import, with one caveat: I agree that it would be nicer if the mypy abstract collection types were the same objects as the ABCs exported by collections.abc.
Good :) If the functionality will be implemented in the ABCs, what is the purpose of the typing module? My suggestion: if the functionality will be implemented in the ABCs, there's no need to introduce the "typing" module. We can back-port the new ABCs, for sure, but for Python 3.5 `collections` is enough (it already has aliases to collections.abc).
I'm not quite sure whether we should also change the concrete collection types from List, Dict, Set, Tuple to list, dict, set, tuple; the concrete types are so ubiquitous that I worry that there may be working code out there that somehow relies on the type objects themselves not being subscriptable.
While unlikely, such code can exist in the wild. That being said, I think builtins should support the one-obvious-way-to-do-it syntax for generics, if only for uniformity. Please note that there can also be code in the future that relies on type objects not supporting binary-or. We will still need to add it, though: a type union of (int | str) or (str | None) will be a common thing.

My suggestion: add __getitem__ and __or__/__ror__ to both builtins and ABCs.

Steven D'Aprano touches on an interesting point: list[int] will be tempting for users because Iterable[int] is both longer and requires an import. We could extend PEP 8 to talk about typing and how people should think about introducing hints, but I think Steven is generally right: list[int] will sadly win. The real reason we'll see list[T] everywhere is that Iterable[str] matches str and Sequence[str] matches str. Whoever tries and fails to use those abstract types to specify a collection of strings, but *not* a single string, will migrate to concrete data types. And that's such a common use case! AFAIK, there is no good workaround at the moment. The solution for that, which we sadly can't implement, would be to make strings non-iterable.

My suggestion: two new ABCs, let me temporarily call them StrictIterable and StrictSequence. Those would return False for issubclass(str, ...).

My other suggestion: deprecate iterating over strings (and possibly bytes, too?). I'm not saying "remove", but just officially say "this was a bad idea, don't use it".
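What the __getitem__/__or__ suggestion would look like mechanically, via a metaclass (the tags returned here are placeholders for whatever generic/union objects the real design would produce):

```python
class TypeMeta(type):
    # Sketch only: a real implementation would return proper alias objects,
    # not tuples tagged with strings.
    def __getitem__(cls, params):
        return ("generic", cls, params)
    def __or__(cls, other):
        return ("union", cls, other)
    def __ror__(cls, other):
        return ("union", other, cls)

class MyList(metaclass=TypeMeta):
    pass

assert MyList[int] == ("generic", MyList, int)     # list[int]-style syntax
assert (MyList | str) == ("union", MyList, str)    # (int | str)-style syntax
```

For the builtins themselves this would have to live in the C-level type object rather than a metaclass, but the protocol is the same.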
A mostly unrelated issue: there are two different uses of tuples, and we need a notation for both. One is a tuple of fixed length with heterogeneous, specific types for the elements; for example Tuple[int, float]. But I think we also need a way to indicate that a function expects (or returns) a variable-length tuple with a homogeneous element type. Perhaps we should call this type frozenlist, analogous to frozenset (and it seems there's a proposal for frozendict making the rounds as well).
On one hand, not being able to represent variable-length homogeneous tuples in the type hints will be a strong signal that this usage of tuples is disputable. On the other hand, we might *need* to support this to be compatible with existing framework code in the wild. frozenlists would be a nice, explicit solution for that, but sadly they'd be a new collection, so none of today's code returning/accepting var-length tuples would use them.

My suggestion: tuple[int, ...]

For uniformity, we would also accept this form for other iterables, I suppose. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
2014-08-17, 13:41, Łukasz Langa
On Aug 16, 2014, at 10:03 PM, Guido van Rossum
wrote: All in all I prefer the mypy syntax, despite being somewhat more verbose and requiring an import, with one caveat: I agree that it would be nicer if the mypy abstract collection types were the same objects as the ABCs exported by collections.abc.
Good :) If the functionality will be implemented in the ABCs, what is the purpose of the typing module?
My suggestion: if the functionality will be implemented in the ABCs, there's no need to introduce the "typing" module. We can back-port the new ABCs, for sure, but for Python 3.5 `collections` is enough (it already has aliases to collections.abc).
-1 for collections.abc classes, +1 for mypy's typing classes. There is a problem with static analysis of current types that are instances of abc.ABCMeta, or types that just define their own __instancecheck__ / __subclasscheck__: static analyzers cannot infer, in the general case, which attributes of an instance/subclass these methods check, because their body can be arbitrarily complex. Mypy's typing.Protocol subclasses are much easier to analyze statically, since they are required to explicitly define abstract methods as function definitions inside the class body. Current collections.abc classes do define their methods explicitly too, so it seems that at least these classes are fine. But their inheritors don't have to do it; they may just override __subclasshook__. And promoting abc.ABCMeta-based ABCs would mean that not all ABCs can be used as static type annotations. -- Andrey Vlasovskikh Web: http://pirx.ru/
On Sun Aug 17 2014 at 6:36:43 AM Andrey Vlasovskikh < andrey.vlasovskikh@gmail.com> wrote:
2014-08-17, 13:41, Łukasz Langa
wrote: On Aug 16, 2014, at 10:03 PM, Guido van Rossum
wrote: All in all I prefer the mypy syntax, despite being somewhat more verbose and requiring an import, with one caveat: I agree that it would be nicer if the mypy abstract collection types were the same objects as the ABCs exported by collections.abc.
Good :) If the functionality will be implemented in the ABCs, what is the purpose of the typing module?
My suggestion: if the functionality will be implemented in the ABCs, there's no need to introduce the "typing" module. We can back-port the new ABCs, for sure, but for Python 3.5 `collections` is enough (it already has aliases to collections.abc).
-1 for collections.abc classes, +1 for mypy's typing classes.
There is a problem in static analysis of current types that are instances of abc.ABCMeta or types that just define their own __instancecheck__ / __subclasscheck__. Static analyzers cannot infer in general case what attributes of an instance / subclass do these methods check, because their body can be arbitrarily complex.
That's only an issue if the type-checking code chooses to care about __instancecheck__/__subclasscheck__. The tool could choose to simply ignore those methods and treat them as a run-time only benefit for isinstance/issubclass checks but not for type checking. This is especially true if the check is being done on the AST instead of imported code. People can simply be told that their linter tool will not pick up magical __instancecheck__/__subclasscheck__ implementations. -Brett
On Aug 17, 2014, at 8:37 AM, Brett Cannon
On Sun Aug 17 2014 at 6:36:43 AM Andrey Vlasovskikh
wrote: There is a problem in static analysis of current types that are instances of abc.ABCMeta or types that just define their own __instancecheck__ / __subclasscheck__. Static analyzers cannot infer in general case what attributes of an instance / subclass do these methods check, because their body can be arbitrarily complex. That's only an issue if the type-checking code chooses to care about __instancecheck__/__subclasscheck__. The tool could choose to simply ignore those methods and treat them as a run-time only benefit for isinstance/issubclass checks but not for type checking. This is especially true if the check is being done on the AST instead of imported code. People can simply be told that their linter tool will not pick up magical __instancecheck__/__subclasscheck__ implementations.
+1 -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
On Aug 17, 2014, at 3:35 AM, Andrey Vlasovskikh
There is a problem in static analysis of current types that are instances of abc.ABCMeta or types that just define their own __instancecheck__ / __subclasscheck__. Static analyzers cannot infer in general case what attributes of an instance / subclass do these methods check, because their body can be arbitrarily complex.
That's right. Moreover, arbitrary classes can be register()'ed on an ABC to respond to subclass and instance checks during runtime. Meta-classes and __new__ can do surprising things with returned class objects, too. In all cases Mypy would generate a false-positive type error. That's fine, we can improve on that in multiple ways. [1]

People will put classes with ABCMeta in function annotations whether we design for it or not. Subclassing a MutableMapping is the easiest way to implement the whole protocol. As you say, __instancecheck__ and friends aren't limited to ABCs. People will put classes implementing those in function annotations, too. People will have classes with elaborate metaclasses as well. Most importantly: often those classes will come from libraries and frameworks that those people didn't write themselves. We can't expect them to open every black box to see if it works with type hinting. Lastly, classes may and will evolve, sometimes to grow more static and sometimes to become more dynamic.

What I'm saying is that the typing module does not shield you from any of it. We need to define type hinting as a best-effort solution to set reasonable expectations. You need to understand the additional cognitive burden a new module like 'typing' would impose on us and our users.
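The register() behaviour is easy to demonstrate with a stdlib ABC; nothing in the registered class's source reveals that the check will pass:

```python
from collections.abc import Sized

class Opaque:
    pass  # no __len__, no inheritance from Sized anywhere

Sized.register(Opaque)  # runtime-only virtual subclassing

# True at runtime, yet invisible to any purely static analysis --
# and registration doesn't actually make len() work, of course.
assert issubclass(Opaque, Sized)
assert isinstance(Opaque(), Sized)
```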
Mypy's typing.Protocol subclasses are much easier to analyze statically, since they are required to explicitly define abstract methods as function definitions inside the class body.
[1] My suggestions:

- ABCMeta is special and we can reasonably improve on the runtime .register() invocations by scanning the source for those
- a redefined __instancecheck__ would generate a warning
- we can define a way for a Python program to say: this module is safe to be imported independently for executing __subclasscheck__ and __subclasshook__ checks on its classes
- failing all of the above, this is a perfectly fine case for optional runtime type checking; I don't think anybody has the expectation that we will statically analyse their highly dynamic codebases

All the above suggestions might be incremental. It's totally fine if we just begin with the last one. -- Best regards, Łukasz Langa WWW: http://lukasz.langa.pl/ Twitter: @llanga IRC: ambv on #python-dev
Andrey Vlasovskikh wrote:
There is a problem in static analysis of current types that are instances of abc.ABCMeta or types that just define their own __instancecheck__ / __subclasscheck__. Static analyzers cannot infer in general case what attributes of an instance / subclass do these methods check, because their body can be arbitrarily complex.
However, I'd be worried about having two very similar but subtly different sets of type objects floating around. That just seems like a recipe for massive confusion. -- Greg
On Sat, Aug 16, 2014 at 10:03:48PM -0700, Guido van Rossum wrote:
I'd like to summarize the main issues that have come up. As an experiment, I'm not changing the subject, but I am still not quoting anything in particular. Only two issues (or issue clusters) really seem contentious:
(1) Should the function annotation syntax (eventually) be reserved for type annotations in a standard syntax?
Reserved, as in alternatives are prohibited? No. But assumed to be types by default? I think so.
Or can multiple different uses of annotations coexist? And if they can, how should a specific use be indicated? (Also, some questions about compile-time vs. run-time use.)
Determining whether annotations should be interpreted as type annotations or not needs to be possible both at compile-time and run-time. I suggest a special decorator, imported from the typing module:

from typing import skip_types  # or some better name?

@skip_types(marker)  # see below for the purpose of marker
@another_decorator   # the order of decorators doesn't matter
def function(x:"this is not a type")->[int]:
    ...

At compile-time, static typing tools should treat any function decorated by skip_types as if it were dynamic. The skip_types decorator also writes marker to the __annotations__ dict, using a key which cannot be used as an identifier. Say, "+mark+". The purpose of the marker is so that other tools can determine whether the annotations are aimed at them.

def handle_annotations(func):
    if func.__annotations__.get("+mark+", None) is MyMarker:
        do_stuff(func.__annotations__)
    else:
        pass

Any time a function is not decorated by skip_types (or whatever name it has), or doesn't have that "+mark+" key in the __annotations__ dict, static type checking tools are entitled to assume the annotations are used for typing.
(2) For type annotations, should we adopt (roughly) the mypy syntax or the alternative proposed by Dave Halter? This uses built-in container notations as a shorthand, e.g. {str: int} instead of Dict[str, int]. This also touches on the issue of abstract vs. concrete types (e.g. iterable vs. list).
I prefer the mypy syntax.
Regarding (1), I continue to believe that we should eventually reserve annotations for types, to avoid confusing both humans and tools, but I think there's nothing we have to do in 3.5 -- 3.5 must preserve backward compatibility, and we're not proposing to give annotations any new semantics anyway -- the actual changes to CPython are limited to a new stdlib module (typing) and some documentation.
Perhaps a thornier issue is how mypy should handle decorators that manipulate the signature or annotations of the function they wrap. But I think the only reasonable answer here can be that mypy must understand what decorators do if it wants to have any chance at type-checking decorated functions.
Hmmm. Anything the decorators do to the annotations will be at runtime, so is it reasonable to say that the static typing tool will only operate on the annotations available at compile time? That is, given:

@mangle_annotations
def spam(x:int)->List[str]:
    ...

the type checker is expected to use the annotations seen at compile-time, no matter what the mangle_annotations decorator happens to do at run-time. Otherwise, the type-checker needs to be a full Python interpreter, in order to see what mangle_annotations does. And that could be an intractable problem:

def mangle_annotations(func):
    if random.random() < 0.5:
        func.__annotations__['x'] = List[str]
    else:
        func.__annotations__['x'] = float
    return func

I don't see how any type-checker is supposed to take that into account.

(1) Compile-time type checkers should ignore the decorator and just use the annotations available at compile-time;
(2) Run-time analysis tools should use the annotations available at run-time;
(3) If they happen to be different, oh well, consenting adults.

-- Steven
On Sun, Aug 17, 2014 at 8:20 PM, Steven D'Aprano
Hmmm. Anything the decorators do to the annotations will be at runtime, so is it reasonable to say that the static typing tool will only operate on the annotations available at compile time?
That is, given:
@mangle_annotations
def spam(x:int)->List[str]: ...
the type checker is expected to use the annotations seen at compile-time, no matter what the mangle_annotations decorator happens to do at run-time.
Otherwise, the type-checker needs to be a full Python interpreter, in order to see what mangle_annotations does. And that could be an intractable problem:
def mangle_annotations(func):
    if random.random() < 0.5:
        func.__annotations__['x'] = List[str]
    else:
        func.__annotations__['x'] = float
I don't see how any type-checker is supposed to take that into account.
You give an example of a malicious mangling, but more significant is the naive mangling - wrapping the decorated function in a non-annotated outer function, without using functools.wraps() or equivalent (I'm sure it'd be possible to propagate the annotations through wraps(), so that would take care of a lot of cases).

IMO the right handling here is to completely ignore all unrecognized decorators, on the assumption that most decorators should be returning an "equivalently usable" function. I don't, for instance, see real-world examples of decorators that add extra parameters to a function, even though it would be plausible (maybe you have a whole bunch of functions that all take an optional mode parameter, which causes other arguments to be translated automatically by the decorator?). If you're annotating the function, the type checker can assume that that's intended to be correct.

ChrisA
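Chris's parenthetical about propagating annotations through wraps() in fact already holds: in Python 3, functools.wraps() copies __annotations__ along with __name__ and __doc__, so a well-behaved wrapper keeps the declared types visible. A short sketch (the decorator and function names are hypothetical):

```python
import functools

def logged(func):
    @functools.wraps(func)          # copies __annotations__ among others
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

@logged
def greet(name: str) -> str:
    return 'Hello, ' + name

# The wrapper still exposes the original annotations.
print(greet.__annotations__)
print('__annotations__' in functools.WRAPPER_ASSIGNMENTS)   # True
```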
Chris Angelico wrote:
I don't, for instance, see real-world examples of decorators that add extra parameters to a function, even though it would be plausible
Some decorators, such as property(), don't return a function at all. Ignoring the decorator in that case would give completely the wrong idea.

I'm now thinking the right thing to do with decorators is to analyse them statically if possible, otherwise treat the result as untyped. If a decorator is well-behaved, type inference should be able to propagate the types of the decorated function through to the result. If not, all bets are off.

-- Greg
On Mon, Aug 18, 2014 at 9:43 AM, Greg Ewing
Chris Angelico wrote:
I don't, for instance, see real-world examples of decorators that add extra parameters to a function, even though it would be plausible
Some decorators, such as property(), don't return a function at all. Ignoring the decorator in that case would give completely the wrong idea.
So you don't annotate the function that handles the property. Simple! There may need to be some other handling of it (telling mypy that this property will always be a str, for instance), but the function's arguments aren't significant to the type system, so they don't need annotations. Same is true of anything else that decorates the function away altogether. ChrisA
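Greg's property() case, concretely: after decoration the class attribute is a property object, not a function, though the getter's annotation remains reachable via fget. A sketch with a hypothetical class:

```python
class Invoice:                       # hypothetical example class
    def __init__(self):
        self._total = 0

    @property
    def total(self) -> int:
        return self._total

# The decorated name is a property object, not a function...
print(type(Invoice.total))           # <class 'property'>
# ...but the getter's annotation is still there, via fget.
print(Invoice.total.fget.__annotations__)
```

A checker that wanted to type the property could look at fget rather than treating the attribute as a callable.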
2014-08-17, 9:03, Guido van Rossum
My first concern is that these expressions are only unambiguous in the context of function annotations. I want to promote the use of type aliases, and I think in general a type alias should behave similarly to an ABC. In particular, I think that any object used to represent a type in an annotation should itself be a type object (though you may not be able to instantiate it), and e.g. [int] doesn't satisfy that requirement. Without this, it would be difficult to implement isinstance() and issubclass() for type aliases -- and while we could special-case lists, sets and dicts, using a tuple *already* has a meaning!
Having type annotations as type objects sounds good. The fact that we can use isinstance() and issubclass() for all type annotations would provide some level of compatibility between static type checking and potential dynamic type checking: if "x: <type-expr>" then "isinstance(x, <type-expr>)". Note that not all of mypy's type annotations are currently type objects; probably this should be fixed.

-- Andrey Vlasovskikh
Web: http://pirx.ru/
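The requirement that type expressions be real type objects is what lets isinstance() and issubclass() compose with them; the abstract container types in collections.abc already behave this way, unlike an ad-hoc literal such as [int]:

```python
from collections.abc import Iterable, Sequence

# Abstract container types are genuine type objects, so the usual
# runtime checks work with them directly.
print(isinstance([1, 2, 3], Iterable))   # True
print(issubclass(list, Sequence))        # True
print(isinstance(42, Iterable))          # False
```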
On Sun, Aug 17, 2014 at 12:33 AM, Guido van Rossum
(2) For type annotations, should we adopt (roughly) the mypy syntax or the alternative proposed by Dave Halter? This uses built-in container notations as a shorthand, e.g. {str: int} instead of Dict[str, int]. This also touches on the issue of abstract vs. concrete types (e.g. iterable vs. list).
Maybe we've been missing that in most programming languages the sub-language for describing types is separate from the sub-language for describing algorithms. It is so in Haskell, and in C, and in LISP, and many others. It should be OK if similar constructs mean different things in each sub-language, as the necessary symbol reuse (if one omits APL) comes mostly from the IBM keyboard and ASCII. Personally, I'd like to type less, as is most often the case with current Python. Beyond that, I agree it is important that the adopted syntax does not mislead. Cheers, -- Juancarlo *Añez*
On Tue, Aug 19, 2014 at 2:18 PM, Juancarlo Añez
On Sun, Aug 17, 2014 at 12:33 AM, Guido van Rossum
wrote: (2) For type annotations, should we adopt (roughly) the mypy syntax or the alternative proposed by Dave Halter? This uses built-in container notations as a shorthand, e.g. {str: int} instead of Dict[str, int]. This also touches on the issue of abstract vs. concrete types (e.g. iterable vs. list).
Maybe we've been missing that in most programming languages the sub-language for describing types is separate from the sub-language for describing algorithms. It is so in Haskell, and in C, and in LISP, and many others.
It should be OK if similar constructs mean different things in each sub-language, as the necessary symbol reuse (if one omits APL) come mostly from the IBM keyboard and ASCII.
Personally, I'd like to type less, as is most often the case with current Python. Beyond that, I agree it is important that the adopted syntax does not mislead.
Actually, Python has a long tradition of reusing the same sub-language for things that are different sub-languages in other languages. Starting with 'int' being callable, in fact. :-) -- --Guido van Rossum (python.org/~guido)
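For instance, the same name serves as both the type and its constructor:

```python
# 'int' is simultaneously a type and the conversion function.
print(int('42'))                    # 42
print(isinstance(int('42'), int))   # True: same name in a type context
# str, list, dict behave the same way.
print(list('ab'))                   # ['a', 'b']
```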
On Tue, Aug 19, 2014 at 7:15 PM, Guido van Rossum
Actually, Python has a long tradition of reusing the same sub-language for things that are different sub-languages in other languages. Starting with 'int' being callable, in fact. :-)
I love the consistency of Python's type system. It's intuitive (difficult to get wrong). Maybe that's why I'm one of the fearful ones regarding this move towards formalizing type annotations. BTW, today I took the time to read Mr. C.D. Smith's (old) article about type systems, and I found it relevant, illustrative, smart, and entertaining. May I remind the list about it? I will: http://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/ -- Juancarlo *Añez*
On Sun, Aug 17, 2014 at 3:08 AM, Steven D'Aprano
I don't think this is a shining example of the value of static typing, at least not by default. As I see it, you would get something like this:
def __init__(self, description:str, sec_code:str,
        vendor_name:str, vendor_inv_num:str,
        vendor_rtng:str, vendor_acct:str,
        transaction_code:str, vendor_acct_type:str,
        amount:int, payment_date:Any)->None:
which may not give you much additional value. In this case, I think that the static checks will add nothing except (perhaps) allow you to forgo writing a few isinstance checks. You still have to check that the strings are the right length, and so on.
But if you're willing to invest some time creating individual str subclasses, you can push the length checks into the subclass constructor, and write something like this:
def __init__(self, description:Str10, sec_code:SecurityCode,
        vendor_name:Str22, vendor_inv_num:Str15,
        vendor_rtng:Str9, vendor_acct:Str17,
        transaction_code:ACH_ETC, vendor_acct_type:VendorAcctType,
        amount:Pennies, payment_date:DateABC)->None:
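One way the Str10/Str22-style classes above might be written -- the names are hypothetical, with the length check pushed into __new__ so that an invalid value can never be constructed:

```python
class BoundedStr(str):
    """A str subclass that rejects values longer than max_len (hypothetical)."""
    max_len = None

    def __new__(cls, value):
        if cls.max_len is not None and len(value) > cls.max_len:
            raise ValueError('%s accepts at most %d characters, got %d'
                             % (cls.__name__, cls.max_len, len(value)))
        return super().__new__(cls, value)

class Str10(BoundedStr):
    max_len = 10

class Str22(BoundedStr):
    max_len = 22

print(Str10('short'))   # fine: still an ordinary str everywhere else
# Str10('definitely too long')  would raise ValueError at construction
```

Because each subclass is still a str, existing code that expects plain strings keeps working; only construction is restricted.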
I know that the BDFL has spoken on this issue and said that he finds all of this readable and "pythonic", but these examples perfectly capture what I am going to dislike about this syntax as it becomes popular. I suppose it will be better when I am reading it in an editor that has syntax highlighting, but as it stands I had to stare at that block of code for a long time to see how many and what type of arguments it called. At first I thought you had one per line, then I thought you had variable numbers per line. On about the fourth or fifth reading, I saw you had two per line.

The problem with the syntax (I think) is that it relies on readers spotting characters like ":" and "[", characters which change how the eye should parse the line (assuming this is going to be optional). I find that those characters get lost very easily in long function definitions, leaving the reader having to read and re-read the block to answer questions like:

1. How many arguments are there?
2. Are any of them keyword arguments? Are they all the same type?
3. What are their names?

In the example above, my eye keeps wanting to tell me that one of them is called SecurityCode, for example, even though I know that is the name of a class.

This all seems unpythonic to me. Most of python's syntax is expressed in words rather than compact symbols. My fear with all of this is that it turns python into a language that is harder for humans to read. I much prefer the PyCharm docstring approach, because the eye can scan the function signature quickly and then the brain can say, "Ah - 10 arguments, oh, and I see that they have to be particular types and the return code is specified."

To put it another way, current python function signatures are immediately intuitive even to someone who is unfamiliar with the language. There is nothing intuitive about this. It is more like looking at Objective-C or similar. At root, I don't totally understand what is "Pythonic" about function signatures.
On the other hand, more expert people than me seem to like the above, and so I am sure that I am missing something. Perhaps it is simply the DRY principle. On the other hand, I am sure that readability issues are not simply a matter of personal taste. I've just re-read Tog on Interface, and perhaps that is colouring my thought! Anyone who is at all dyslexic is, I think, going to struggle! PEP8 should probably specify one argument per line if this kind of syntax is going to be at all re-readable.

However, I do accept that the BDFL has spoken, and I'll "get with the program"! I'm sure I'll get used to it.

N.
On Sun, Aug 17, 2014 at 08:41:33AM +0100, Nicholas Cole wrote:
On Sun, Aug 17, 2014 at 3:08 AM, Steven D'Aprano
wrote:
[...]

def __init__(self, description:Str10, sec_code:SecurityCode,
        vendor_name:Str22, vendor_inv_num:Str15,
        vendor_rtng:Str9, vendor_acct:Str17,
        transaction_code:ACH_ETC, vendor_acct_type:VendorAcctType,
        amount:Pennies, payment_date:DateABC)->None:
I know that the BDFL has spoken on this issue and said that he finds all of this readable and "pythonic", but these examples perfectly capture what I am going to dislike about this syntax as it becomes popular.
Even though I am in favour of the proposal, I do sympathise, and I see what you mean. But, I think it is important to realise that a method with ten arguments (plus self) is not going to be exactly readable at the best of times.
I suppose it will be better when I am reading it in an editor that has syntax highlighting but as it stands I had to stare at that block of code for a long time to see how many and what type of arguments it called. At first I thought you had one per line, then I thought you had variable numbers per line. On about the fourth or fifth reading, I saw you had two per line.
Look for the commas :-) But yes, as given that makes a big wall of text. I suppose it will take some time for people to decide what formatting works best for them. It might help to align the arguments in columns (even though that goes against PEP-8):

def __init__(self,
             description:      Str10,
             sec_code:         SecurityCode,
             vendor_name:      Str22,
             vendor_inv_num:   Str15,
             vendor_rtng:      Str9,
             vendor_acct:      Str17,
             transaction_code: ACH_ETC,
             vendor_acct_type: VendorAcctType,
             amount:           Pennies,
             payment_date:     DateABC,
             ) -> None:

That works for me.
My fear with all of this is that it turns python into a language that is harder for humans to read.
Declaring types in the function parameter list is very common, in many languages. If it were *that* much harder to read, languages wouldn't keep using it. (Not many languages follow Forth or APL syntax.) Perhaps because I learned to program in Pascal, I find the annotation syntax very easy to read, but, yes, anything which increases the density of information per line risks hurting readability a little. It's a tradeoff, and of course all of this is optional.

Syntax highlighting will help, and I expect that in a few years' time emacs and vim will have some way to hide annotations when editing code :-)

-- Steven
On 17 August 2014 18:34, Steven D'Aprano
Declaring types in the function parameter list is very common, in many languages. If it were *that* much harder to read, languages wouldn't keep using it. (Not many languages follow Forth or APL syntax.) Perhaps because I learned to program in Pascal, I find the annotation syntax very easy to read, but, yes, anything which increases the density of information per line risks hurting readability a little.
I once had the "pleasure" of inheriting some code written in K&R style C, where the parameter type declarations were separate from the signature line:

void foo(a, b, c)
double a;
char b;
{
    ...
}

ANSI C, with inline typing, is far more readable :)

When it comes to the readability of function headers with lots and lots of parameters... I'm in the "those are inherently unreadable, even if sometimes an unfortunate necessity" camp :)

Reorganising-the-subprocess-module-docs-was-interesting'ly,
Nick.

-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On Sun, Aug 17, 2014 at 9:50 AM, Nick Coghlan
On 17 August 2014 18:34, Steven D'Aprano
wrote: Declaring types in the function parameter list is very common, in many languages. If it were *that* much harder to read, languages wouldn't keep using it. (Not many languages follow Forth or APL syntax.) Perhaps because I learned to program in Pascal, I find the annotation syntax very easy to read, but, yes, anything which increases the density of information per line risks hurting readability a little.
I once had the "pleasure" of inheriting some code written in K&R style C, where the parameter type declarations were separate from the signature line:
void foo(a, b, c)
double a;
char b;
{
    ...
}
ANSI C, with inline typing, is far more readable :)
I think you've put your finger on it. It comes down to a disagreement over density of information. You'd like everything in "one pass" as it were. The way my brain is wired, I read your example here as: "This function takes three (non-named) parameters, the a is a double, the b is a char." I find that faster to process than the inline alternative, especially when the alternative is optional. With meaningful parameter names it would be even easier. If I re-write your example:
void foo(double a, char b, c)
My brain takes an extra fraction of a second to count the number of arguments. Syntax highlighting would help, of course.

Some of this can be very subtle. For example, it's important for readability that in Python positional parameters are all specified, then keyword ones, so the brain doesn't have to keep switching backwards and forwards. In C, of course, everything has to be typed, and so I can see it makes sense to put it all inline. But what if you are mixing the two, and some are typed and some not? I think it is all going to get very dense and hard to read.

The same tension occurs in natural languages. I tend to write quite dense English prose, myself. My editors always want me to write less densely, and over time I've come to see that they are right! Dense prose is fine for the specialist, but it doesn't help the student or the casual reader.

N.
Nicholas Cole wrote:
On Sun, Aug 17, 2014 at 3:08 AM, Steven D'Aprano
wrote:

def __init__(self, description:str, sec_code:str,
        vendor_name:str, vendor_inv_num:str,
        vendor_rtng:str, vendor_acct:str,
        transaction_code:str, vendor_acct_type:str,
        amount:int, payment_date:Any)->None:
I had to stare at that block of code for a long time to see how many and what type of arguments it called.
Pascal's function signature syntax had a nice feature that everyone else seems to have forgotten about. If you had multiple parameters of the same type, you only had to write the type once:

procedure Init(description, sec_code, vendor_name,
               vendor_inv_num, vendor_rtng, vendor_acct,
               transaction_code, vendor_acct_type,
               amount: str;
               payment_date: Any)

Disappointingly, Python's annotations make the same blunder as C, and most other languages since, in requiring each parameter to have its own individual annotation.

-- Greg
Nimrod has that feature, too, which makes type lists easier on the eyes.
Greg Ewing
Nicholas Cole wrote:
On Sun, Aug 17, 2014 at 3:08 AM, Steven D'Aprano
wrote:

def __init__(self, description:str, sec_code:str,
        vendor_name:str, vendor_inv_num:str,
        vendor_rtng:str, vendor_acct:str,
        transaction_code:str, vendor_acct_type:str,
        amount:int, payment_date:Any)->None:
I had to stare at that block of code for a long time to see how many and what type of arguments it called.
Pascal's function signature syntax had a nice feature that everyone else seems to have forgotten about. If you had multiple parameters of the same type, you only had to write the type once:
procedure Init(description, sec_code, vendor_name,
               vendor_inv_num, vendor_rtng, vendor_acct,
               transaction_code, vendor_acct_type,
               amount: str;
               payment_date: Any)
Disappointingly, Python's annotations make the same blunder as C, and most other languages since, in requiring each parameter to have its own individual annotation.
-- Greg
Greg Ewing schrieb am 17.08.2014 um 12:31:
Nicholas Cole wrote:
On Sun, Aug 17, 2014 at 3:08 AM, Steven D'Aprano wrote:
def __init__(self, description:str, sec_code:str,
        vendor_name:str, vendor_inv_num:str,
        vendor_rtng:str, vendor_acct:str,
        transaction_code:str, vendor_acct_type:str,
        amount:int, payment_date:Any)->None:
I had to stare at that block of code for a long time to see how many and what type of arguments it called.
Pascal's function signature syntax had a nice feature that everyone else seems to have forgotten about. If you had multiple parameters of the same type, you only had to write the type once:
procedure Init(description, sec_code, vendor_name,
               vendor_inv_num, vendor_rtng, vendor_acct,
               transaction_code, vendor_acct_type,
               amount: str;
               payment_date: Any)
Disappointingly, Python's annotations make the same blunder as C, and most other languages since, in requiring each parameter to have its own individual annotation.
The difference is that Pascal requires type declarations whereas they are purely optional in Python. That makes the case where they are "missing" the right thing to optimise for, i.e. they should be explicit where they are and not take away space where they are not.

Allowing argument sequences under a single type annotation would require some kind of marker for either that list or for the other arguments that are not typed. If you have a long argument list of, say, three positional arguments and ten optional keyword arguments, and you only want to annotate the first two positional arguments with types and leave the rest free, that's a lot nicer to express with two explicit type annotations than with grouped annotations and (potentially) explicit non-annotations.

Stefan
On 2014-08-17 15:37, Stefan Behnel wrote:
Greg Ewing schrieb am 17.08.2014 um 12:31:
Nicholas Cole wrote:
On Sun, Aug 17, 2014 at 3:08 AM, Steven D'Aprano wrote:
def __init__(self, description:str, sec_code:str,
        vendor_name:str, vendor_inv_num:str,
        vendor_rtng:str, vendor_acct:str,
        transaction_code:str, vendor_acct_type:str,
        amount:int, payment_date:Any)->None:
I had to stare at that block of code for a long time to see how many and what type of arguments it called.
Pascal's function signature syntax had a nice feature that everyone else seems to have forgotten about. If you had multiple parameters of the same type, you only had to write the type once:
procedure Init(description, sec_code, vendor_name,
               vendor_inv_num, vendor_rtng, vendor_acct,
               transaction_code, vendor_acct_type,
               amount: str;
               payment_date: Any)
Disappointingly, Python's annotations make the same blunder as C, and most other languages since, in requiring each parameter to have its own individual annotation.
The difference is that Pascal requires type declarations whereas they are purely optional in Python. That makes the case where they are "missing" the right thing to optimise for, i.e. they should be explicit where they are and not take away space where they are not.

Allowing argument sequences under a single type annotation would require some kind of marker for either that list or for the other arguments that are not typed. If you have a long argument list of, say, three positional arguments and ten optional keyword arguments, and you only want to annotate the first two positional arguments with types and leave the rest free, that's a lot nicer to express with two explicit type annotations than with grouped annotations and (potentially) explicit non-annotations.
I wonder whether you could include the colon but omit the type if it's the same as that of the following parameter:

def __init__(self, description:, sec_code:, vendor_name:,
        vendor_inv_num:, vendor_rtng:, vendor_acct:,
        transaction_code:, vendor_acct_type:str, amount:int,
        payment_date:Any)->None:
On Sun, Aug 17, 2014 at 8:31 PM, Greg Ewing
Disappointingly, Python's annotations make the same blunder as C, and most other languages since, in requiring each parameter to have its own individual annotation.
Python doesn't really have any option here, because a non-annotated parameter already has meaning. But even in C-family languages, I'm not sure that it's all that advantageous; it makes editing less clear, so it's really only beneficial when you have sets of related arguments (eg "int r,g,b" to specify a color).

With variable declarations, there's a difference between "int r,g,b; double x,y,z;", where the block of integers is terminated by a semicolon (and, conventionally, a line ending); in argument lists, you don't get that, so it's not as clear where one starts and one stops. (Imagine you misspell a type name. It's no longer a keyword. How will your mistake be reported?)

ChrisA
Chris Angelico wrote:
With variable declarations, there's a difference between "int r,g,b; double x,y,z;", where the block of integers is terminated by a semicolon ... in argument lists, you don't get that,
Python's argument annotation syntax *could* have been defined to work the Pascal way, with semicolons separating the groups. But it wasn't, and it's too late to change now.
(Imagine you misspell a type name. It's no longer a keyword. How will your mistake be reported?)
I don't understand that. Type names are always clearly separated from keywords in either style (they come after a colon), so there's no danger of confusing them. -- Greg
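On Chris's question about misspelled type names: because annotations are ordinary expressions evaluated when the def statement executes, a typo surfaces immediately as a NameError rather than going unreported. A sketch:

```python
# Annotations are evaluated at definition time, so a misspelled
# type name fails right away, not at some later call site.
try:
    def broken(x: Strr) -> None:   # 'Strr' is a typo for 'str'
        pass
    failed = False
except NameError:
    failed = True                  # reported here, at definition time

print('typo caught at def time:', failed)
```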
participants (34): Alexander Belopolsky, Andrew Barnert, Andrey Vlasovskikh, Antoine Pitrou, Ben Finney, Bill Winslow, Brett Cannon, Bruce Leban, Chris Angelico, Dave Halter, Dennis Brakhane, Devin Jeanpierre, Ethan Furman, Greg Ewing, Gregory P. Smith, Guido van Rossum, Haoyi Li, Juancarlo Añez, Jukka Lehtosalo, MRAB, Nicholas Cole, Nick Coghlan, Petr Viktorin, Ryan, Ryan Gonzalez, Ryan Hiebert, Skip Montanaro, Stefan Behnel, Steven D'Aprano, Sunjay Varma, Terry Reedy, Yann Kaiser, Zero Piraeus, Łukasz Langa