PEP 563 and 649: The Great Compromise
The heart of the debate between PEPs 563 and 649 is the question: what should an annotation be? Should it be a string or a Python value? It seems people who are pro-PEP 563 want it to be a string, and people who are pro-PEP 649 want it to be a value.

Actually, let me amend that slightly. Most people who are pro-PEP 563 don't actually care that annotations are strings, per se. What they want are specific runtime behaviors, and they get those behaviors when PEP 563 turns their annotations into strings.

I have an idea--a rough proposal--on how we can mix together aspects of PEP 563 and PEP 649. I think it satisfies everyone's use cases for both PEPs. The behavior it gets us:

* annotations can be stored as strings
* annotations stored as strings can be examined as strings
* annotations can be examined as values

The idea:

We add a new type of compile-time flag, akin to a "from __future__" import, but not from the future. Let's not call it "from __present__"; for now, how about "from __behavior__". In this specific case, we call it "from __behavior__ import str_annotations". It behaves much like Python 3.9 does when you say "from __future__ import annotations", except: it stores the dictionary with stringized values in a new member on the function/class/module called "__str_annotations__".

If an object "o" has "__str_annotations__" set, you can access it and see the stringized values.

If you access "o.__annotations__", and the object has "o.__str_annotations__" set but "o.__annotations__" is not set, it builds (and caches) a new dict by iterating over o.__str_annotations__, calling eval() on each value. It gets the globals() dict the same way that PEP 649 does (including, if you compile a class with str_annotations, it sets __globals__ on the class). It does /not/ unset "o.__str_annotations__" unless someone explicitly sets "o.__annotations__".
This is so you can write your code assuming that "o.__str_annotations__" is set, and it doesn't explode if somebody somewhere ever looks at "o.__annotations__". (This could lead to them getting out of sync, if someone modified "o.__annotations__". But I suspect practicality beats purity here.)

This means:

* People who only want stringized annotations can turn it on, and only ever examine "o.__str_annotations__". They get the benefits of PEP 563: annotations don't have to be valid Python values at runtime, just parseable. They can continue doing the "if TYPE_CHECKING:" import thing.
* Library code which wants to examine values can examine "o.__annotations__". We might consider amending library functions that look at annotations to add a keyword-only parameter, "str_annotations=False", and if it's true it uses o.__str_annotations__ instead, etc.

Also, yes, of course we can keep the optimization where stringized annotations are stored as a tuple containing an even number of strings. Similarly to PEP 649's automatic binding of an unbound code object, if you set "o.__str_annotations__" to a tuple containing an even number of strings, and you then access "o.__str_annotations__", you get back a dict.

TBD: how this interacts with PEP 649. I don't know if it means we only do this, or if it would be a good idea to do both this and 649. I just haven't thought about it. (It would be a runtime error to set both "o.__str_annotations__" and "o.__co_annotations__", though.)

Well, whaddya think? Any good?

I considered calling this "PEP 1212", which is 563 + 649,

//arry/
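Since "__str_annotations__" is only a proposal, nothing below exists in any Python release; this is a rough sketch that simulates the proposed lazy build-and-cache behavior with an ordinary class (the attribute names and mechanics are hypothetical stand-ins for what the compiler and runtime would do):

```python
class StrAnnotations:
    """Simulates an object compiled under 'from __behavior__ import str_annotations'."""

    def __init__(self, str_annotations, globals_dict):
        self.__str_annotations__ = str_annotations   # stringized, as the compiler stored them
        self._globals = globals_dict                 # found the way PEP 649 finds globals
        self._cache = None

    @property
    def __annotations__(self):
        # Built (and cached) on first access by calling eval() on each string.
        if self._cache is None:
            self._cache = {name: eval(value, self._globals)
                           for name, value in self.__str_annotations__.items()}
        return self._cache

o = StrAnnotations({"x": "int", "return": "list[int]"}, globals())
print(o.__str_annotations__["x"])    # the string "int"
print(o.__annotations__["x"])        # the real value: <class 'int'>
```

Note that, as in the proposal, accessing "__str_annotations__" still works after "__annotations__" has been built; the two can drift apart only if someone mutates the cached dict.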
On Sat, Apr 17, 2021 at 8:45 PM, Larry Hastings (<larry@hastings.org>) wrote:
The heart of the debate between PEPs 563 and 649 is the question: what should an annotation be? [...]
How would this work with annotations that access a local scope?

    def f():
        class X: pass
        def inner() -> X: pass
        return inner

    f().__annotations__

From your description it sounds like it would fail, just like calling typing.get_type_hints() would fail on it today. If so, I don't see this as much better than the current situation: all it does is provide a builtin way of calling get_type_hints().
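Jelle's point is reproducible with PEP 563's behavior today: once the annotation is the string "X", any resolver that only has the function's globals (which is all typing.get_type_hints() has for a returned inner function) cannot recover the class that existed only in f()'s local scope. A small demonstration:

```python
def f():
    class X: pass
    def inner() -> "X": pass   # quoted here to mimic what PEP 563 stores
    return inner

g = f()
print(g.__annotations__)       # {'return': 'X'} -- just a string

try:
    # What a naive resolver (or get_type_hints) effectively does:
    eval(g.__annotations__["return"], g.__globals__)
except NameError:
    print("X is not visible: it lived only in f()'s local scope")
```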
Hi Larry, all,

I was thinking also of a compromise but a slightly different approach:

Store annotations as a subclass of string but with the required frames attached to evaluate them as though they were in their local context. Then have a function "get_annotation_values" that knows how to evaluate these string subclasses with the attached frames.

This would allow those who use runtime annotations to access local scope like PEP 649, and allow those who use static type checking to relax the syntax (as long as they don't try and evaluate the syntax at runtime) as per PEP 563.

E.g. this would work:

    def f():
        class X: pass
        def inner() -> X: pass
        return inner

    f().__annotations__       # Similar to PEP 563, values of dict look like strings
    get_annotation_values(f())  # Gets the same values as you would from accessing __annotations__ in PEP 649

Please ignore this idea if it doesn't make any sense; I'm working on limited technical knowledge of CPython.

On Sun, Apr 18, 2021 at 10:09 AM Jelle Zijlstra <jelle.zijlstra@gmail.com> wrote:
How would this work with annotations that access a local scope? [...]

_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-leave@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/SZA3BUYW...
Code of Conduct: http://python.org/psf/codeofconduct/
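Damian's string-subclass idea can be prototyped in today's Python. Everything below is a hypothetical sketch: the class name AnnotationStr, the plain scope dict standing in for an attached frame, and get_annotation_values are all illustrative inventions, not existing APIs:

```python
class AnnotationStr(str):
    """Hypothetical: a string annotation that remembers the scope it was written in."""
    def __new__(cls, text, scope):
        self = super().__new__(cls, text)
        self.scope = scope       # stand-in for the attached frame's locals
        return self

def get_annotation_values(obj):
    """Evaluate string annotations using their attached scope, if any."""
    return {name: eval(value, globals(), value.scope)
                  if isinstance(value, AnnotationStr) else value
            for name, value in obj.__annotations__.items()}

def f():
    class X: pass
    def inner() -> "X": pass
    # Simulate the compiler attaching the local scope to the stringized annotation:
    inner.__annotations__["return"] = AnnotationStr("X", {"X": X})
    return inner

g = f()
print(g.__annotations__["return"])          # looks like the string "X"
print(get_annotation_values(g)["return"])   # the real local class X
```

This shows why the idea answers Jelle's local-scope objection, though a real implementation would have to decide how long to keep those scopes alive.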
On 4/18/21 9:10 AM, Damian Shaw wrote:
Hi Larry, all, I was thinking also of a compromise but a slightly different approach:
Store annotations as a subclass of string but with the required frames attached to evaluate them as though they were in their local context. Then have a function "get_annotation_values" that knows how to evaluate these string subclasses with the attached frames.
This would allow those who use runtime annotations to access local scope like PEP 649, and allow those who use static type checking to relax the syntax (as long as they don't try and evaluate the syntax at runtime) as per PEP 563.
Something akin to this was proposed and discarded during the discussion of PEP 563, although the idea there was to still use actual Python bytecode instead of strings:

https://www.python.org/dev/peps/pep-0563/#keeping-the-ability-to-use-functio...

It was rejected because it would be too expensive in terms of resources. PEP 649's approach uses significantly fewer resources, which is one of the reasons it seems viable.

Also, I don't see the benefit of requiring a function like "get_annotation_values" to see the actual values. This would force library code that examines annotations to change; I think it's better that we preserve the behavior that "o.__annotations__" contains real values.

Cheers,

//arry/
I think there is a point to be made for requiring a function call to resolve annotations, in regard to the ongoing discussion about relaxing the annotation syntax (https://mail.python.org/archives/list/python-dev@python.org/message/2F5PVC5M...).

Type annotations are still a fast-moving topic compared to Python as a whole. Should the annotation syntax be relaxed and annotations be stored as strings, then requiring a function call to resolve annotations would allow third-party libraries, be it typing_extensions or something else, to backport new type annotation syntax by offering their own version of "get_annotated_values". Typing features are already regularly backported using typing_extensions, and this could not be done for new typing syntax unless annotations are stored as strings and resolved by a function.

Note: Obviously new typing syntax couldn't be backported to versions before the annotation syntax itself was relaxed, unless explicitly wrapped in a string, but I would imagine that if we see a relaxed annotation syntax we might see new typing syntax every now and then after that.

Adrian Freund

On April 18, 2021 6:49:59 PM GMT+02:00, Larry Hastings <larry@hastings.org> wrote:
Something akin to this was proposed and discarded during the discussion of PEP 563, although the idea there was to still use actual Python bytecode instead of strings. [...]
You're right, losing visibility into the local scope (and outer function scopes) is part of why I suggest the behavior be compile-time selectable. The pro-PEP-563 crowd doesn't seem to care that 563 costs them visibility into anything but global scope; accepting this loss of visibility is part of the bargain of enabling the behavior. But people who don't need the runtime behavior of 563 don't have to live with it.

As to only offering marginal benefit beyond typing.get_type_hints()--I think the benefit is larger than that. I realize now I should have gone into this topic in the original post; sorry, I kind of rushed through that. Let me fix that here.

One reason you might not want to use typing.get_type_hints() is that it doesn't return /annotations/ generally, it specifically returns /type hints/. This is more opinionated than just returning the annotations, e.g.:

* None is changed to type(None).
* Values are wrapped with Optional[] sometimes.
* String annotations are wrapped with ForwardRef().
* If __no_type_check__ is set on the object, it ignores the annotations and returns an empty dict.

I've already proposed addressing this for Python 3.10 by adding a new function to the standard library, probably to be called inspect.get_annotations():

https://bugs.python.org/issue43817

But even if you use this new function, there's still some murky ambiguity. Let's say you're using Python 3.9, and you've written a library function that examines annotations. (Again, specifically: annotations, not type hints.) And let's say the annotations dict contains one value, and it's the string "34". What should you do with it?

If the module that defined it imported "from __future__ import annotations", then the actual desired value of the annotation was the integer 34, so you should eval() it. But if the module that defined it /didn't/ import that behavior, then the user probably wanted the string "34". How can you tell what the user intended?
I think the only actual way to solve it would be to go rooting around in the module to see if you can find the future object. It's probably called "annotations". But it /is/ possible to compile with that behavior without the object being visible--it could be renamed, or the module could have deleted it. Though these are admittedly unlikely.

By storing stringized annotations in "o.__str_annotations__", we remove this ambiguity. Now we know for certain that these annotations were stringized and we should eval() them. And if a string shows up in "o.__annotations__", we know we should leave it alone.

Of course, by making the language do the eval() on the strings, we abstract away the behavior completely. Now library code doesn't need to be aware of whether the module had stringized annotations, or PEP-649-style delayed annotations, or "stock" semantics. Accessing "o.__annotations__" always gets you the real annotation values, every time.

Cheers,

//arry/

On 4/18/21 7:06 AM, Jelle Zijlstra wrote:
How would this work with annotations that access a local scope? [...]
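The ambiguity Larry describes -- a bare string "34" sitting in an annotations dict -- is reproducible today. The sketch below uses exec() to simulate the two kinds of modules; the resulting annotation dicts are identical, so a library inspecting them has no way to know whether to eval():

```python
# Module compiled WITH "from __future__ import annotations":
# the compiler stringizes the int 34 into the string "34".
stringized_src = (
    "from __future__ import annotations\n"
    "def f(x: 34): pass\n"
)
ns1 = {}
exec(compile(stringized_src, "<demo>", "exec"), ns1)

# Module compiled WITHOUT the future import: the user really wrote the string "34".
plain_src = 'def f(x: "34"): pass\n'
ns2 = {}
exec(compile(plain_src, "<demo>", "exec"), ns2)

# Both annotation dicts are indistinguishable.
print(ns1["f"].__annotations__)   # {'x': '34'}
print(ns2["f"].__annotations__)   # {'x': '34'}
```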
On Sat, Apr 17, 2021 at 8:46 PM Larry Hastings <larry@hastings.org> wrote:
The heart of the debate between PEPs 563 and 649 is the question: what should an annotation be? [...]
I think this goes in the right direction.
Alternatively: what if the "trigger" to resolve the expression to an object was moved from a module-level setting to the specific expression? e.g.

    def foo(x: f'{list[int]}') -> f'{str}':
        bar: f'{tuple[int]}' = ()

    @pydantic_or_whatever_that_needs_objects_from_annotations
    class Foo:
        blah: f'{tuple[int]}' = ()

I picked f-strings above since they're compatible with existing syntax and visible to the AST, iirc; the point is some syntax/marker at the annotation level to indicate "eagerly resolve this / keep the value at runtime". Maybe "as", or ":@", or a "magic" @typing.runtime_annotations decorator, or some other bikeshed, etc. (As an aside, Java deals with this problem by making its annotations compile-time only unless you mark them to be kept at runtime.)

The reasons I suggest this are:

1. A module-level switch reminds me of __future__.unicode_literals. Switching that on/off was a bit of a headache due to the action at a distance.
2. It's my belief that the *vast* majority of annotations are unused at runtime, so all the extra effort in resolving an annotation expression is just wasted cycles. It makes sense for the default behavior to be "string annotations", with runtime-evaluation/retention enabled when needed.
3. That said, there are definitely cases where having the resolved objects at runtime is very useful, and that should be enabled/allowed in a more "first-class" way.

I think there's other benefits (when/how an error is reported, the explicitness, how that explicitness informs other tools, easier for libraries to use, avoids keeping closures around just in case, etc.), but those three are the most significant IMHO.

Also, yes, of course we can keep the optimization where stringized annotations are stored as a tuple containing an even number of strings. [...]
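Richard's f-string spelling can actually be tried in current Python (3.9+ for list[int]), since f-strings are evaluated eagerly at definition time. Note what it really gives you: any error inside the braces surfaces when the function is defined, but what ends up stored is still just a string:

```python
def foo(x: f'{list[int]}') -> f'{str}':
    pass

# The f-strings were evaluated at definition time -- a NameError or syntax
# error inside the braces would have surfaced immediately -- but the stored
# annotations are plain strings, formatted via str().
print(foo.__annotations__)   # {'x': 'list[int]', 'return': "<class 'str'>"}
```

The "<class 'str'>" entry shows one wrinkle of this spelling: f-string formatting uses str(), so bare classes don't round-trip to their source text the way subscripted generics happen to.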
On 4/18/21 9:14 AM, Richard Levasseur wrote:
Alternatively: what if the "trigger" to resolve the expression to an object was moved from a module-level setting to the specific expression? e.g.
    def foo(x: f'{list[int]}') -> f'{str}':
        bar: f'{tuple[int]}' = ()

    @pydantic_or_whatever_that_needs_objects_from_annotations
    class Foo:
        blah: f'{tuple[int]}' = ()
I picked f-strings above since they're compatible with existing syntax and visible to the AST iirc; the point is some syntax/marker at the annotation level to indicate "eagerly resolve this / keep the value at runtime". Maybe "as", or ":@", or a "magic" @typing.runtime_annotations decorator, or some other bikeshed etc. (As an aside, Java deals with this problem by making its annotations compile-time only unless you mark them to be kept at runtime)
I genuinely don't understand what you're proposing. Could you elaborate?

I will note however that your example adds a lot of instances of quoting and curly braces and the letter 'f'. Part of the reason that PEP 563 exists is that users of type hints didn't like quoting them all the time.

Also, explicitly putting quotes around type hints means that Python didn't examine them at compile-time, so outright syntax errors would not be caught at compile-time. PEP 563 meant that syntax errors would be caught at compile-time. (Though PEP 563 still delays other errors, like NameError and ValueError, until runtime, the same way that PEP 649 does.)
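The compile-time difference Larry describes can be shown directly with compile(): a syntax error hidden inside a quoted annotation goes unnoticed, while an unquoted annotation is always parsed as real syntax (PEP 563 skips only its evaluation), so the same mistake fails at compile time:

```python
# A quoted annotation is just a string literal: the broken syntax inside it
# is invisible to the compiler.
compile('def f(x: "foo bar"): pass', "<demo>", "exec")   # compiles fine

# Unquoted, the annotation is parsed as code even under PEP 563, so the
# mistake is a compile-time SyntaxError.
try:
    compile(
        "from __future__ import annotations\n"
        "def f(x: foo bar): pass\n",
        "<demo>", "exec",
    )
except SyntaxError:
    print("caught at compile time")
```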
The reasons I suggest this are:
1. A module-level switch reminds me of __future__.unicode_literals. Switching that on/off was a bit of a headache due to the action at a distance.
__future__.unicode_literals changed the default behavior of strings so that they became Unicode. An important part of my proposal is that it minimizes the observable change in behavior at runtime. PEP 563 changes "o.__annotations__" so that it contains stringized annotations; my proposal changes that so it returns real values, assuming eval() succeeds.

What if the eval() fails, with a NameError or whatever? Yes, this would change observable behavior. Without the compile-time flag enabled, the annotation fails to evaluate correctly at import time. With the compile-time flag enabled, the annotation fails to evaluate correctly at the time it's examined.

I think this is generally a feature anyway. As you observe in the next paragraph, the vast majority of annotations are unused at runtime. If a program didn't need an annotation at runtime, then making it succeed at import time for something it doesn't care about seems like a reasonable change in behavior. The downside is, nested library code might make it hard to determine which object had the bad annotation, though perhaps we can avoid this by crafting a better error message for the exception.
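The import-time vs. access-time shift is observable with PEP 563 today, and the proposed __str_annotations__ would behave the same way. A sketch using exec() to simulate the two modules:

```python
# Without stringization, an undefined name in an annotation fails when the
# function is defined (i.e. at import time for a module):
try:
    exec(compile("def f(x: NotDefined): pass", "<demo>", "exec"), {})
except NameError:
    print("failed at definition time")

# With PEP 563, definition succeeds; the failure is deferred until someone
# actually evaluates the annotation:
ns = {}
exec(compile(
    "from __future__ import annotations\n"
    "def f(x: NotDefined): pass\n",
    "<demo>", "exec"), ns)
try:
    eval(ns["f"].__annotations__["x"], ns)
except NameError:
    print("failed only when examined")
```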
2. It's my belief that the /vast /majority of annotations are unused at runtime, so all the extra effort in resolving an annotation expression is just wasted cycles. It makes sense for the default behavior to be "string annotations", with runtime-evaluation/retention enabled when needed.
The conversion is lazy. If the annotation is never examined at runtime, it's left in the state the compiler defined it in. Where does it waste cycles?

Cheers,

//arry/
On Sun, Apr 18, 2021 at 10:12 AM Larry Hastings <larry@hastings.org> wrote:
On 4/18/21 9:14 AM, Richard Levasseur wrote:
Alternatively: what if the "trigger" to resolve the expression to an object was moved from a module-level setting to the specific expression? e.g.
I genuinely don't understand what you're proposing. Could you elaborate?
I can't speak for Richard, but I interpreted this as:

Have a way to specify, when you write the annotation, whether you want it evaluated or kept as a string.

In my previous post, I added the idea of the semantics (am I using that word right?) as meaning "run-time" vs "non-run-time (type check time)" -- that is, do you want this to be a valid value that can be used at run time? But it could only mean "stringify or not".
I will note however that your example adds a lot of instances of quoting and curly braces and the letter 'f'. Part of the reason that PEP 563 exists is that users of type hints didn't like quoting them all the time.
I think Richard suggested the f-string because it's currently legal syntax. And you'd get syntax checking for anything in the brackets. But we could come up with a nicer notation, maybe a double colon?

    class A:
        x: int   # you can stringify this one
        y:: int  # please don't stringify this one

Maybe that's too subtle, but in this case, maybe subtle is good -- to the reader of the code they mean pretty much the same thing. To the writer, they are quite different, but in a very testable way. And this would preserve:
PEP 563 meant that syntax errors would be caught at compile-time.
It would also open the door to extending the syntax for typing, as is also being discussed. Granted, adding yet more syntax to Python is a big deal, but maybe not as big a deal as adding another dunder, or removing functionality. Also, could a __past__ import or some such be introduced to make the new syntax legal in older supported versions of Python?

As I think about this, I like this idea more and more. There are three groups of folks using annotations:

1) The Static Type Checking folks: This is a large and growing and important use case, and they want PEP 563, or something like it.

2) The "annotations ARE type objects" folks: this is a much smaller group, at least primarily -- a much larger group is using that functionality, perhaps without realizing it, via Pydantic and the like, but the folks actually writing that code are more select.

3) I think BY FAR the largest group: Folks using type annotations primarily as documentation. (Evidenced by a recent paper that went through PyPI and found a LOT of code that used type annotations but apparently was not using a type checker.)

So it seems a good goal would be:

Have nothing change for group 3 -- the largest, and probably paying the least attention to all this.

Have things work well for group 1 -- Type Checking seems to be of growing importance.

Require only a small manageable update for group 2 -- important, but a smaller group of folks that would actually have to change code. (Hmm... maybe not -- not many people write libraries like Pydantic, but all the users of those libraries would need to update their type annotations :-( )

-CHB

--
Christopher Barker, PhD (Chris)

Python Language Consulting
- Teaching
- Scientific Software Development
- Desktop GUI and Web Development
- wxPython, numpy, scipy, Cython
On Sun, Apr 18, 2021 at 10:49 AM Christopher Barker <pythonchb@gmail.com> wrote:
On Sun, Apr 18, 2021 at 10:12 AM Larry Hastings <larry@hastings.org> wrote:
On 4/18/21 9:14 AM, Richard Levasseur wrote:
Alternatively: what if the "trigger" to resolve the expression to an object was moved from a module-level setting to the specific expression? e.g.
I genuinely don't understand what you're proposing. Could you elaborate?
I can't speak for Richard, but I interpreted this as:
Have a way to specify, when you write the annotation, whether you want it evaluated or kept as a string.
Yes, exactly.
in my previous post, I added the idea of the semantics (am I using that word right?) as meaning "run-time" vs "non-run-time (type check time)" -- that is, do you want this to be a valid value that can be used at run time? But it could only mean "stringify or not".
I will note however that your example adds a lot of instances of quoting and curly braces and the letter 'f'. Part of the reason that PEP 563 exists is that users of type hints didn't like quoting them all the time.
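[For readers following along: the quoting PEP 563 was designed to eliminate looks like this today. A minimal sketch; with the `from __future__ import annotations` behavior, the quotes become unnecessary because all annotations are stored unevaluated, as strings:]

```python
from __future__ import annotations  # PEP 563 semantics, Python 3.7+

class Node:
    # Without the future import, this forward reference would have to be
    # written as the string "Node", since Node isn't bound yet at this point.
    def add(self, child: Node) -> Node:
        return child

# Under PEP 563, annotations are stored as strings, never evaluated:
print(Node.add.__annotations__)  # {'child': 'Node', 'return': 'Node'}
```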
I think Richard suggested the f-string because it's currently legal syntax. And you'd get syntax checking for anything in the brackets.
Yes, exactly. And f-strings are generally understood to be a string whose contents are "immediately evaluated" (or thereabouts), which is basically what we're asking for in these cases of annotations. But yeah, as Larry points out, it's not exactly pretty. Something like double-colon (or :@, or : as, or some such) is more aesthetically appealing.
But we could come up with a nicer notation, maybe a double colon?
class A:
    x: int    # you can stringify this one
    y:: int   # please don't stringify this one
maybe that's too subtle, but in this case, maybe subtle is good -- to the reader of the code they mean pretty much the same thing. To the writer, they are quite different, but in a very testable way.
And this would preserve:
PEP 563 meant that syntax errors would be caught at compile-time.
It would also open the door to extending the syntax for typing as is also being discussed.
Granted, adding yet more syntax to Python is a big deal, but maybe not as big a deal as adding another dunder, or removing functionality.
Also, could a __past__ import or some such be introduced to make the new syntax legal in older supported versions of Python?
As I think about this, I like this idea more and more. There are three groups of folks using annotations:
1) The Static Type Checking folks: This is a large and growing and important use case, and they want PEP 563, or something like it.
2) The "annotations ARE type objects" folks -- this is a much smaller group, at least primarily; a much larger group is using that functionality perhaps without realizing it, via Pydantic and the like, but the folks actually writing that code are more select.
3) I think BY FAR the largest group: Folks using type annotations primarily as documentation. (evidenced by a recent paper that went through PyPI and found a LOT of code that used type annotations but apparently was not using a type checker)
So it seems a good goal would be:
Have nothing change for group 3 -- the largest and probably paying the least attention to all this.
Have things work well for group 1 -- Type Checking seems to be of growing importance.
Require only a small manageable update for group 2 -- important, but a smaller group of folks that would actually have to change code. (hmm.. maybe not -- not many people write libraries like Pydantic, but all the users of those libraries would need to update their type annotations :-( )
-CHB
-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Thanks Larry,

This seems to be going in a good direction. I do note another motivator -- folks don't want annotations to hurt run-time performance, particularly when they are being used primarily for pre-run-time static type checking. Which makes sense. And PEP 649 seems to affect performance.

Which brings me to:

On Sun, Apr 18, 2021 at 9:24 AM Richard Levasseur <richardlev@gmail.com> wrote:
Java deals with this problem by making its annotations compile-time only
unless you mark them to be kept at runtime)
Maybe we could borrow that as well -- annotations could be marked to not be used at run time at all. That could be the default.
2. It's my belief that the *vast *majority of annotations are unused at runtime, so all the extra effort in resolving an annotation expression is just wasted cycles. It makes sense for the default behavior to be "string annotations", with runtime-evaluation/retention enabled when needed.
exactly. And I would prefer that whether an annotation is stringified be specified at the annotation level, rather than at the module level. It's quite conceivable that a user might want to use "run time" annotations in, say, the specification of a Pydantic class, and annotations just for type checking elsewhere in the module. And then would we need the "o.__str_annotations__" attribute? Or even PEP 649 at all? What you'd end up with is __annotations__ potentially containing both objects and strings that could be type objects -- which currently works fine:
In [20]: class A:
    ...:     x: "int"
    ...:     y: int
    ...:

In [21]: A.__annotations__
Out[21]: {'x': 'int', 'y': int}

In [22]: typing.get_type_hints(A)
Out[22]: {'x': int, 'y': int}

Then the only thing that would change with PEP 563 is the default behaviour. If I'm not mistaken, the complexity (and performance hit) of dealing with the whole could-be-string, could-be-object, evaluate-the-string process would be in the type checkers, where performance is much less of an issue.

-CHB

-- Christopher Barker, PhD (Chris)
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
Christopher Barker wrote:
... folks don't want annotations to hurt run-time performance, particularly when they are being used primarily for pre-run-time static type checking.
So are we looking for three separate optimization levels at compile time?

Base level: evaluate everything, keep it however the original annotations PEP said to keep it.

String level: Characters in an annotation are not evaluated; they just get stored in a string. The string (rather than its value) is then kept however the original annotations PEP said things would be kept.

Removal level: Annotations are used only with source code (perhaps by static analyzers before compilation); they are dropped entirely during compilation. This might go well with the old compilation mode that drops docstrings.

-jJ
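[The docstring-dropping compilation mode Jim mentions is visible today through the `optimize` parameter of the built-in `compile()` (optimize=2 corresponds to `python -OO`). A "removal level" for annotations might behave analogously; this sketch shows only the existing docstring behavior, not any proposed annotation flag:]

```python
def compile_and_get_doc(optimize):
    # optimize=2 is the -OO level, which strips docstrings at compile time.
    code = compile('def f():\n    "the docstring"\n', "<demo>", "exec",
                   optimize=optimize)
    ns = {}
    exec(code, ns)
    return ns["f"].__doc__

print(compile_and_get_doc(0))  # the docstring
print(compile_and_get_doc(2))  # None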
On 4/17/2021 11:43 PM, Larry Hastings wrote:
The heart of the debate between PEPs 563 and 649 is the question: what should an annotation be? Should it be a string or a Python value? It seems people who are pro-PEP 563 want it to be a string, and people who
are pro-PEP 649 want it to be a value.
Actually, let me amend that slightly. Most people who are pro-PEP 563 don't actually care that annotations are strings, per se. What they want are specific runtime behaviors, and they get those behaviors when PEP 563 turns their annotations into strings.
I have an idea--a rough proposal--on how we can mix together aspects of
PEP 563 and PEP 649. I think it satisfies everyone's use cases for both PEPs. The behavior it gets us:
* annotations can be stored as strings * annotations stored as strings can be examined as strings * annotations can be examined as values
I agree that somehow satisfying more people rather than fewer is a good goal.
The idea:
We add a new type of compile-time flag, akin to a "from __future__" import, but not from the future. Let's not call it "from __present__", for now how about "from __behavior__".
I have wondered whether the current split over annotations constitutes sufficient 'necessity' to introduce a new entity. The proposal for permanent Python dialects, as opposed to temporary dialects, is not new, as indicated by your reference to '__present__'. It has so far been rejected.

Some people, upon having their syntax proposal rejected, have proposed that it be made an option. I believe that others, not liking an accepted change, have proposed that the old way be kept as an option, or that the new way be optional.

So I think that we should first look at other ways to meet the goal, as proposed in other responses.

-- Terry Jan Reedy
On 4/17/21 8:43 PM, Larry Hastings wrote:
TBD: how this interacts with PEP 649. I don't know if it means we only do this, or if it would be a good idea to do both this and 649. I just haven't thought about it. (It would be a runtime error to set both "o.__str_annotations__" and "o.__co_annotations__", though.)
I thought about it some, and I think PEP 649 would still be a good idea, even if this "PEP 1212" proposal (or a variant of it) was workable and got accepted. PEP 649 solves the forward references problem for most users without the restrictions of PEP 563 (or "PEP 1212"). So most people wouldn't need to turn on the "PEP 1212" behavior. Cheers, //arry/
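[Larry's lazy-evaluation scheme can be sketched in pure Python. Note that `__str_annotations__`, the eval-on-first-access behavior, and the caching are all taken from the proposal and do not exist in CPython; this toy class just illustrates the described semantics (Python 3.9+ for `list[int]`):]

```python
class StrAnnotated:
    """Toy stand-in for an object carrying stringized annotations."""

    def __init__(self, str_annotations, globals_ns):
        self.__str_annotations__ = str_annotations
        self._globals = globals_ns
        self._cache = None

    @property
    def __annotations__(self):
        # Built (and cached) on first access by eval()ing each string, as
        # the proposal describes; __str_annotations__ stays set afterwards.
        if self._cache is None:
            self._cache = {name: eval(expr, self._globals)
                           for name, expr in self.__str_annotations__.items()}
        return self._cache

o = StrAnnotated({"x": "int", "y": "list[int]"}, globals())
print(o.__str_annotations__)  # examine as strings
print(o.__annotations__)      # examine as values
```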
Hi Larry,

This is a creative option, but I am optimistic (now that the SC decision has removed the 3.10 deadline urgency) that we can find a path forward that is workable for everyone and doesn't require a permanent compiler feature flag and a language that is permanently split-brained about annotation semantics.

Since I have access to a real-world large codebase with almost complete adoption of type annotations (and I care about its import performance), I'm willing to test PEP 649 on it (can't commit to doing it right away, but within the next month or so) and see how much import performance is impacted, and how much of that can be gained back by interning tweaks as discussed in the other thread. My feeling is that if the performance turns out to be reasonably close in a real codebase, and we can find a workable solution for `if TYPE_CHECKING`, we should go ahead with PEP 649: IMO aside from those two issues its semantics are a better fit for the rest of the language and preferable to PEP 563.

I do think that a solution to the `if TYPE_CHECKING` problem should be identified as part of PEP 649. My favorite option there would be a new form of import that is lazy (the import does not actually occur until the imported name is loaded at runtime). This has prior art in previous discussions about "module descriptors"; IIRC Neil Schemenauer even had a branch a while ago where all module attributes were modified to behave this way (I might be remembering the details wrong.) It also has overlap with use cases served by the existing `demandimport` library used by hg, and `importlib.util.LazyLoader`, although it is strictly more capable because it can work with `from module import Thing` style imports as well. If there's any interest in this as a solution to inter-module annotation forward references, I'm also willing to work on that in the 3.11 timeframe.

Carl
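[For readers following along, the `if TYPE_CHECKING` idiom Carl refers to looks like this today. A minimal sketch; the `Decimal` import stands in for any typing-only dependency:]

```python
from __future__ import annotations  # annotations stored as strings

from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by the type checker, never executed at runtime, so it costs
    # nothing at import time and cannot create an import cycle.
    from decimal import Decimal

def total(amounts: list[Decimal]) -> Decimal:
    # Works at runtime because the annotations are never evaluated;
    # eval()ing them here would raise NameError: Decimal is not defined.
    return sum(amounts)
```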
On Sat., Apr. 24, 2021, 20:55 Carl Meyer, <carl@oddbird.net> wrote:
Hi Larry,
This is a creative option, but I am optimistic (now that the SC decision has removed the 3.10 deadline urgency) that we can find a path forward that is workable for everyone and doesn't require a permanent compiler feature flag and a language that is permanently split-brained about annotation semantics. Since I have access to a real-world large codebase with almost complete adoption of type annotations (and I care about its import performance), I'm willing to test PEP 649 on it (can't commit to doing it right away, but within the next month or so) and see how much import performance is impacted, and how much of that can be gained back by interning tweaks as discussed in the other thread.
Thanks for the kind offer, Carl! I know I would find it useful in evaluating PEP 649 if we had a real-world perf evaluation like you're offering. My feeling is that if the performance
turns out to be reasonably close in a real codebase, and we can find a workable solution for `if TYPE_CHECKING`, we should go ahead with PEP 649: IMO aside from those two issues its semantics are a better fit for the rest of the language and preferable to PEP 563.
I do think that a solution to the `if TYPE_CHECKING` problem should be identified as part of PEP 649. My favorite option there would be a new form of import that is lazy (the import does not actually occur until the imported name is loaded at runtime). This has prior art in previous discussions about "module descriptors"; IIRC Neil Schemenauer even had a branch a while ago where all module attributes were modified to behave this way (I might be remembering the details wrong.)
Nope, you're remembering right; it was Neil. I think he started looking at this topic at the core dev sprints when they were hosted at Microsoft (2018?). It also has overlap with use cases served by the existing
`demandimport` library used by hg, and `importlib.util.LazyLoader`,
I'm not sure if it's diverged, but his solution was originally a copy of importlib.util.LazyLoader, so the approach was the same. although it is strictly more capable because it can work with `from
module import Thing` style imports as well. If there's any interest in this as a solution to inter-module annotation forward references, I'm also willing to work on that in the 3.11 timeframe.
I know I would be curious, especially if backwards compatibility can be solved reasonably (for those that haven't lived this, deferred execution historically messes up code relying on import side-effects, and tracebacks are weird as they occur at access time instead of at the import statement). -Brett
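[For reference, the standard-library building block Carl and Brett mention exists today; the Python docs give roughly this recipe for `importlib.util.LazyLoader`, which defers executing a module's code until an attribute is first accessed:]

```python
import importlib.util
import sys

def lazy_import(name):
    # Create the module object now, but wrap its loader in LazyLoader so
    # the module body only executes on first attribute access.
    spec = importlib.util.find_spec(name)
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)
    return module

json = lazy_import("json")   # module body not actually executed yet
print(json.dumps({"a": 1}))  # first attribute access triggers the load
```

Note this only covers `import module` style; as Carl says, a language-level mechanism could also handle `from module import Thing`, which LazyLoader cannot.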
On Sun, Apr 25, 2021 at 10:30 AM Brett Cannon <brett@python.org> wrote:
I know I would be curious, especially if backwards compatibility can be solved reasonably (for those that haven't lived this, deferred execution historically messes up code relying on import side-effects and trackbacks are weird as they occur at access time instead of at the import statement).
I had been assuming that due to backward compatibility and performance of `LOAD_GLOBAL`, this would need to be a new form of import, syntactically distinguished. But the performance and some of the compatibility concerns could be alleviated by making all imports deferred by default, and then resolving any not-yet-resolved imports at the end of module execution.

This is perhaps even better for the non-typing case, since it would generally fix most import cycle problems in Python. (It would be sort of equivalent to moving all imports that aren't needed for module execution to the end of the module, which is another ugly but effective workaround for cycles.) It would have the downside that type-only imports which will never be needed at runtime at all will still be imported, even if `__annotations__` are never accessed.

I think it's still problematic for backward compatibility with import side effects, though, so if we did this at the very least it would have to be behind a `__future__` import.

Carl
participants (10): Adrian Freund, Brett Cannon, Carl Meyer, Christopher Barker, Damian Shaw, Jelle Zijlstra, Jim J. Jewett, Larry Hastings, Richard Levasseur, Terry Reedy