On Tue, Jan 12, 2021 at 6:35 PM Larry Hastings <larry@hastings.org> wrote:

On 1/12/21 5:28 PM, Brett Cannon wrote:
The other thing to keep in mind is we are talking about every module, class, and function getting 64 bytes ... which I bet isn't that much.

Actually it's only every module and class.  Functions don't have this problem because they've always stored __annotations__ internally--meaning, peeking in their __dict__ doesn't work, and they don't support inheritance anyway.  So the number is even smaller than that.

If we can just make __annotations__ default to an empty dict on classes and modules, and not worry about the memory consumption, that goes a long way to cleaning up the semantics.

I would like that very much. And the exception for functions is especially helpful.

And I know you were somewhat joking when you mentioned using sys.version_info, but since this would be behind a __future__ import

Would it?

My original proposal would make breaking changes to how you examine __annotations__.  Let's say we put those behind a from __future__ import.  Now we're gonna write library code that examines annotations.  A user passes in a class and asks us to examine its annotations.  The old semantics might be active on it, or the new ones.  How do we know which set of semantics we need to use?

It occurs to me that you could take kls.__module__, pull out the module from sys.modules, then look inside to see if it contains the correct "future" object imported from the __future__ module.  Is that an approach we would suggest to our users?
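
A rough sketch of that lookup, for illustration only. The helper name and the throwaway module are made up, and the existing "annotations" feature stands in for whatever future import would gate the new semantics. It is fragile by construction: it only works if the module never rebinds or deletes the imported name.

```python
import __future__
import sys
import types


def module_has_future_feature(kls, feature_name="annotations"):
    """Best-effort check: did the module defining kls run
    `from __future__ import <feature_name>`?"""
    module = sys.modules.get(kls.__module__)
    feature = getattr(__future__, feature_name, None)
    if module is None or feature is None:
        return False
    # The import leaves the feature object bound in the module namespace,
    # unless the module later rebinds or deletes that name.
    return any(value is feature for value in list(vars(module).values()))


# Usage: build a throwaway module (hypothetical name) to try it on.
demo = types.ModuleType("demo_future_module")
sys.modules[demo.__name__] = demo
exec("from __future__ import annotations\nclass Kls: pass", demo.__dict__)

uses_future = module_has_future_feature(demo.Kls)
builtin_uses_future = module_has_future_feature(int)
```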

You're kidding, right?

Also, very little code ever examines annotations; most code with annotations merely defines them.  So I suspect most annotations users wouldn't care either way--which also means a "from __future__ import" that changes the semantics of examining or modifying annotations isn't going to see a lot of uptake, because it doesn't really affect them.  The change in semantics only affects people whose code examines annotations, which I suspect is very few.

I agree, but they're pretty vocal -- the breakage in get_type_hints() due to the scope issue in 3.10 (which isn't even in beta) has drawn plenty of complaints.

Also, dataclasses (which I have to assume is fairly popular :-) introspects `__annotations__`, and even mutates and sets it.

So I wasn't really joking when I proposed making these changes without a from __future__ import, and suggested users use a version check.  The library code would know based on the Python version number which semantics were active, no peeking in modules to find future object.  They could literally write what I suggested:

if you know you're running Python 3.10 or higher:
    examine using the new semantics
else:
    examine using the old semantics

I realize that's a pretty aggressive approach, which is why I prefaced it with "if I could wave my magic wand".  But if we're going to make breaking changes, then whatever we do, it's going to break some people's code until it gets updated to cope with the new semantics.  In that light this approach seemed reasonable.

Is there a way that such code could be written without a version check? E.g. for modules we could recommend `getattr(m, "__annotations__", None) or {}`, and that would work in earlier versions too.
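
Wrapped in a helper, that idiom might look like the sketch below. The name module_annotations and the demo modules are hypothetical. The `or {}` also covers a module whose `__annotations__` has been set to None.

```python
import types


def module_annotations(module):
    # Works whether the attribute is missing, None, or a real dict,
    # so no version check is needed.
    return getattr(module, "__annotations__", None) or {}


# Usage with two throwaway modules (hypothetical names).
bare = types.ModuleType("bare_module")
annotated = types.ModuleType("annotated_module")
annotated.__annotations__ = {"x": int}

bare_result = module_annotations(bare)
annotated_result = module_annotations(annotated)
```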

I'm not sure what would work for classes, since most code will want to combine the annotations for all classes in the MRO, and the way to do that would change -- before 3.10, you *must* use `cls.__dict__.get("__annotations__")` whereas for 3.10+ you *must* use `cls.__annotations__`.
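
For concreteness, here is a sketch of what version-checked library code could look like, assuming the proposed semantics land in 3.10 as discussed. The helper names are made up. The getattr() default is there because built-in classes such as object may not expose `__annotations__` at all.

```python
import sys


def own_annotations(cls):
    """Annotations defined directly on cls, ignoring base classes."""
    if sys.version_info >= (3, 10):
        # Assumed new semantics: attribute access never falls through
        # to a base class.
        return getattr(cls, "__annotations__", None) or {}
    # Old semantics: attribute access may return a base class's dict,
    # so look in the class's own namespace.
    return cls.__dict__.get("__annotations__") or {}


def merged_annotations(cls):
    """Combine annotations along the MRO; subclasses override bases."""
    merged = {}
    for klass in reversed(cls.__mro__):
        merged.update(own_annotations(klass))
    return merged


class A:
    x: int


class B(A):
    y: str


class C(B):
    pass
```

Here own_annotations(C) is empty under either branch, while merged_annotations(C) picks up x from A and y from B.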

Note, for a moment I thought that for modules we don't need to evaluate annotations lazily (I know that's your other PEP/thread, but still, it seems related). But we do, because there's an idiom where people write
from __future__ import annotations
import typing
if typing.TYPE_CHECKING:
    from somewhere import Class
a: Class
Here introspecting the annotations would fail, but clearly the intention was to use them purely for static type checking, so the user presumably doesn't care.
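
A small runnable illustration of that failure mode, using explicit string annotations as a stand-in for what `from __future__ import annotations` stores. The class and attribute names are invented. It shows how typing.get_type_hints() behaves today, not what PEP 649 would do.

```python
import typing


class Config:
    retries: "int"
    handler: "Class"  # `Class` was only imported under TYPE_CHECKING


# The raw dict is still readable: it just holds strings.
raw = Config.__annotations__

# Resolving the strings fails as a whole because one name is undefined,
# even though "int" on its own would resolve fine.
try:
    typing.get_type_hints(Config)
    failure = None
except NameError as exc:
    failure = exc
```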

(But does that mean that if a single annotation cannot be evaluated, the entire annotations dict becomes inaccessible? That's a general weakness of the PEP 649 scheme, right?)

But really this is why I started this thread in the first place.  My idea of what's reasonable is probably all out of whack.  So I wanted to start the conversation, to get feedback on how much breakage is allowable and how best to mitigate it.  If it wasn't a controversial change, then we wouldn't need to talk about it!

And finally: if we really do set a default of an empty dict on classes and modules, then my other in-theory breaking changes:

  • you can't delete __annotations__
  • you can only set __annotations__ to a dict or None (this is already true of functions, but not of classes or modules)

will, I expect, in practice break exactly zero code.  Who deletes __annotations__?  Who ever sets __annotations__ to something besides a dict?  So if the practical breakage is zero, why bother gating it with "from __future__ import" at all?
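
For reference, this is how functions already behave today, which is the behavior the second bullet would extend to classes and modules. The function name is arbitrary.

```python
def f(x: int):
    pass


# Anything other than a dict or None is rejected.
try:
    f.__annotations__ = 42
    rejected = False
except TypeError:
    rejected = True

# None is accepted and resets the annotations; reading the attribute
# afterwards yields a fresh empty dict.
f.__annotations__ = None
after_reset = f.__annotations__
```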

Maybe for the benefit of users who rely on some specific library that gets the annotations out of a class dict. The library could document "don't use that future import, because then your annotations won't work", which would give that library a few releases' time to come up with an alternative strategy.

I think it really means people need to rely on typing.get_type_hints() more than they may be doing right now.

What I find frustrating about that answer--and part of what motivated me to work on this in the first place--is that typing.get_type_hints() requires your annotations to be type hints.  All type hints are annotations, but not all annotations are type hints, and it's entirely plausible for users to have reasonable uses for non-type-hint annotations that typing.get_type_hints() wouldn't like.

The two things typing.get_type_hints() does, that I know of, that can impede such non-type-hint annotations are:

  • It turns a None annotation into type(None).  Which means now you can't tell the difference between "None" and "type(None)".
  • It regards all string annotations as "forward references", which means they get eval()'d and the result returned as the annotation.  typing.get_type_hints() doesn't catch any exceptions here, so if the eval fails, typing.get_type_hints() fails and you can't use it to examine your annotations.
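
Both behaviors are easy to see on a small function. This is only an illustration, and the function and parameter names are invented.

```python
import typing


def scale(value: None, factor: "int") -> None:
    pass


raw = scale.__annotations__
hints = typing.get_type_hints(scale)

# Raw annotations keep exactly what was written.
raw_value = raw["value"]      # the object None
raw_factor = raw["factor"]    # the string "int"

# get_type_hints() rewrites both.
hint_value = hints["value"]   # type(None), so the original None is lost
hint_factor = hints["factor"] # the class int, result of evaluating "int"
```

A string annotation that was meant as a plain note rather than an expression would be evaluated the same way, and an exception from that evaluation propagates out of the call.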

PEP 484 "explicitly does NOT prevent other uses of annotations".  But if you force everyone to use typing.get_type_hints() to examine their annotations, then you have de facto prevented any use of annotations that isn't compatible with type hints.

I suspect that the most common use of annotation introspection is to implement some kind of runtime type checking scheme (there are many of those, I think even JSON schema verifiers based on typing.TypedDict) and those users would presumably be fine with get_type_hints().

Note that PEP 593 introduces a way to attach arbitrary extra data to an annotation, e.g.
UnsignedShort = Annotated[int, struct2.ctype('H')]
name: Annotated[str, struct2.ctype("<10s")]
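
A self-contained variant of that example, requiring Python 3.9+. The struct2 module in the PEP's example is hypothetical, so a tiny stand-in ctype class is defined here. The extra data survives get_type_hints() only when include_extras=True is passed.

```python
from dataclasses import dataclass
from typing import Annotated, get_type_hints


@dataclass(frozen=True)
class ctype:
    """Stand-in for the hypothetical struct2.ctype marker."""
    fmt: str


class Record:
    count: Annotated[int, ctype("H")]
    name: Annotated[str, ctype("<10s")]


# By default the extra data is stripped, leaving plain types.
plain = get_type_hints(Record)

# With include_extras=True the Annotated wrapper and its metadata are kept.
full = get_type_hints(Record, include_extras=True)
name_metadata = full["name"].__metadata__
```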

--Guido van Rossum (python.org/~guido)