On Mon, Jan 18, 2021 at 4:34 PM Larry Hastings <larry@hastings.org> wrote:


On 1/18/21 2:39 PM, Guido van Rossum wrote:
Hm. It's unfortunate that this would break code using what is *currently* the best practice.

I can't figure out how to avoid it.  The problem is, current best practice sidesteps the class and goes straight to the dict.  How do we intercept that and run the code to lazy-calculate the annotations?
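
To make that concrete, the best-practice access today looks something like this (a minimal sketch; the names are illustrative):

    # Ask the class dict directly, so you don't accidentally pick up
    # a base class's annotations through attribute inheritance.
    anns = cls.__dict__.get('__annotations__', {})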

I mean, let's consider something crazy.  What if we change cls.__dict__ from a normal dict to a special dict that handles the __co_annotations__ machinery?  That might work, except that we literally allow users to supply their own cls.__dict__ via __prepare__.  So we can't rely on our special dict.
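
For example, here's a minimal sketch of that escape hatch -- a metaclass whose __prepare__ hands back its own mapping for the class body to execute in:

    import collections

    class Meta(type):
        @classmethod
        def __prepare__(mcls, name, bases, **kwds):
            # The mapping the class body executes in.  Users may return
            # any mutable mapping they like here, which is why we can't
            # rely on substituting a special dict of our own.
            return collections.OrderedDict()

    class C(metaclass=Meta):
        x: int = 0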


There's a secret though. `cls.__dict__` is not actually a dict -- it's a mappingproxy. The proxy exists because we want to be able to intercept changes to class attributes such as `__add__` or `__getattribute__` in order to manipulate the C-level wrappers that implement such overloads.
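
You can see the proxy from pure Python:

    >>> class C:
    ...     pass
    ...
    >>> type(C.__dict__)
    <class 'mappingproxy'>
    >>> C.__dict__['x'] = 1   # direct item assignment is refused
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'mappingproxy' object does not support item assignment
    >>> C.x = 1               # but setting via the class still works
    >>> C.__dict__['x']
    1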

So *perhaps* we could expand the mappingproxy class to trap read access to `__annotations__` as a key to do your bidding. (The trick might be exposed by things like .keys(), but that doesn't bother me as much.)

I honestly don't know how the mappingproxy and `__prepare__` interact. I have to admit I've never used the latter. Presumably the mappingproxy still plays a role because we'd still want to intercept e.g. `cls.__add__ = <some function>`.

What if we change cls.__dict__ to a getset?  The user is allowed to set cls.__dict__, but when you get __dict__, we wrap the actual internal dict object with a special object that intercepts accesses to __annotations__ and handles the __co_annotations__ mechanism.  That might work but it's really crazy and unfortunate.  And it's remotely possible that a user might override __dict__ as a property, in a way that breaks this mechanism too.  So it's not guaranteed to always work.
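
To sketch the shape of that idea in Python -- purely hypothetical, with invented names, and the real thing would have to live in C:

    class AnnotationsInterceptingProxy:
        # Hypothetical: wrap the class's real namespace dict and lazily
        # materialize __annotations__ on first read, using the deferred
        # __co_annotations__ callable from PEP 649.  The name and shape
        # of this class are invented for illustration.
        def __init__(self, real_dict, cls):
            self._dict = real_dict
            self._cls = cls

        def __getitem__(self, key):
            if key == '__annotations__' and key not in self._dict:
                # Run the deferred computation and cache the result.
                self._dict[key] = self._cls.__co_annotations__()
            return self._dict[key]

        def get(self, key, default=None):
            try:
                return self[key]
            except KeyError:
                return default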


Maybe such guarantees are overrated; in any case it looks like a rare second-order effect (and we're already talking about esoteric usage patterns).

I'm not suggesting we should do these things, I'm just trying to illustrate how hard I think the problem is.  If someone has a good idea how we can add the __co_annotations__ machinery without breaking current best practice I'd love to hear it.


Also, for functions and modules I would recommend `getattr(o, "__annotations__", None)` (perhaps with `or {}` added).
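
I.e. something like:

    anns = getattr(o, "__annotations__", None) or {}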

For functions you don't need to bother; fn.__annotations__ is guaranteed to always be set, and to be either a dict or None.  (Python will only ever set it to a dict, but the user is permitted to set it to None.)
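
For example:

    def f(x: int) -> str: ...
    f.__annotations__   # {'x': <class 'int'>, 'return': <class 'str'>}

    def g(x): ...
    g.__annotations__   # {} -- present even with no annotated parameters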

I agree with your suggested best practice for modules as it stands today.

And actually, let me walk back something I've said before.  I believe I've said several times that "people treat classes and modules the same".  Actually that's wrong.

  • Lib/typing.py treats functions and modules the same; it uses getattr(o, '__annotations__', None).  It treats classes separately and uses cls.__dict__.get('__annotations__', {}).  (The two patterns are contrasted in the sketch just after this list.)
  • Lib/dataclasses.py uses fn.__annotations__ for functions and cls.__dict__.get('__annotations__', {}) for classes.  It doesn't handle modules at all.
  • Lib/inspect.py calls Lib/typing.py to get annotations.  Which in retrospect I think is a bug, because annotations and type hints aren't the same thing.  (typing.get_type_hints changes None to type(None), it evaluates string annotations, etc.)
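
Here's a quick sketch of why those two access patterns differ for classes (behavior as of today's Python; Base and Derived are illustrative):

    class Base:
        x: int

    class Derived(Base):
        pass

    getattr(Derived, '__annotations__', None)    # {'x': <class 'int'>} -- inherited from Base!
    Derived.__dict__.get('__annotations__', {})  # {} -- only Derived's own annotations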

So, for what it's worth, I literally have zero examples of people treating classes and modules the same when it comes to annotations.  Sorry for the confusion!


Yeah, that part felt fishy -- basically classes are the only complicated case here, because in order to construct the full set of annotations you must walk the MRO.

Honestly, *if* you are walking the MRO anyway, it probably doesn't matter much whether you use cls.__dict__.get('__annotations__') or getattr(cls, '__annotations__') -- you might see some duplicates, but you should generally end up with the same overall set of annotations (though presumably one could construct a counter-example using multiple inheritance).
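
Something like this sketch, which is roughly what typing.get_type_hints() does for the class case (minus the string evaluation):

    def all_annotations(cls):
        # Walk base-to-derived so more-derived classes override
        # annotations for any names they share with a base.
        result = {}
        for klass in reversed(cls.__mro__):
            result.update(klass.__dict__.get('__annotations__', {}))
        return result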


I would also honestly discount what dataclasses.py and typing.py have to do. But what do 3rd party packages do when they don't want to use get_type_hints() and they want to get it right for classes? That would give an indication of how seriously we should take breaking current best practice.

I'm not sure how to figure that out.  Off the top of my head, the only current third-party packages I can think of that use annotations are mypy and attrs.  I took a quick look at mypy but I can't figure out what it's doing.


Mypy is irrelevant because it reads your source code -- it doesn't ever run your code to inspect `__annotations__`.

attrs does something a little kooky.  It accesses __annotations__ via a function called _has_own_attributes(), which detects whether or not the object inherits an attribute.  But it doesn't peek in __dict__; instead it walks the MRO and sees if any of the base classes have the same (non-False) value for that attribute.

https://github.com/python-attrs/attrs/blob/a025629e36440dcc27aee0ee5b04d6523bcc9931/src/attr/_make.py#L343
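
Roughly -- and this is a loose paraphrase of the linked code, not attrs' exact implementation:

    def has_own_attribute(cls, attrib_name):
        # The class "owns" the attribute if no class later in its MRO
        # exposes the very same object under that name.
        attr = getattr(cls, attrib_name, None)
        if not attr:
            return False
        for base in cls.__mro__[1:]:
            if getattr(base, attrib_name, None) is attr:
                return False
        return True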

Happily, that seems like it would continue to work even if PEP 649 is accepted.  That's good news!


I wonder how much pain it cost to develop that.

Another example of a well-known library that presumably does something clever with annotations at runtime is Pydantic. I haven't looked into it further.

There are people who routinely search many GitHub repos for various patterns. Maybe one of them can help? (I've never tried this but IIRC Irit showed me some examples.)

--
--Guido van Rossum (python.org/~guido)