On 4/17/2021 12:28 AM, Christopher Barker wrote:
I wonder if anyone has considered the impact of PEP 563 on dataclasses ?
I have!

I did find this SO post:

https://stackoverflow.com/questions/52171982/new-annotations-break-inspection-of-dataclasses

which is related, but not quite the same -- THAT issue was fixed.

My issue does show up in this BPO:

https://bugs.python.org/issue39442

Somewhere in the middle of that thread, Eric wrote: "At this point, this seems more like fodder for python-ideas."

Then there's a bit more to the thread, which peters out without resolution.

On to the issue:

dataclasses may be the only part of the stdlib that uses annotations.

dataclasses stores each annotation in the type attribute of a Field object, and the Field objects are kept in the class's __dataclass_fields__ attribute.

We can see that with this code:

from dataclasses import dataclass

@dataclass
class Simple:
    x: int
    y: float
    name: str

s = Simple(3, 4.0, "fred")

print("The field types:")
for f in s.__dataclass_fields__.values():
    print(f"name: {f.name}, type: {f.type}, type of type: {type(f.type)}")


Which results in:

The field types:
name: x, type: <class 'int'>, type of type: <class 'type'>
name: y, type: <class 'float'>, type of type: <class 'type'>
name: name, type: <class 'str'>, type of type: <class 'type'>

With this line added at the top of the module:

from __future__ import annotations

The result is:

The field types:
name: x, type: int, type of type: <class 'str'>
name: y, type: float, type of type: <class 'str'>
name: name, type: str, type of type: <class 'str'>


This of course is completely as expected.

I have no idea how dataclasses uses the Field type attribute -- as far as I can tell, for nothing at all. However, it is there, and it is called "type" rather than, say, "annotation".
In retrospect, that field probably should have been named "annotation". Regardless, the intent was always "store what's in __annotations__[field_name]", or what the user specified in field(..., type=whatever, ...).

And I have a whole pile of code that fully expects the Fields' type attribute to be an actual type object that I can call to create an instance of that type (or call a method on it, which is what I am actually doing).

So my code will very much break with this change.
True, unfortunately. To be clear to everyone not paying close attention, "this change" is PEP 563.
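
A minimal sketch of the kind of code that breaks, under a few assumptions: the convert() helper is hypothetical (it is not taken from the code being discussed), and the hand-written string annotation in Stringified stands in for what "from __future__ import annotations" would store for "x: int".

```python
from dataclasses import dataclass, fields


@dataclass
class Eager:
    x: int


@dataclass
class Stringified:
    x: "int"  # stand-in for what PEP 563 would store for `x: int`


def convert(cls, raw):
    """Hypothetical helper: build an instance by calling each Field.type."""
    return cls(**{f.name: f.type(raw[f.name]) for f in fields(cls)})


# Field.type is the int class here, so int("3") gets called.
eager_result = convert(Eager, {"x": "3"})
print(eager_result)

# Field.type is the string "int" here, and a str is not callable.
try:
    convert(Stringified, {"x": "3"})
    broke = False
except TypeError as exc:
    broke = True
    print("breaks:", exc)
```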

I fully understand that the __dataclass_fields__ attribute was probably never intended to be part of the public API, so I get what I deserve. 

However, the Field object is documented, as such:

"""
class dataclasses.Field

Field objects describe each defined field. These objects are created internally, and are returned by the fields() module-level method (see below). Users should never instantiate a Field object directly. Its documented attributes are:

name: The name of the field.
type: The type of the field.
default, default_factory, init, repr, hash, compare, and metadata have the identical meaning and values as they do in the field() declaration.

Other attributes may exist, but they are private and must not be inspected or relied on.
"""

That last sentence implies that the type attribute CAN be inspected and relied upon, which is what I am doing.
Yes, Field.type is very much part of the public dataclasses API as available through dataclasses.fields(), not through cls.__dataclass_fields__.
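
For reference, a short sketch of reading the same information through the documented dataclasses.fields() function instead of __dataclass_fields__. It assumes the Simple class from the example above and no "from __future__ import annotations" in the module.

```python
from dataclasses import dataclass, fields


@dataclass
class Simple:
    x: int
    y: float
    name: str


# fields() accepts the class or an instance and returns a tuple of Field objects.
field_types = {f.name: f.type for f in fields(Simple)}
print(field_types)

# It gives the same Field objects whether called on the class or an instance.
s = Simple(3, 4.0, "fred")
same_fields = fields(s) == fields(Simple)
```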

And I haven't tried to solve this problem in my use case, but I'm not sure it can be done -- when I get around to inspecting the type of the Field objects, I have no idea what namespace they are in -- so I can't reconstruct them from the string. I really need the type object itself.
@dataclass pretty much has the same problem with respect to calling typing.get_type_hints().

So I'll probably need to re-write much of the dataclasses decorator, to call eval() early -- though even then I'm not sure I'll have access to the proper namespace.
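
For the easy case only, here is a sketch of resolving the strings after the fact with typing.get_type_hints(). It assumes the class is defined at module level and that every name in its annotations can still be looked up from that module; classes defined inside functions, or annotations that refer to names that are no longer reachable, are exactly where this falls over. The resolved_field_types() helper is hypothetical, and the hand-written string annotations stand in for what PEP 563 would store.

```python
import typing
from dataclasses import dataclass, fields


@dataclass
class Simple:
    x: "int"      # stand-ins for PEP 563 stringified annotations
    y: "float"
    name: "str"


def resolved_field_types(cls):
    """Hypothetical helper: map field names to evaluated annotations.

    typing.get_type_hints() evaluates the strings in the namespace of the
    module where cls was defined, so it can raise NameError if a name is
    not available there.
    """
    hints = typing.get_type_hints(cls)
    return {f.name: hints[f.name] for f in fields(cls)}


raw = {f.name: f.type for f in fields(Simple)}
resolved = resolved_field_types(Simple)
print(raw)
print(resolved)
```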

Anyway -- one more data point:  PEP 563 changes the (semi-public?) API of dataclasses.

Though *maybe* that could be addressed with a dataclasses update -- again, I've only started to think about it -- there was some discussion of that in the BPO, though Eric didn't seem particularly interested.

I still think it's working as intended: it uses what's in __annotations__ (or passed in to field()). As everyone who has tried to call typing.get_type_hints() knows, it's hard to get right, if it's even possible, because, as you say "when I get around to inspecting the type of the Field objects, I have no idea what namespace they are in". My opinion is that the person who wants a real type object probably has a better idea of that namespace than @dataclass does, but there's a very good chance they don't know, either.

@dataclass goes out of its way to not call typing.get_type_hints(). The original reason for this is not wanting to force typing to be imported, if it wasn't already being used. That may have been addressed with PEP 560, but I've never really checked on the impact.

Another reason for not calling typing.get_type_hints(): there's really only one thing [*] dataclasses wants to know, with regard to the type/annotation of the field: is the type of this field typing.ClassVar? It doesn't seem that the performance issues and possible failures make it worth calling typing.get_type_hints() just for this case. @dataclass uses other tricks (not described here).
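
The observable effect can be sketched as follows, without going into how @dataclass does it. The Counter class is made up for illustration, and the example assumes it is defined at module level with typing imported under that name in the same module.

```python
import typing
from dataclasses import dataclass, fields


@dataclass
class Counter:
    count: "int" = 0
    # A string annotation, as PEP 563 would produce; it is never evaluated here.
    instances: "typing.ClassVar[int]" = 0


# The ClassVar is recognized and left out of the fields, even though the
# annotation is only a string.
names = [f.name for f in fields(Counter)]
print(names)

c = Counter(5)
print(c.count, Counter.instances)
```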

In any event, all of this mess is one reason I'd like to see PEP 649 get accepted: there would never be a reason to call typing.get_type_hints(), and the values in the Field object would again be type objects.

Back to my original point: If you ignore the test for ClassVar, then dataclasses completely ignores the values in __annotations__ or Field.type. It's no different from typing.NamedTuple in that regard.

I do have sympathy for users looking at Field.type and getting a string instead of a type object: but that's really no different from non-dataclass users looking at any occurrence of __annotations__ and now getting a string: that's a result of PEP 563 across the board, not just with dataclasses. As I said in bpo-39442:

Isn't that the entire point of "from __future__ import annotations"?

Eric

[*]: Actually two things: the other being "is the field of type dataclasses.InitVar?". It has some of the same problems as ClassVar, but we know that dataclasses has been imported, so it's less of a big deal.