On Apr 7, 2020, at 15:31, Christopher Barker <pythonchb@gmail.com> wrote:

On Tue, Apr 7, 2020 at 11:17 AM Wes Turner <wes.turner@gmail.com> wrote:
Would you generate a schema from the type annotations so that other languages can use the data?

I haven't done this yet, but it would be pretty cool.

This seems like one of the many things that’s impossible to do for Python classes with full generality, but pretty easy to do if you only want to support @dataclass. Either dynamically or statically, in fact. You could even write a tool that generates dataclasses (statically or dynamically) from a schema, if you wanted.

But what is meant by "type annotations"? I'm using them via dataclasses -- really as a shorthand for assigning a type to every field -- the annotations are just a shorthand that auto-generates a schema, essentially.

@dataclass
class MyClass:
    x: A_Type = A_default

So now I know that this class has a field named x that is of type A_Type. So I use that type for validation, and serialization / deserialization.

But if you mean "type annotations" in the sense of the types provided out of the box in the typing module and used by MyPy (so I've heard, never did it myself) -- no, they are not sufficient -- I need types that support my serialization / deserialization system, and my validation system. And I suppose we could have a standardized __json__ and __from_json__ protocol that I could use, but it seems a special case to me.

A type annotation is just the “: whatever”. It doesn’t matter whether that whatever is a real dynamic type like int or spam.Eggs or list, or a type from the typing module like List[int]; it’s still an annotation. And dataclass can handle either: change that to “x: List[A_Type]” and at runtime, the dataclass will treat that the same as if you just used list, but mypy can know that it’s only supposed to have A_Type members in that list. (So if you initialize MyClass([1, "spam", open("eggs.txt")]) it’ll work at runtime, but mypy will flag it as a type error.)
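
To make that concrete, here is a minimal sketch. It uses List[int] in place of the placeholder A_Type, and the names obj and x_annotation are just for illustration:

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class MyClass:
    # The annotation is recorded, but @dataclass never checks it at runtime.
    x: List[int]


# Mixed element types are accepted at runtime; a static checker such as
# mypy would report this call as a type error.
obj = MyClass([1, "spam", 3.5])

# The full annotation is still available to any tool that wants to use it.
x_annotation = fields(MyClass)[0].type
```

The point is only that the element type is sitting there in the field metadata whether or not anything chooses to look at it.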

An “automated JSON serialization for dataclasses” library could do either. It could be “dumb” like @dataclass and just treat x as a list of anything serializable, or it could be “smart” and treat it as a list of A_Type objects only. Either way seems like it makes sense.
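
A rough sketch of the two approaches, assuming a made-up A_Type with a single name field. The helper from_jsonable is hypothetical, not part of any existing library; the “dumb” direction just leans on dataclasses.asdict from the stdlib:

```python
import json
from dataclasses import asdict, dataclass, fields, is_dataclass
from typing import List, get_args, get_origin, get_type_hints


@dataclass
class A_Type:  # hypothetical element type
    name: str


@dataclass
class MyClass:
    x: List[A_Type]


def from_jsonable(tp, data):
    """'Smart' decoding: use the annotation to rebuild nested objects."""
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return [from_jsonable(item_type, item) for item in data]
    if is_dataclass(tp):
        hints = get_type_hints(tp)
        return tp(**{f.name: from_jsonable(hints[f.name], data[f.name])
                     for f in fields(tp)})
    return data  # plain JSON value: pass through unchanged


original = MyClass([A_Type("spam"), A_Type("eggs")])
text = json.dumps(asdict(original))

dumb = json.loads(text)                # x is a list of plain dicts
smart = from_jsonable(MyClass, dumb)   # x is a list of A_Type again
```

The dumb version round-trips to plain dicts and lists; the smart version needs the List[A_Type] annotation to know what to rebuild.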

For a schema generator for dataclasses, I think you’d want it to use the typing information. With “x: List[A_Type]”, the MyClass property x doesn’t just have a schema of {"type": "array"}; it has {"type": "array", "items": recursively_schematize(A_Type)}.
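
Something along these lines, as a sketch only. It assumes JSON Schema as the target (where the keyword for the element schema of an array is "items"), handles just a few primitives, list/List[...] and nested dataclasses, and uses a made-up A_Type:

```python
import dataclasses
from dataclasses import dataclass
from typing import List, get_args, get_origin, get_type_hints

_PRIMITIVES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def recursively_schematize(tp):
    """Build a JSON-Schema-style dict from a type annotation."""
    if tp in _PRIMITIVES:
        return {"type": _PRIMITIVES[tp]}
    if tp is list:
        # Bare list: no element type to describe.
        return {"type": "array"}
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        return {"type": "array", "items": recursively_schematize(item_type)}
    if dataclasses.is_dataclass(tp):
        hints = get_type_hints(tp)
        flds = dataclasses.fields(tp)
        return {
            "type": "object",
            "properties": {f.name: recursively_schematize(hints[f.name])
                           for f in flds},
            # Fields without a default must be present.
            "required": [f.name for f in flds
                         if f.default is dataclasses.MISSING
                         and f.default_factory is dataclasses.MISSING],
        }
    raise TypeError(f"no schema mapping for {tp!r}")


@dataclass
class A_Type:  # hypothetical element type
    name: str
    count: int = 0


@dataclass
class MyClass:
    x: List[A_Type]


schema = recursively_schematize(MyClass)
```

Anything outside those few cases (Optional, Dict, Union, custom classes) raises TypeError here; a real tool would need a way to register more mappings.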