Making class attributes first class citizens

Hello. I've been working on an ORM, and I've realized the typing tools we have aren't really well suited to this use case. Let's see if we can start improving this.

For example, it's a challenge to find an ORM that can do statically checked query expressions, or statically checked projections (when you want to load only a subset of your table fields, as a tuple, say). I think SQLModel comes closest at this time, with some undocumented magic.

I have a small proposal to start, and it's about class attributes. Essentially every library in this space has you model your table/collection as a class, kind of like this:

```
class MyModel:
    my_int_field: int
```

Sometimes they make you inherit from something, sometimes there's a metaclass involved, sometimes decorators, sometimes magic descriptors. At the *core* of each, though, there are three entities:

* the class itself (`MyModel`)
* the instance attribute `my_int_field`
* the class attribute `my_int_field`

The class attribute is the one we need to improve/standardize, since it's the one with the weakest typing support. In practice there are three ways of getting at the class attribute:

1) getting it directly from the class object - `MyModel.my_int_field`. This is what the SQLAlchemy ORM and SQLModel do
2) getting it from a magic namedtuple attribute on the class - `MyModel.c.my_int_field`. This is what SQLAlchemy Core does, and technically attrs and dataclasses too (the field is ugly and hidden)
3) getting it from a function that returns a magic namedtuple - `fields(MyModel).my_int_field`. This is what attrs and dataclasses expose (although for dataclasses it's not a namedtuple yet)

I personally think 1) is bad for composability (two libraries can't use the model at the same time, since they can't both own `MyModel.my_int_field`, and the class can't be slotted without heavy magic), but it's very popular in the wild.

My proposal is this: we figure out a way for libraries to export their own generic attribute type, and we have type checkers automatically parametrize it with the actual type information from the class. This could be an extension of the dataclass_transform PEP: that PEP establishes generic support for declarative classes, and we would build on it with a system for custom class attributes.

Here is what a potential direction for this might look like. I'm going to use the attrs API, since its existing functionality is very close to this.

`attrs.fields(MyModel)` already returns a namedtuple of `(attr.Attribute[int],)`. We add an optional parameter to `fields`, `AttributeCls`. SQLAlchemy implements its own attribute type, `SQLAlchemyAttribute[T]`, and `attrs.fields(MyModel, AttributeCls=SQLAlchemyAttribute)` now returns a namedtuple of `(sqlalchemy.SQLAlchemyAttribute[int],)`. SQLAlchemy defines its own `fields` function for ergonomics: `c = partial(fields, AttributeCls=SQLAlchemyAttribute)`.

Since SQLAlchemy has its own attribute type now, that attribute type is free to override `__eq__` so users can write queries with it: `c(MyModel).my_int == 5`. And since the attribute is generic, this can be statically checked.

My library can define its own attribute type, `TinAttribute[T]`. Since my library targets a different database, I can support a different API: `f(MyModel).my_int.gt(5).lt(10)`.
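To make this a bit more concrete, here is a rough, runnable sketch of what the runtime side of such a library-specific attribute type could look like. Everything here is made up for illustration (`SQLAlchemyAttribute`, `BinaryExpression`, and the `fields` stand-in); the interesting part - a type checker inferring `c(MyModel).my_int` as `SQLAlchemyAttribute[int]` - is exactly the piece that doesn't exist yet and would need to be specified:

```python
from types import SimpleNamespace
from typing import Generic, TypeVar

T = TypeVar("T")


class BinaryExpression:
    """Toy query-expression node, purely for illustration."""

    def __init__(self, column: str, op: str, value: object) -> None:
        self.column, self.op, self.value = column, op, value


class SQLAlchemyAttribute(Generic[T]):
    """Hypothetical library-specific attribute type."""

    def __init__(self, name: str) -> None:
        self.name = name

    # Overriding __eq__ means `c(MyModel).my_int == 5` builds a query
    # expression instead of returning a bool.
    def __eq__(self, other: object) -> BinaryExpression:  # type: ignore[override]
        return BinaryExpression(self.name, "=", other)


class MyModel:
    my_int: int


def fields(model: type, *, AttributeCls: type = SQLAlchemyAttribute) -> SimpleNamespace:
    """Stand-in for the proposed `attrs.fields(..., AttributeCls=...)`.

    At runtime it just wraps each annotated field in the given attribute
    class; the proposal is that a type checker would additionally infer
    `fields(MyModel).my_int` as `AttributeCls[int]`.
    """
    return SimpleNamespace(
        **{name: AttributeCls(name) for name in model.__annotations__}
    )


c = fields  # SQLAlchemy's ergonomic alias from the text above

expr = c(MyModel).my_int == 5  # BinaryExpression("my_int", "=", 5)
```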
An extension to the protocol could be made to support special decorators, like the dataclass_transform PEP does. A decorator could exist that signals to the type checker that these attributes are injected either into the class namespace itself, or into a magic class attribute namedtuple like `c`. A SQLAlchemy Core version could be:

```
sqlalchemy_decorator = partial(fields_decorator, attribute_cls=SQLAlchemyAttribute, magic_attribute=c)

@sqlalchemy_decorator
class MyModel:
    my_int: int
```

which would tell the type checker that `MyModel.c.my_int` is an instance of `SQLAlchemyAttribute[int]`. If you don't set the `magic_attribute` value, it would tell the type checker that the attributes get injected into the class object itself, so you'd get the SQLAlchemy ORM and SQLModel behavior.

I don't think this solves everything (libraries might still need descriptors for other purposes, like dirtiness tracking), but I think it may be a good start.

Since I mentioned statically typed projections: this enables those too, although with a lot of overloads (see the sketch below). My query function can be `async def query(MyModel, projection: tuple[TinAttribute[T]]) -> tuple[T]`.

I think a system like this would be a good start to improving type safety in these kinds of libraries.
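Here is the projection sketch, with the overload-per-arity approach spelled out. All the names are hypothetical (`TinAttribute`, `query`, and the attribute instances you'd get from `f(MyModel)`); the point is that each projection length needs its own overload, since there is currently no way to "map" an attribute wrapper over a variadic tuple of types:

```python
from typing import Generic, TypeVar, overload

T = TypeVar("T")
T1 = TypeVar("T1")
T2 = TypeVar("T2")


class TinAttribute(Generic[T]):
    """Hypothetical generic attribute type for my library."""

    def __init__(self, name: str) -> None:
        self.name = name


# One overload per projection arity; a single variadic signature would need
# something like a "map" over a TypeVarTuple, which doesn't exist today.
@overload
async def query(
    model: type, projection: tuple[TinAttribute[T1]]
) -> list[tuple[T1]]: ...
@overload
async def query(
    model: type, projection: tuple[TinAttribute[T1], TinAttribute[T2]]
) -> list[tuple[T1, T2]]: ...
async def query(model: type, projection: tuple[TinAttribute, ...]) -> list[tuple]:
    # Build a SELECT over just the projected columns and fetch the rows;
    # elided here, since only the signatures matter for the typing story.
    return []


# Given f(MyModel).my_int: TinAttribute[int] and f(MyModel).my_str: TinAttribute[str],
# `await query(MyModel, (f(MyModel).my_int, f(MyModel).my_str))`
# would be inferred as list[tuple[int, str]].
```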

Since you don't mention it, perhaps you haven't seen PEP 681 yet?
--
--Guido van Rossum (python.org/~guido)

On Tue, Apr 12, 2022, at 1:16 PM, Guido van Rossum wrote:
Since you don't mention it, perhaps you haven't seen PEP 681 yet?
I think it refers to the "dataclass_transform" PEP, which would be PEP 681.

From my point of view as the SQLAlchemy maintainer, I'm still lacking a means of producing arbitrary tuples derived from other tuples, which I had hoped PEP 646 could accommodate. Per Eric Traut at https://github.com/python/typing/discussions/1001#discussioncomment-1897813, there would need to be a means of providing a "map" operation over the types of a TypeVarTuple to work with such tuples generically.

I'm not yet following Tin's proposal and how it gets me from an ORMMappedTuple[Mapped[int], Mapped[str]] to being able to extract a ResultRowTuple[int, str], but it seems like PEP 646 would need to be involved somehow.

That said, I might be derailing a bit here, so I'll just leave it at that for now :)
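To make the gap concrete, here is roughly the shape of the thing that can't be spelled today (the names `Mapped` and `execute` are illustrative stand-ins, not the real SQLAlchemy API). Without a "map" over a TypeVarTuple, the unwrapping from Mapped[X] to X has to be written out one arity at a time:

```python
from typing import Generic, TypeVar, overload

T = TypeVar("T")
T1 = TypeVar("T1")
T2 = TypeVar("T2")


class Mapped(Generic[T]):
    """Illustrative stand-in for an ORM-mapped column attribute."""


# What I'd want is a single variadic signature along the lines of
# "def execute(sel: tuple[Map[Mapped, *Ts]]) -> list[tuple[*Ts]]", i.e. a
# ResultRowTuple[int, str] derived from an ORMMappedTuple[Mapped[int], Mapped[str]].
# Lacking that, each arity gets its own overload:
@overload
def execute(selection: tuple[Mapped[T1]]) -> list[tuple[T1]]: ...
@overload
def execute(selection: tuple[Mapped[T1], Mapped[T2]]) -> list[tuple[T1, T2]]: ...
def execute(selection: tuple[Mapped, ...]) -> list[tuple]:
    raise NotImplementedError  # only the signatures matter here
```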
On Tue, Apr 12, 2022 at 5:04 AM Tin Tvrtković <tinchester@gmail.com> wrote:
Hello.
I've been working on an ORM and I've realized the typing tools we have aren't really well suited to this use case. Let's see if we can start improving this.
For example, it's a challenge to find an ORM that can do statically checked query expressions, or statically checked projections (when you want to load only a subset of your table fields, as a tuple, say). I think SQLModel comes closest at this time with some undocumented magic.
I have a small proposal to start, and it's about class attributes. Essentially every library in this space has you model your table/collection as a class, kind of like this:
``` class MyModel: my_int_field: int ``` Sometimes they make you inherit from something, sometimes there's a metaclass involved, sometimes decorators, sometimes there are magic descriptors. At the *core* of each, though, there are 3 entities:
* the class itself (MyModel) * the instance attribute `my_int_field` * the class attribute `my_int_field`
The class attribute is the one we need to improve/standardize, since it's the one with the weakest typing support. In practice there are 3 ways of getting to the class attribute: 1) getting it directly from the class object - `MyModel.my_int_field`. This is what the SQLAlchemy ORM and SQLModel do 2) getting it from a magic namedtuple attribute on the class - `MyModel.c.my_int_field`. This is what SQLAlchemy Core does, and technically attrs and dataclass (the field is ugly and hidden) 3) getting it from a function that returns a magic namedtuple - fields(MyModel).my_int_field. This is what attrs and dataclasses expose (although for dataclass it's not a namedtuple yet)
I personally think 1) is bad for composability (two libraries can't use the model at the same time since they can't both own `MyModel.my_int_field`, the class can't be slotted without heavy magic) but it's very popular in the wild.
My proposal is this: we figure out a way for libraries to export their own generic attribute type, and we have type checkers automatically parametrize it with the actual type information from the class. This could be kind of an extension to the dataclass_transform PEP. That PEP establishes generic support for declarative classes, we now just build on it with a system for custom class attributes.
Here is what a potential direction for this might look like. I'm going to use the attrs way since its existing functionality is very close to this.
`attrs.fields(MyModel)` already returns a namedtuple of (attr.Attribute[int],). We add an optional attribute to `fields`, `AttributeCls`. SQLAlchemy implements its own attribute type, `SQLAlchemyAttribute[T]`. `attrs.fields(MyModel, AttributeCls=SQLAlchemyAttribute)` now returns a namedtuple of (sqlalchemy.SQLAlchemyAttribute[int],). SQLAlchemy defines its own `fields` function for ergonomics: `c = partial(fields, AttributeCls=SQLAlchemyAttribute)`.
Since SQLAlchemy has its own attribute type now, that attribute type is free to override `__eq__` so users can do queries with it: `c(MyModel).my_int == 5`, and since the attribute is generic, this can be statically checked.
My library can define its own attribute type, `TinAttribute[T]`. Since my library targets a different database, I support a different API: `f(MyModel).my_int.gt(5).lt(10)`.
An extension to the protocol could be made so we support special decorators, like the dataclass_transform PEP. So a decorator could exist that would signal to the type checker that these attributes are injected either into the class namespace itself, or a magic class attribute namedtuple like `c`. So a SQLAlchemy Core version could be:
``` sqlalchemy_decorator = partial(fields_decorator, attribute_cls=SQLAlchemyAttribute, magic_attribute=c)
@sqlalchemy_decorator class MyModel: my_int: int ``` which would tell the type checker that `MyModel.c.my_int` is an instance of `SQLAlchemyAttribute[int]`. If you don't set the `magic_attribute` value, it would tell the typechecker the attributes get injected into the class object instead, so you'd get the SQLAlchemy ORM and SQLModel behavior.
I don't think this solves everything (libraries might still need descriptors for other purposes, like maybe dirtiness tracking), but I think it may be a good start.
Since I mentioned statically typed projections, this allows those too (although with a lot of overloads). My query function can be `async def query(MyModel, projection: tuple[TinAttribute[T]]) -> tuple[T]`.
I think a system like this would be a good start to improving type safety in these kinds of libraries. _______________________________________________ Typing-sig mailing list -- typing-sig@python.org To unsubscribe send an email to typing-sig-leave@python.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: guido@python.org
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...> _______________________________________________ Typing-sig mailing list -- typing-sig@python.org To unsubscribe send an email to typing-sig-leave@python.org https://mail.python.org/mailman3/lists/typing-sig.python.org/ Member address: mike_mp@zzzcomputing.com

On Tue, Apr 12, 2022 at 8:40 PM Mike Bayer <mike_mp@zzzcomputing.com> wrote:
From my point of view as the SQLAlchemy maintainer, I'm still lacking a means of producing arbitrary tuples derived from other tuples, which I had hoped PEP 646 could accommodate. Per Eric Traut at https://github.com/python/typing/discussions/1001#discussioncomment-1897813, there would need to be a means of providing a "map" operation over the types of a TypeVarTuple to work with such tuples generically.
I'm not yet following Tin's proposal and how it gets me from an ORMMappedTuple[Mapped[int], Mapped[str]] to being able to extract a ResultRowTuple[int, str], but it seems like PEP 646 would need to be involved somehow.
This proposal provides a generic way for you and me to get a Mapped[int], with the `int` part being correct and slot classes being possible. The thing you're asking about is a separate issue that should also be addressed. I'm just brainstorming this first.
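As a tiny illustration (with the hypothetical `AttributeCls` parameter from my first message living only in the comments, since it isn't real yet), the model itself stays an ordinary slotted attrs class, and no library has to own a class-level column object:

```python
import attrs


@attrs.define  # slotted by default; MyModel.my_int isn't owned by any library
class MyModel:
    my_int: int


# Each library brings its own generic attribute type and its own ergonomic
# alias, without touching the class. (AttributeCls is the proposed, not yet
# real, parameter; Mapped and TinAttribute are the attribute types sketched
# earlier in the thread.)
#
#   from functools import partial
#   c = partial(attrs.fields, AttributeCls=Mapped)         # SQLAlchemy
#   f = partial(attrs.fields, AttributeCls=TinAttribute)   # my library
#
#   c(MyModel).my_int  # Mapped[int]
#   f(MyModel).my_int  # TinAttribute[int]
```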

My apologies, you did mention PEP 681, as "the dataclass_transform PEP". Since I'm not very familiar with ORMs and such, I'm not sure how your proposal relates to that PEP; perhaps you're seeking an extension of it? Or do you merely find it a good model for API design for your use case?

I think this could be a separate PEP that builds on that one, and I also think it's a good model; this functionality naturally builds on the concepts in PEP 681.
On Tue, Apr 12, 2022 at 9:42 PM Guido van Rossum <guido@python.org> wrote:
My apologies, you did mention PEP 681, as "the dataclass_transform PEP". Since I'm not very familiar with ORMs and such, I'm not sure how your proposal relates to that PEP; perhaps you're seeking an extension of it? Or do you merely find it a good model for API design for your use case?
participants (3)
- Guido van Rossum
- Mike Bayer
- Tin Tvrtković