> user_projection: tuple[str] = await fetch_projection(User, id=1, fields(User)[1])
> This can't really be very type-safe since Mypy treats `fields(User)[1]` as `dataclasses.Field*[Any]`. Now, if Mypy treated it as `dataclasses.Field*[str]`, I assume that would be a different story, and the function could be annotated to return a 1-tuple of `str`.

Type checkers would have to special-case `dataclasses.fields(Foo)` to return a tuple of Fields from the specific dataclass. So, `fields(User)` would have type `Tuple[Field[int], Field[str]]`. In your example, `fields(User)[1]` would have type `Field[str]`.

This approach would work when projecting a single field or a fixed number of fields.
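
As a rough sketch of the fixed-arity case: the runtime `Field` objects already record each declared type, and a projection helper could be given one overload per number of fields. The `project` helper below is hypothetical (it is not part of `dataclasses` or any ORM), and today's checkers would still infer `Field[Any]` for its arguments, so the overloads only pay off once `fields(User)` is special-cased as described above.

```python
from dataclasses import Field, dataclass, fields
from typing import Any, Tuple, TypeVar, overload

T1 = TypeVar("T1")
T2 = TypeVar("T2")


@dataclass
class User:
    id: int
    username: str


# Quoted annotations: `Field[...]` only matters to the static checker here.
@overload
def project(obj: Any, f1: "Field[T1]") -> Tuple[T1]: ...
@overload
def project(obj: Any, f1: "Field[T1]", f2: "Field[T2]") -> Tuple[T1, T2]: ...
def project(obj: Any, *selected: "Field[Any]") -> Tuple[Any, ...]:
    """Hypothetical helper: read the selected fields off an instance."""
    return tuple(getattr(obj, f.name) for f in selected)


id_field, username_field = fields(User)
user = User(id=1, username="ed")

# At runtime each Field records its declared type; a checker would need to
# expose the same information statically as Field[int] and Field[str].
declared_types = (id_field.type, username_field.type)
username_only = project(user, username_field)
both = project(user, id_field, username_field)
```

One overload is needed per supported number of fields, which is why this only scales to a small, fixed arity.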

# Variadic tuple + Map

To support projecting *arbitrary* numbers of fields, we can use variadic tuples (PEP 646 [1]) and the `Map` operator (to be introduced in a follow-up PEP).

As a concrete example, SQLAlchemy's `Session.query` accepts arbitrary columns (or classes) and returns a `Query` object. A `Query` object is basically an iterator of tuples.

Example of `query` from the SQLAlchemy docs [2]:

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, Sequence('user_id_seq'), primary_key=True)
    name = Column(String(50))
    fullname = Column(String(50))
    nickname = Column(String(50))

>>> for name, fullname in session.query(User.name, User.fullname):
...     print(name, fullname)
ed Ed Jones
wendy Wendy Williams
mary Mary Contrary
fred Fred Flintstone

We could type the `query` function as follows:

# Generic alias to capture the type of a class or a column (i.e., a field).
T = TypeVar("T")
ClassOrColumn = Type[T] | Column[T]

Ts = TypeVarTuple("Ts")

class Session:
    def query(self, *args: *Map[ClassOrColumn, Ts]) -> Query[Tuple[*Ts]]: ...

# (1)
User.name
# => Column[str]
# actually Column[Optional[str]], but keeping it simple here

User.fullname
# => Column[str]

session.query(User.name, User.fullname)
# => Query[Tuple[str, str]]

# For the above function call, the `query` function behaves as if it were the following:
def query(self, entity1: ClassOrColumn[T1], entity2: ClassOrColumn[T2]) -> Query[Tuple[T1, T2]]: ...

Step-by-step explanation:

+ Because `query` is given two arguments, `Ts` is seen as a tuple of two TypeVars: `Tuple[T1, T2]`.
+ The `Map[ClassOrColumn, Ts]` maps `ClassOrColumn` over each element of `Tuple[T1, T2]` to give `Tuple[ClassOrColumn[T1], ClassOrColumn[T2]]`.
+ Finally, using `*args: *<some_tuple>` means that it will accept arguments corresponding to the tuple.
+ So, `*args: *Tuple[ClassOrColumn[T1], ClassOrColumn[T2]]` means it will accept two arguments, one of type `ClassOrColumn[T1]` and another of `ClassOrColumn[T2]`.
+ That binds `T1` to `str` and `T2` to `str`, giving a return type of `Query[Tuple[str, str]]`.

Other examples follow similarly:

# (2)
session.query(User, User.name, User.fullname)
# => Query[Tuple[User, str, str]]
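
Until `Map` exists, the expansion described in the steps above can be written out by hand, one overload per arity. The following is a minimal sketch under loud assumptions: `Column`, `Query`, and `Session` here are hypothetical in-memory stand-ins (not SQLAlchemy's classes), and only columns are handled, not classes.

```python
from typing import Any, Dict, Generic, Iterator, List, Tuple, TypeVar, overload

T = TypeVar("T")
T1 = TypeVar("T1")
T2 = TypeVar("T2")
T3 = TypeVar("T3")


class Column(Generic[T]):
    """Hypothetical typed column; a stand-in, not SQLAlchemy's Column."""

    def __init__(self, name: str) -> None:
        self.name = name


class Query(Generic[T]):
    """Hypothetical query result: an iterable of rows of type T."""

    def __init__(self, rows: List[T]) -> None:
        self._rows = rows

    def __iter__(self) -> Iterator[T]:
        return iter(self._rows)


class Session:
    """Hypothetical in-memory session backed by a list of dict records."""

    def __init__(self, records: List[Dict[str, Any]]) -> None:
        self._records = records

    # One overload per arity: hand-written versions of the signatures that
    # the `Map`-based annotation would stand for.
    @overload
    def query(self, c1: Column[T1], c2: Column[T2]) -> Query[Tuple[T1, T2]]: ...
    @overload
    def query(
        self, c1: Column[T1], c2: Column[T2], c3: Column[T3]
    ) -> Query[Tuple[T1, T2, T3]]: ...
    def query(self, *columns: Column[Any]) -> Query[Any]:
        rows = [tuple(rec[c.name] for c in columns) for rec in self._records]
        return Query(rows)


user_id: Column[int] = Column("id")
name: Column[str] = Column("name")
fullname: Column[str] = Column("fullname")
session = Session([
    {"id": 1, "name": "ed", "fullname": "Ed Jones"},
    {"id": 2, "name": "wendy", "fullname": "Wendy Williams"},
])

# A checker resolves this call to Query[Tuple[str, str]] via the first overload.
pairs = list(session.query(name, fullname))
# ... and this one to Query[Tuple[int, str, str]] via the second.
triples = list(session.query(user_id, name, fullname))
```

The overload list has to stop at some arity, which is the limitation the variadic `Ts` plus `Map` signature is meant to remove.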

Note that `query` omits the tuple when there is just one argument. So, we'd need an `overload` for that case.

We can adapt the above approach to the ORM/ODM projection functions you had in mind. But, first, both PEP 646 and the yet-to-be-submitted `Map` PEP have to be accepted :)

If you're interested in these developments, you could attend the monthly "Tensor typing" meetings announced on this list or read the meeting minutes [3].

[1]: https://www.python.org/dev/peps/pep-0646/

[2]: https://docs.sqlalchemy.org/en/14/orm/tutorial.html

[3]: https://mail.python.org/archives/list/typing-sig@python.org/message/52SAAWTGHLV4AJYAHTG5BAWUOUT3EDBK/

On Fri, Apr 23, 2021 at 10:18 AM Tin Tvrtković <tinchester@gmail.com> wrote:
Dear typing-sig,

I've noticed recently that there is practically no support for type-safe ORM/ODM projections in the broader Python ecosystem.

A little context: ORMs (object-relational mappers) and ODMs (object-document mappers) are tools or libraries for interacting with databases. SQLAlchemy and the Django ORM are probably the most famous examples. Basically, you model a database table as a class, and rows in the table are instances of that class. ODMs are a little simpler than ORMs: they practically only map rows/documents to instances of a class, whereas ORMs are more complex.

Writing a type-safe ODM nowadays is super simple, and there are loads out there. The idea is this:

class User:
    id: int
    username: str

user = await fetch(User, id=1)

This can be made to work with the proper type annotations without much fuss.

Now, the issue is doing projections in a type-safe manner. A projection basically means loading only a subset of the fields, usually for performance. Let's say that we're only interested in the user's username (imagine there are 30 other fields in the class that we don't care about). That would look kinda like:

from dataclasses import fields

user_projection: tuple[str] = await fetch_projection(User, id=1, fields(User)[1])

This can't really be very type-safe since Mypy treats `fields(User)[1]` as `dataclasses.Field*[Any]`. Now, if Mypy treated it as `dataclasses.Field*[str]`, I assume that would be a different story, and the function could be annotated to return a 1-tuple of `str`.

As mentioned, I've looked at a bunch of ORM/ODM libraries (and written a few internally) and as far as I can tell this use-case is very much unsupported as of yet. (I would appreciate counter-examples, obviously!) This made me sad so I came to the list to see if there's anything to be done.

To be super honest I'm more interested in getting attrs support for this, but attrs and dataclasses are so similar I figured if someone did the work in the dataclass plugin, the logic could also be ported over to the attrs plugin. attrs also has a nicer API for actually getting the fields, so the equivalent attrs example would be:

from attrs import fields as f

user_projection = await fetch_projection(User, id=1, f(User).username)

Wouldn't it be super cool if Mypy (or other type checkers) could check this statically?

S Pradeep Kumar