Magic attribute for attribute names retrieving

A lot of libraries use strings for attribute names to do some "dynamic" things. A typical example is SQLAlchemy [validators](https://docs.sqlalchemy.org/en/13/orm/mapped_attributes.html#simple-validato...):

```python
from sqlalchemy.orm import validates

class EmailAddress(Base):
    __tablename__ = 'address'

    id = Column(Integer, primary_key=True)
    email = Column(String)

    @validates('email')  # Here
    def validate_email(self, key, address):
        assert '@' in address
        return address
```

In this example from the SQLAlchemy documentation, the email validator uses the `"email"` string in order to associate the validator with the `email` column.

However, these dynamic things don't play well with static tools: linters, but especially IDEs (for finding usages, navigation, refactoring, etc.), and of course type checkers.

This issue could be solved with a "magic attribute" `__attrs__` (name can be discussed), used the following way:

```python
@dataclass
class Foo:
    bar: int

foo = Foo(0)
assert getattr(foo, Foo.__attrs__.bar) == 0
assert getattr(foo, foo.__attrs__.bar) == 0
```

To make it usable in a class declaration, the `__attrs__` symbol should be added to the class declaration namespace:

```python
class Foo:
    bar: int

    @validator(__attrs__.bar)
    def validate(self):
        ...
```

No check would be done by `__attrs__`; checks are left to linters, which would integrate this language feature and check for the presence of the attribute in the class/instance.

And that becomes more interesting because type checkers could use this feature to check dynamic attribute retrieval. A special type `Attribute[Owner, T]` could be defined for type checkers, such that the `getattr` signature becomes:

```
# default parameter omitted for concision
def getattr(obj: Owner, attr: Attribute[Owner, T], /) -> T:
    ...
```

(Of course, `getattr` can still be used with strings; the relation between `Attribute` and `str` is explained later.)

It would allow adding typing to functions like the following:

```python
Key = TypeVar("Key", bound=Hashable)

def dict_by_attr(elts: Collection[T], key_attr: Attribute[T, Key]) -> Mapping[Key, T]:
    return {getattr(elt, key_attr): elt for elt in elts}
```

The implementation of this feature would be very straightforward, `__attrs__` being defined as an instance of a class:

```python
class Attrs:
    __slots__ = []

    def __getattribute__(self, name):
        return name
```

Thus, `Foo.__attrs__.bar` would simply be equal to `"bar"`; `Attribute` would be a special type, but backed by a `str`. Hence there is no need to modify the `getattr` implementation or existing code using it. The `Attribute` type should then be a kind of `_SpecialForm`, compatible with strings by the "relation" `str` <=> `Attribute[Any, Any]`.

The only modifications to the language would be to add the `Attrs` class and an `__attrs__` field to `type` and to the class definition namespace when it is evaluated. The rest of the work (checking the attribute presence, type checking with `Attribute[Owner, T]`) should be done by external tools (PyCharm, Mypy, etc.) to handle the feature.

To sum up, `__attrs__` would be a pseudo-static wrapper for attribute name retrieval in order to benefit from static tools (refactoring, finding usages, navigation, type checking), with a straightforward and backward-compatible implementation.

And I don't see that as a "niche" feature, because a lot of libraries could actually benefit from it (and SQLAlchemy, Pydantic, etc. are not small libraries).

Joseph
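The runtime side of the proposal can be tried today with a small emulation. This is only a sketch under assumptions: `type` has no `__attrs__` field, so a hypothetical `WithAttrs` base class carries it instead, and since `Attribute[Owner, T]` does not exist, `key_attr` is annotated as a plain `str`.

```python
from dataclasses import dataclass
from typing import Collection, Hashable, Mapping, TypeVar


class Attrs:
    """Returns the name of any attribute looked up on it, as in the proposal."""

    __slots__ = ()

    def __getattribute__(self, name):
        return name


class WithAttrs:
    """Hypothetical stand-in for the proposed `__attrs__` field on `type`."""

    __attrs__ = Attrs()


@dataclass
class Foo(WithAttrs):
    bar: int


T = TypeVar("T")


def dict_by_attr(elts: Collection[T], key_attr: str) -> Mapping[Hashable, T]:
    # `key_attr` would be `Attribute[T, Key]` under the proposal; here it is a `str`.
    return {getattr(elt, key_attr): elt for elt in elts}


foo = Foo(0)
assert Foo.__attrs__.bar == "bar"
assert getattr(foo, Foo.__attrs__.bar) == 0
assert getattr(foo, foo.__attrs__.bar) == 0
assert dict_by_attr([Foo(1), Foo(2)], Foo.__attrs__.bar) == {1: Foo(1), 2: Foo(2)}
```

The emulation does not cover the bare `__attrs__` name inside a class body, which would need interpreter support.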

This sounds a lot like this suggestion to add a nameof function/operator: https://mail.python.org/archives/list/python-ideas@python.org/thread/UUFFAI3...

Alex Hall wrote:
> This sounds a lot like this suggestion to add a nameof function/operator: https://mail.python.org/archives/list/python-ideas@python.org/thread/UUFFAI3...

Indeed, it sounds like the `nameof` operator; I had not heard of this suggestion before your message. However, there are at least two important differences between the two suggestions which may matter:

- `nameof` should be implemented as a new operator with a dedicated implementation using the AST and raising `SyntaxError`; `__attrs__` is simpler and dynamic (pythonic?), just a `type` object attribute, and cannot fail. (I don't mention an implementation using the CPython-specific `current_frame`, as it is unavailable in PyPy, IronPython, etc.)
- `nameof` is only about variable/attribute names; `__attrs__` has a dual use: attribute name and `getattr` typing (I don't see how the second could be considered with `nameof`).

Joseph

On Thu, Sep 17, 2020 at 08:34:26PM -0000, Joseph Perez wrote:
> A lot of libraries use strings for attribute names to do some "dynamic" things. A typical example is SQLAlchemy [validators](https://docs.sqlalchemy.org/en/13/orm/mapped_attributes.html#simple-validato...):
>
> ```python
> from sqlalchemy.orm import validates
>
> class EmailAddress(Base):
>     __tablename__ = 'address'
> ```

Dunder names are reserved for use by the Python interpreter. Your code here is already on very shaky ground.

> ```python
>     id = Column(Integer, primary_key=True)
>     email = Column(String)
>
>     @validates('email')  # Here
>     def validate_email(self, key, address):
>         assert '@' in address
>         return address
> ```
>
> In this example from the SQLAlchemy documentation, the email validator uses the `"email"` string in order to associate the validator with the `email` column.
> However, these dynamic things don't play well with static tools: linters, but especially IDEs (for finding usages, navigation, refactoring, etc.), and of course type checkers.

This is the trouble with writing "clever" code with large amounts of implicit state and/or dynamicism: it makes things difficult for static tools and the human reader.

> This issue could be solved with a "magic attribute" `__attrs__` (name can be discussed), used the following way:

I don't see how this would solve the issue. The problem is that your code is too dynamic for static tools, so you are proposing to make it **even more dynamic**.
> ```python
> @dataclass
> class Foo:
>     bar: int
>
> foo = Foo(0)
> assert getattr(foo, Foo.__attrs__.bar) == 0
> assert getattr(foo, foo.__attrs__.bar) == 0
> ```

Presumably these would work too:

    assert getattr(foo, None.__attrs__.bar) == 0
    assert getattr(foo, foo.__attrs__.baz, 999) == 999

Your proposal is to have a magic dunder, spelled `__attrs__` (note the two additional dots) which converts an attribute access into a string:

    spam.__attrs__.eggs => 'eggs'

so you can pass the 'eggs' string to getattr:

    # Instead of this:
    getattr(spam, 'eggs')

    # this is more magical:
    getattr(spam, spam.__attrs__.eggs)

Apart from typing 15 chars to avoid 2 quotation marks, and making a runtime attribute lookup plus method call to avoid a literal, how will this help reduce dynamicism?

If linters etc. have difficulty dealing with getattr() using a string literal, how will they do better with even more complex, even more dynamic, code?
> To make it usable in a class declaration, the `__attrs__` symbol should be added to the class declaration namespace:
>
> ```python
> class Foo:
>     bar: int
>
>     @validator(__attrs__.bar)
>     def validate(self):
>         ...
> ```

So here `__attrs__` is not just a dunder attribute, but also a dunder built-in name.

> No check would be done by `__attrs__`; checks are left to linters, which would integrate this language feature and check for the presence of the attribute in the class/instance.

So linters that cannot cope with `@validate('email')` will be able to cope with `@validate(__attrs__.email)`, because ... why?

I don't think that static tools have trouble with dynamic code because they cannot parse a string literal like `'email'`. Replacing that literal with more dynamic code like `__attrs__.email` is not going to solve the problem.

[...]
> To sum up, `__attrs__` would be a pseudo-static wrapper for attribute name retrieval in order to benefit from static tools (refactoring, finding usages, navigation, type checking), with a straightforward and backward-compatible implementation.

You have made a lot of claims that this will help static tools. I do not believe it will help. I do not expect that static tools that cannot analyse dynamic code when given a string literal in a dynamic context will magically start working when you replace that string literal with an extra level of dynamic code.

To me, this just adds one more step to any static tool: any time the tool, or the human reader, sees

    __attrs__.spam

mentally erase the "__attrs__." and quote the remainder:

    'spam'

and then proceed as normal from that point. Likewise for attributes:

    foo.__attrs__.spam

If `foo` exists, erase "foo.__attrs__." and quote the remaining part, then proceed as normal. Otherwise raise NameError.

Essentially this `__attrs__` is a verbose way of spelling string literals, and can be removed with a pretty simple text substitution.

-- Steve
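As an illustration of the text substitution described above, here is a rough sketch. The names `ATTRS_PATTERN` and `erase_attrs` are made up for this example, and the regex is deliberately naive: it only handles simple dotted owners, it would also rewrite matches inside comments or string literals, and it does not reproduce the `NameError` case for a missing owner.

```python
import re

# Matches an optional dotted owner (e.g. `foo.` or `a.b.`) followed by
# `__attrs__.<name>`. Calls or subscripts before `__attrs__` are not handled.
ATTRS_PATTERN = re.compile(r"(?:[A-Za-z_][\w.]*\.)?__attrs__\.([A-Za-z_]\w*)")


def erase_attrs(source: str) -> str:
    """Replace each `owner.__attrs__.name` (or bare `__attrs__.name`) with `'name'`."""
    return ATTRS_PATTERN.sub(lambda match: repr(match.group(1)), source)


assert erase_attrs("getattr(spam, spam.__attrs__.eggs)") == "getattr(spam, 'eggs')"
assert erase_attrs("@validator(__attrs__.bar)") == "@validator('bar')"
```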

On Fri, Sep 18, 2020 at 10:05:07AM +1000, Steven D'Aprano wrote:
> Your proposal is to have a magic dunder, spelled `__attrs__` (note the two additional dots) which converts an attribute access into a string:

Oops, the comment about the dots was from an earlier thought which I deleted. Sorry.

-- Steve

> Dunder names are reserved for use by the Python interpreter.

That's convenient, because my suggestion is to add a magic attribute to the Python specification ;) It would be a `type` attribute, like `__mro__` or others.

> So here `__attrs__` is not just a dunder attribute, but also a dunder built-in name.
It doesn't have to be a built-in name in the global scope, only in the scope of a class declaration (and could then be interpreted by type checkers as being owned by the class being declared). Using it outside a class declaration should raise a `NameError`.

By the way, I think you did not understand my suggestion, so maybe it was not clear enough; that may also be related to the fact that you mention in another post that you are "not aware of many refactoring tools for Python at all".

So I will develop my point by talking about another pseudo-static (but completely dynamic under the hood) standard Python feature: dataclasses.

Dataclasses are a dynamic thing, a decorator which rewrites class methods; it is so dynamic that the method code is first written as a string and then compiled with `exec`, and it uses regex parsing under the hood to check types in stringified annotations. That's under the hood. However, for the user, thanks to IDEs, type checkers, linters, etc., this is handled as a static feature. Attribute access can be type checked, and `__init__` method parameters are checked too.

Don't believe that IDEs and other tools execute the `dataclass` code for each class in order to retrieve the signature of the generated method. No, they just compute it themselves statically, because they **know that it will be generated this way**. How do they know? Because it's written in the Python specification, so they have to handle it. And if you write code reassigning an attribute of a frozen dataclass, the linter will warn you, while the real check is dynamic and performed in the class's generated `__setattr__`.

(To prove to yourself that it's purely static, try `dataclass2 = dataclass` and run mypy on a class decorated with `dataclass2`.)

So, if tools are able to statically compute the signature of a dataclass `__init__` by using its field annotations (even if it's dynamic like hell behind the scenes), I hope you begin to understand how they could interpret `__attrs__.bar` differently from a raw `"bar"`; it's not "one more step" as you said, it's a different processing, and because it's a different processing, type checking can be added, using for example my suggested `_SpecialForm` `Attribute[Owner, T]`.
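The runtime half of that experiment can be sketched as follows. The `Point` class is made up for illustration. The snippet only shows that the alias is the same decorator at runtime and that the frozen check happens dynamically; whether a given static checker still recognizes `dataclass2` depends on the tool, and the claim about mypy is the one made in the message above, not something this code verifies.

```python
from dataclasses import FrozenInstanceError, dataclass
import inspect

# At runtime the alias is the very same decorator object.
dataclass2 = dataclass


@dataclass2(frozen=True)
class Point:
    x: int
    y: int = 0


# The generated `__init__` exists only once the decorator has run.
init_parameters = list(inspect.signature(Point).parameters)
assert init_parameters == ["x", "y"]

point = Point(1)
try:
    point.x = 2  # rejected at runtime by the generated `__setattr__`
except FrozenInstanceError:
    reassignment_rejected = True
else:
    reassignment_rejected = False

assert reassignment_rejected
```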
> So linters that cannot cope with `@validate('email')` will be able to cope with `@validate(__attrs__.email)`, because ... why?

Because it would be written in the Python specification that this has a special meaning for type checking, as it is for dataclass methods. That's why I dare to use a dunder name: it's a suggestion for the Python specification.

> Presumably these would work too: `assert getattr(foo, None.__attrs__.bar) == 0` `assert getattr(foo, foo.__attrs__.baz, 999) == 999`

It will work, yes, but the same way as passing a `str` to a function annotated with an `int` parameter: the type checker will shout at you. Python static checks are never enforced dynamically, and this suggestion does not aim to change this behavior. But in Python, dynamic things can be interpreted in a static way by tools; they need a specification to follow, dataclasses being again the best example.
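A small sketch of that point, using a made-up `double` function: annotations are not enforced by the interpreter, and an attribute name passed to `getattr` is just a string at runtime, so a misspelled name only shows up through the default value.

```python
from dataclasses import dataclass


@dataclass
class Foo:
    bar: int


def double(value: int) -> int:
    return value * 2


foo = Foo(0)

# A type checker would flag the `str` argument; the interpreter runs it anyway.
assert double("ab") == "abab"

# `Foo` has no `baz` attribute, yet nothing fails at runtime:
# `getattr` simply falls back to the default.
assert getattr(foo, "baz", 999) == 999
```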
> Apart from typing 15 chars to avoid 2 quotation marks, and making a runtime attribute lookup plus method call to avoid a literal, how will this help reduce dynamicism?

I hope that you've understood now with this complementary explanation (and yes, there are more chars typed, but is that an issue when the meanings are different?)
participants (3)
- Alex Hall
- Joseph Perez
- Steven D'Aprano