frozen dataclasses attribute initialization

When creating frozen dataclasses, attribute initialization must be done using `object.__setattr__()`. It would be nice to allow attribute assignment in `__init__` and `__post_init__`.

Currently we have to do this:

```python
@dataclasses.dataclass(frozen=True)
class Person:
    name: str
    surname: str
    fullname: str = dataclasses.field(init=False)

    def __post_init__(self):
        object.__setattr__(self, "fullname", f"{self.name} {self.surname}")
```

I think it would be cleaner like this:

```python
@dataclasses.dataclass(frozen=True)
class Person:
    name: str
    surname: str
    fullname: str = dataclasses.field(init=False)

    def __post_init__(self):
        self.fullname = f"{self.name} {self.surname}"
```
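For reference, a minimal runnable check of the behaviour described above. The class name `NaivePerson` and the sample values are made up for illustration; the rest follows the snippets in the post.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class Person:
    name: str
    surname: str
    fullname: str = dataclasses.field(init=False)

    def __post_init__(self):
        # Bypass the frozen __setattr__ that the decorator generates.
        object.__setattr__(self, "fullname", f"{self.name} {self.surname}")


@dataclasses.dataclass(frozen=True)
class NaivePerson:
    name: str
    surname: str
    fullname: str = dataclasses.field(init=False)

    def __post_init__(self):
        # Plain assignment goes through the frozen __setattr__ and is rejected.
        self.fullname = f"{self.name} {self.surname}"


person = Person("Ada", "Lovelace")

try:
    NaivePerson("Ada", "Lovelace")
except dataclasses.FrozenInstanceError as exc:
    naive_error = exc
else:
    naive_error = None
```

With current dataclasses, `person.fullname` is set as expected, while constructing `NaivePerson` raises `FrozenInstanceError` from inside `__post_init__`.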

On Dec 11, 2019, at 08:56, Arthur Pastel <arthur.pastel@gmail.com> wrote:
When creating frozen dataclasses, attribute initialization must be done using `object.__setattr__()` it would be nice to allow attribute assignment in the `__init__` and `__post_init__`.
But how would you implement that? Frozen means that attribute assignment isn’t allowed. The __post_init__ method isn’t magic; it’s just a normal method that gets called normally by the generated __init__. What you’re proposing is that we should make it magic. That would be useful, but you need to come up with magic that works.

If dataclass handled freezable types (the objects are mutable until you call freeze, after which they’re not), this would be easy: frozen would just mean freezable, plus freeze is called by the generated __init__ right after the __post_init__. But it doesn’t, because freezing is complicated.

So, what else could you do? Make __setattr__ check the stack and see if it’s being called from type(self).__post_init__? Add an extra hidden attribute to every instance just to track whether you’re inside __post_init__ so __setattr__ can check it?

On 12/11/2019 1:23 PM, Andrew Barnert via Python-ideas wrote:
I agree with Andrew: it's a great idea, but I couldn't come up with a way to implement it, so I followed attrs' lead instead. Maybe someone can come up with a good way to implement it. Does attrs still do the same object.__setattr__ trick?

Eric

It seems like this could be solved if a general way could be created to allow dataclasses to handle descriptors properly? Or if not descriptors proper, at least a dataclass version of @property? I'm sure this has been discussed before.

But one idea would be to add an optional "property" argument to `field`, which would turn it into a descriptor. So the code could be something like this, maybe:

    @dataclasses.dataclass(frozen=True)
    class Person:
        name: str
        surname: str
        fullname: str = dataclasses.field(property=True)  # fullname is now a field descriptor

        @fullname.getter
        def fullname_getter(self):
            return self._fullname

        @fullname.setter
        def fullname_setter(self, value):
            assert value == ????
            self.fullname = f"{self.name} {self.surname}"

If this is a good idea, one big question is what the default __init__ method should supply to the "value" argument.

Bah, I forgot the underscore in the setter. Corrected for clarity:

    @dataclasses.dataclass(frozen=True)
    class Person:
        name: str
        surname: str
        fullname: str = dataclasses.field(property=True)  # fullname is now a field descriptor

        @fullname.getter
        def fullname_getter(self):
            return self._fullname

        @fullname.setter
        def fullname_setter(self, value):
            assert value == ????
            self._fullname = f"{self.name} {self.surname}"

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Wed, Dec 11, 2019 at 6:08 PM Ricky Teachey <ricky@teachey.org> wrote:

On Dec 11, 2019, at 15:10, Ricky Teachey <ricky@teachey.org> wrote:
Well, you could always do this:

    assert value == f"{self.name} {self.surname}"
    self._fullname = value

Now, the property is “frozen” in that you can legally mutate it to what it already is, but not to anything else. But that’s pretty weird semantics. (And the fact that breaking the rule means an assertion error rather than the usual one is even weirder, but that part is easy to fix.)

I think what you’re really looking for here is the C++ feature of being able to declare an attribute as mutable even in const instances. Then you could use it to cache your fullname just by assigning to the always-mutable but private _fullname attribute, and have a read-only fullname property that uses that cache.
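Python has no C++-style `mutable` declaration, but a roughly similar effect (a cached, read-only derived value on a frozen instance) can be sketched with `functools.cached_property`, which is a swapped-in technique and not what was proposed above. It assumes Python 3.8+ and a class without `__slots__`, because the cache is written straight into the instance `__dict__` and so never goes through the frozen `__setattr__`.

```python
import dataclasses
import functools


@dataclasses.dataclass(frozen=True)
class Person:
    name: str
    surname: str

    @functools.cached_property
    def fullname(self) -> str:
        # Computed on first access, then stored in self.__dict__.
        return f"{self.name} {self.surname}"


person = Person("Ada", "Lovelace")
first_access = person.fullname
cached_after_access = "fullname" in vars(person)

try:
    person.fullname = "Someone Else"  # still rejected by the frozen class
except dataclasses.FrozenInstanceError:
    assignment_rejected = True
else:
    assignment_rejected = False
```

Note that `fullname` is no longer a dataclass field here, so it does not take part in the generated `__repr__`, `__eq__`, or `__hash__`.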

On Thu, Dec 12, 2019 at 2:03 AM Eric V. Smith <eric@trueblade.com> wrote:
Would it be okay to create an instance attribute before calling `__init__` and then remove it after `__post_init__`? Then we would simply have to check for the attribute's existence when deciding whether to raise a FrozenInstanceError.

Argh, okay, I didn't know. Then would it be possible to return a child class in the _process_class function? It would then be possible to modify the slots. I'm trying to find a way around this, but maybe this is not the right approach.

On Thu, Dec 12, 2019 at 2:30 PM Eric V. Smith <eric@trueblade.com> wrote:

On 12/12/2019 8:50 AM, Arthur Pastel wrote:
Yes, it is possible. In my GitHub repo I have an add_slots decorator that does just that. But I don't want the stdlib dataclass decorator to start returning a different class, without some serious discussion and some upsides. I don't think making __post_init__ prettier for frozen classes meets that bar. Although if there's overwhelming support for it here, I could be convinced to change my mind.

Eric

I think the simpler route to go there (allowing the use of `self.attr = ...` until after `__post_init__` has run) is simply to add another flag attribute that tells whether "initialization is over", and to respect that flag in the added `__setattr__`. The hook calling `__post_init__` would then set that flag after it is done.

A 5-line change is enough to make this work for __post_init__:

```
diff --git a/Lib/dataclasses.py b/Lib/dataclasses.py
index 91c1f6f80f..65d51ca0a9 100644
--- a/Lib/dataclasses.py
+++ b/Lib/dataclasses.py
@@ -194,6 +194,11 @@ _PARAMS = '__dataclass_params__'
 # __init__.
 _POST_INIT_NAME = '__post_init__'
 
+
+# Flag used to allow field initialization in `__post_init__`
+# and `__init__` for frozen classes
+_FROZEN_INITIALIZATION_FLAG = '__dataclass_initialized__'
+
 # String regex that string annotations for ClassVar or InitVar must match.
 # Allows "identifier.identifier[" or "identifier[".
 # https://bugs.python.org/issue33453 for details.
@@ -517,6 +522,9 @@ def _init_fn(fields, frozen, has_post_init, self_name):
                               if f._field_type is _FIELD_INITVAR)
         body_lines.append(f'{self_name}.{_POST_INIT_NAME}({params_str})')
 
+    if frozen:
+        body_lines.append(f'{self_name}.{_FROZEN_INITIALIZATION_FLAG} = True')
+
     # If no body lines, use 'pass'.
     if not body_lines:
         body_lines = ['pass']
@@ -552,13 +560,13 @@ def _frozen_get_del_attr(cls, fields):
         fields_str = '()'
     return (_create_fn('__setattr__',
                       ('self', 'name', 'value'),
-                      (f'if type(self) is cls or name in {fields_str}:',
+                      (f'if (type(self) is cls or name in {fields_str}) and getattr(self, "{_FROZEN_INITIALIZATION_FLAG}", False):',
                         ' raise FrozenInstanceError(f"cannot assign to field {name!r}")',
                        f'super(cls, self).__setattr__(name, value)'),
                        globals=globals),
             _create_fn('__delattr__',
                       ('self', 'name'),
-                      (f'if type(self) is cls or name in {fields_str}:',
+                      (f'if (type(self) is cls or name in {fields_str}) and getattr(self, "{_FROZEN_INITIALIZATION_FLAG}", False):',
                         ' raise FrozenInstanceError(f"cannot delete field {name!r}")',
                        f'super(cls, self).__delattr__(name)'),
                        globals=globals),
```

Getting it to work for `__init__` as well, however, is a bit more complicated: as we don't have control of the dataclass metaclass, the only way to put in a hook that sets the flag after __init__ has run is to apply a decorator to the existing __init__ that would do it.

However, this might be a way to have this feature _IF_ people think it would be worth it. (I personally would prefer yet another parameter to allow changing fields during frozen initialization, as that is, for me, an exceptional way of doing it, and less simple than having to call `object.__setattr__`.)

On Fri, 13 Dec 2019 at 07:22, Arthur Pastel <arthur.pastel@gmail.com> wrote:

We discussed this just before, and having an extra instance attribute was quite problematic. That's why I suggested having an attribute set at the beginning of __init__ and deleted after the initialization is complete. Anyway, in both cases there is still a problem when the class uses __slots__, as Eric V. Smith <eric@trueblade.com> mentioned previously.

On Fri, Dec 13, 2019 at 2:50 PM Joao S. O. Bueno <jsbueno@python.org.br> wrote:

On Dec 11, 2019, at 15:40, Arthur Pastel <arthur.pastel@gmail.com> wrote:
If you don’t understand freezable, I guess you’re not going to propose it as an answer, so don’t worry about it?
I guess, but not a good one. Stack inspection is generally considered a huge code smell. For example, whenever someone on this list says “Python should add X”, and someone replies “you can already do X if you’re willing to do some hacky stack inspection”, nobody takes that as an argument that Python doesn’t need to add X. Also, stack inspection isn’t even guaranteed to work in other Python implementations, so someone would have to come up with their own different way of implementing this feature for such implementations anyway.
Better, but still not good. People care how big their objects are, especially when they’re using slots or namedtuple or dataclass. If most of my program’s memory use is zillions of instances of tiny data classes, and Python 3.10 makes every one of them 8 bytes bigger so my total memory use goes up 32%, I’m not going to be happy. All of my implementations are terrible ideas; arguing for one of them is unlikely to get you very far. You need to come up with something that isn’t terrible instead.
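The per-instance cost of one more attribute can be checked on slotted classes. This is a CPython-specific sketch: it assumes each `__slots__` entry is stored as one pointer in the instance layout, and the class and slot names are made up for illustration.

```python
import struct


class TwoSlots:
    __slots__ = ("name", "surname")


class ThreeSlots:
    # Same layout plus one hypothetical bookkeeping slot.
    __slots__ = ("name", "surname", "_initialized")


pointer_size = struct.calcsize("P")  # 8 on a typical 64-bit build
extra_bytes = ThreeSlots.__basicsize__ - TwoSlots.__basicsize__
```

On CPython, `extra_bytes` equals `pointer_size`, which is where the "8 bytes per instance" figure comes from on 64-bit builds.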
On top of this, to avoid having the extra hidden attribute, maybe it would be possible to dynamically define the __setattr__ method at the end of the __post_init__ call.
__setattr__, like most special methods, has to appear on the class, not the instance. So you can’t dynamically redefine it. (Plus, even if you could, it would again add 8 bytes to every instance.)
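A small illustration of that lookup rule, using a made-up `Point` class and a hypothetical `reject` function: a `__setattr__` stored on the instance is ignored by the assignment statement, which always consults the type.

```python
class Point:
    pass


def reject(name, value):
    raise AttributeError(f"cannot assign to {name!r}")


p = Point()
p.__setattr__ = reject  # lands in the instance __dict__ as a plain attribute
p.x = 1                 # still dispatched through type(p).__setattr__

stored_on_instance = "__setattr__" in vars(p)
```

The assignment `p.x = 1` succeeds even though the instance now carries a `__setattr__` that would raise, so a per-instance override cannot implement freezing.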
participants (6)
- Andrew Barnert
- Arthur Pastel
- Barry
- Eric V. Smith
- Joao S. O. Bueno
- Ricky Teachey