Concerns about method overriding and subclassing with dataclasses

Hello all, I've recently been experimenting with dataclasses. They totally rock! A lot of the boilerplate for the AST I've designed in Python is automatically taken care of, it's really great! However, I have a few concerns about the implementation. In a few cases I want to override the repr of the AST nodes. I wrote a __repr__ and ran the code but lo and behold I got a type error. I couldn't override it. I quickly learned that one needs to pass a keyword to the dataclass decorator to tell it *not* to auto generate methods you override. I have two usability concerns with the current implementation. I emailed Eric about the first, and he said I should ask for thoughts here. The second I found after a couple of days sitting on this message. The first is that needing both a keyword and method is duplicative and unnecessary. Eric agreed it was a hassle, but felt it was justified considering someone may accidentally override a dataclass method. I disagree with this point of view as dataclasses are billed as providing automatic methods. Overriding via method definition is very natural and idiomatic. I don't really see how someone could accidentally override a dataclass method if methods were not generated by the dataclass decorator that are already defined in the class at definition time. The second concern, which I came across more recently, is if I have a base class, and dataclasses inherit from this base class, inherited __repr__ & co are silently overridden by dataclass. This is both unexpected, and also means I need to pass a repr=False to each subclass' decorator to get correct behavior, which somewhat defeats the utility of subclassing. Im not as sure a whole lot can be done about this though. I appreciate any thoughts folks have related to this. Cheers, ~>Ethan Smith

On Fri, 29 Dec 2017 02:23:56 -0800 Ethan Smith <ethan@ethanhs.me> wrote:
Agreed. We shouldn't take magic too far just for the sake of protecting users against their own (alleged) mistakes. And I'm not sure how you "accidentally" override a dataclass method (if I'm implementing a __repr__ I'm doing so deliberately :-)).
Agreed as well. If I make the effort of having a dataclass inherit from a base class, I probably don't want the base class' methods to be silently overriden by machine-generated methods. Of course, that can be worked around by using multiple inheritance, you just need to be careful and add a small amount of class definition boilerplate. I would expect dataclass parameters such as `repr` to be tri-state: * repr=None (the default): only provide a machine-generated implementation if none is already defined (either on a base class or in the dataclass namespace... ignoring runtime-provided defaults such as object.__repr__) * repr=False: never provide a machine-generated implementation * repr=True: always provide a machine-generated implementation, even overriding a previous user-defined implementation Regards Antoine.

On Fri, Dec 29, 2017 at 2:45 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
My thinking exactly.
I am not sure exactly what you mean by "worked around by using multiple inheritance". Do you mean you think the dataclass decorator should be made into a dataclass base class for a dataclass class, or that it should look at the MRO?
This is sensible to me. Thanks for your thoughts! ~>Ethan Smith

On Fri, 29 Dec 2017 11:12:11 -0800 Ethan Smith <ethan@ethanhs.me> wrote:
I mean you can write: class _BaseClass: def __repr__(self): # ... @dataclass class _DataclassMixin: # your attribute definitions here class FinalClass(_BaseClass, _BaseDataclass): pass Yes, it's tedious and verbose :-) Regards Antoine.

On 12/29/2017 02:23 AM, Ethan Smith wrote:
Accidental or not, the decorator should not be replacing methods defined by the class.
It is possible to determine whether an existing __repr__ is from 'object' or not, and only provide one if that is the case. I think that should be the default, with 'repr = True' for those cases where a new, auto-generated, __repr__ is desired. -- ~Ethan~

On Fri, Dec 29, 2017 at 11:37 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
The only other thing you'd want to handle is to cover inheriting from another dataclass. e.g., if I have dataclass with attribute a, and subclass it to add attribute b, I'd want both in the repr. ~>Ethan Smith

On 12/29/2017 11:55 AM, Ethan Smith wrote:
On Fri, Dec 29, 2017 at 11:37 AM, Ethan Furman wrote:
Good point. So auto-generate a new __repr__ if: - one is not provided, and - existing __repr__ is either: - object.__repr__, or - a previous dataclass __repr__ And if the auto default doesn't work for one's use-case, use the keyword parameter to specify what you want. -- ~Ethan~

On Fri, Dec 29, 2017 at 12:30 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
What does attrs do here? -n -- Nathaniel J. Smith -- https://vorpus.org

attrs just silently overwrites any user provided __repr__ unless you provide repr=False to attr.s. I think we can all agree that if nothing else, silently overwriting unconditionally is not what we want for dataclasses. On Fri, Dec 29, 2017 at 4:38 PM, Nathaniel Smith <njs@pobox.com> wrote:

On Fri, Dec 29, 2017 at 4:52 PM, Guido van Rossum <guido@python.org> wrote:
I still think it should overrides anything that's just inherited but nothing that's defined in the class being decorated.
Could you explain why you are of this opinion? Is it a concern about complexity of implementation?

No, I am concerned about the rule being too complex to explain, and about surprising effects when the base changes (action at a distance). I also don't necessarily think "we all agree" that what attrs does is wrong, but the rule I propose seems reasonable. On Dec 29, 2017 5:58 PM, "Ethan Smith" <ethan@ethanhs.me> wrote:

On Fri, Dec 29, 2017 at 12:30 PM, Ethan Furman wrote:
On Dec 29, 2017 5:43 PM, "Nathaniel Smith" wrote:
What does attrs do here?
On 12/29/2017 04:52 PM, Ethan Smith wrote:
attrs just silently overwrites any user provided __repr__ unless you provide repr=False to attr.s.
On 12/29/2017 04:52 PM, Guido van Rossum wrote:
I still think it should overrides anything that's just inherited but nothing that's defined in the class being decorated.
I can certainly live with that. -- ~Ethan~

On Dec 29, 2017, at 4:52 PM, Guido van Rossum <guido@python.org> wrote:
I still think it should overrides anything that's just inherited but nothing that's defined in the class being decorated.
This has the virtue of being easy to explain, and it will help with debugging by honoring the code proximate to the decorator :-) For what it is worth, the functools.total_ordering class decorator does something similar -- though not exactly the same. A root comparison method is considered user-specified if it is different than the default method provided by object: def total_ordering(cls): """Class decorator that fills in missing ordering methods""" # Find user-defined comparisons (not those inherited from object). roots = {op for op in _convert if getattr(cls, op, None) is not getattr(object, op, None)} ... The @dataclass decorator has a much broader mandate and we have almost no experience with it, so it is hard to know what legitimate use cases will arise. Raymond

Ethan Furman writes:
-0.5 I'm with Guido here. Just use the simple rule that a new __repr__ is generated unless provided in the dataclass. The logic I use (Guido seems to be just arguing for "simple" for now) is that a dataclass is "usually" going to add fields, which you "normally" want exposed in the repr, and that means that an *inherited* __repr__ is going to be broken in some sense. The code author will disagree in "a few" cases, and in those cases they will use repr=False to override. I grant that there may be many reasons why one would be deriving dataclasses from dataclasses without adding fields that should be in the repr, so the quote marks above may be taken to be indications of my lack of imagination. ;-) Here's to 2018. It *has* to be better than 2017 -- there will be a Python feature release! Steve

On Sat, Dec 30, 2017 at 7:27 AM, Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Just use the simple rule that a new __repr__ is generated unless provided in the dataclass.
are we only talking about __repr__ here ??? I interpretted Guido's proposal as being about all methods -- we _may_ want something special for __repr__, but I hope not. But +1 for Guido's proposal, not only because it's easy to explain, but because it more naturally follows the usual inheritance logic: The decorator's entire point is to auto-generate boilerplate code for you. Once it's done that it shouldn't, in the end, behave any differently than if you hand wrote that code. If you hand wrote the methods that the decorator creates for you, they would override any base class versions. So that's what it should do. And the fact that you can optionally tell it not to in some particular case keeps full flexibility. -CHB
I grant that there may be many reasons why one would be deriving
dataclasses from dataclasses
Will you get the "right" __repr__ now if you derive a datacalss from a dataclass? That would be a nice feature. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Mon, Jan 1, 2018 at 5:03 PM, Chris Barker <chris.barker@noaa.gov> wrote:
I interpreted this to be for all methods as well, which makes sense. Special casing just __repr__ doesn't make sense to me, but I will wait for Guido to clarify.
The __repr__ will be generated by the child dataclass unless the user overrides it. So I believe this is the "right" __repr__. ~>Ethan Smith

On Mon, Jan 1, 2018 at 7:50 PM, Ethan Smith <ethan@ethanhs.me> wrote:
what I was wondering is if the child will know about all the fields in the parent -- so it could make a full __repr__. -CHB
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Mon, Jan 1, 2018 at 8:50 PM, Ethan Smith <ethan@ethanhs.me> wrote:
Indeed, I just wrote __repr__ for simplicity. This should apply to all special methods. (Though there may be some complications for __eq__/__ne__ and for the ordering operators.) On Mon, Jan 1, 2018 at 9:44 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Yes, there's a class variable (__dataclass_fields__) that identifies the parent fields. The PEP doesn't mention this or the fact that special methods (like __repr__ and __init__) can tell whether a base class is a dataclass. It probably should though. (@Eric) -- --Guido van Rossum (python.org/~guido)

On 1/2/2018 12:01 AM, Guido van Rossum wrote:
I think that's covered in this section: https://www.python.org/dev/peps/pep-0557/#inheritance Eric.

On Fri, Jan 5, 2018 at 5:08 AM, Eric V. Smith <eric@trueblade.com> wrote:
I was specifically talking about the name and contents of __dataclass_fields__, which are not documented by the PEP. I expect it's inevitable that people will be looking at this (since they can see it in the source code). Or do you recommend that people use dataclasses.fields() and catch ValueError? I notice that _isdataclass() exists but is private and I don't recall why. (Also now I'm curious what the "pseudo-fields" are that fields() ignores, but that's OT.) -- --Guido van Rossum (python.org/~guido)

On 1/5/2018 11:24 AM, Guido van Rossum wrote:
The expectation is to use dataclasses.fields(). Both it and __dataclass_fields__ contain the fields for this class and the parents. The only difference is the pseudo-fields. I can add some words describing .fields() returning which fields are present.
I notice that _isdataclass() exists but is private and I don't recall why.
I think the argument was that it's an anti-pattern, and if you really want to know, just call dataclasses.fields() and catch the TypeError. I have this in a helper file: def isdataclass(obj): """Returns True for dataclass classes and instances.""" try: dataclasses.fields(obj) return True except TypeError: return False (Also now I'm curious what
the "pseudo-fields" are that fields() ignores, but that's OT.)
ClassVar and InitVar "fields". dataclasses.fields() doesn't return them. Eric.

Hm. I don't know that people will conclude that checking for a dataclass is an anti-pattern. They'll probably just invent a myriad of different hacks like the one you showed. I recommend making it public. I still worry a bit about ClassVar and InitVar being potentially useful but I concede I have no use case so I'll drop it. On Fri, Jan 5, 2018 at 8:43 AM, Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 1/5/2018 12:58 PM, Guido van Rossum wrote:
I'm trying to track down the original discussion. We got bogged down on whether it worked for classes or instances or both, then we got tied up in naming it (surprise!), then it looks like we decided to just not include it since you could make those decisions for yourself. I think the discussion is buried in this thread: https://mail.python.org/pipermail/python-dev/2017-November/150966.html Which references: https://github.com/ericvsmith/dataclasses/issues/99 So, ignoring the naming issue, I think if we want to revive it, the question is: should isdataclass() return True on just instances, just classes, or both? And should it ever raise an exception, or just return False?
I still worry a bit about ClassVar and InitVar being potentially useful but I concede I have no use case so I'll drop it.
IIRC, we decided that we could add a parameter to dataclasses.fields() if we ever wanted to return pseudo-fields. But no one came up with a use case. Eric.

I'm normally no big fan of things that take either a class or an instance, but since fields() does this, I think is_dataclass() should to. And that's the name I'd choose. OK on the pseudo-fields. On Fri, Jan 5, 2018 at 11:06 AM, Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 1/5/2018 2:11 PM, Eric V. Smith wrote:
https://bugs.python.org/issue32499 I'm slowly reading through the rest of this thread and will respond this weekend. Eric.

Chris Barker writes:
are we only talking about __repr__ here ???
I am, because I haven't thought about the other methods, except to note I find it hard to imagine a use case *for me* that would require any of them. That sorta disqualifies me from comment. ;-) I assumed others were talking about all of the dataclass autogenerated methods, though.
And the fact that you can optionally tell it not to in some particular case keeps full flexibility.
AFAICS there is no question about that, just about *how* you indicate that you do or don't want autobogotification.
Will you get the "right" __repr__ now if you derive a datacalss from a dataclass? That would be a nice feature.
I hadn't thought about that: I wouldn't call it "nice", I'd say it's a sine-qua-back-to-the-drawing-board.

I like the Guido's proposal, i.e. if '__repr__' not in cls.__dict__: ... # generate the method etc. I didn't find an issue to track this. Maybe we should open one? -- Ivan

On 1/3/2018 1:17 PM, Eric V. Smith wrote:
I’ll open an issue after I have time to read this thread and comment on it.
https://bugs.python.org/issue32513 I need to think though how __eq__ and __ne__ work, as well as the ordering operators. My specific concern with __ne__ is that there's one flag to control their generation, but python will use "not __eq__" if you don't provide __ne__. I need to think through what happens if the user only provides __eq__: does dataclasses do nothing, does it add __ne__, and how does this interact with a base class that does provide __ne__. Eric.

On Sun, Jan 7, 2018 at 9:09 AM, Eric V. Smith <eric@trueblade.com> wrote:
Maybe dataclasses should only ever provide __eq__ and always assume Python's default for __ne__ kicks in? If that's not acceptable (maybe there are cases where a user did write an explicit __ne__ that needs to be overridden) I would recommend the following rule: - If there's an __eq__, don't do anything (regardless of whether there's an __ne__) - If there no __eq__ but there is an __ne__, generate __eq__ but don't generate __ne__ - If neither exists, generate both -- --Guido van Rossum (python.org/~guido)

On 1/7/2018 12:25 PM, Guido van Rossum wrote:
I've added my proposal on issue 32513: https://bugs.python.org/issue32513#msg310392 It's long, so I won't repeat it here. The only really confusing part is __hash__ and its interaction with __eq__. Eric.

On Fri, 29 Dec 2017 02:23:56 -0800 Ethan Smith <ethan@ethanhs.me> wrote:
Agreed. We shouldn't take magic too far just for the sake of protecting users against their own (alleged) mistakes. And I'm not sure how you "accidentally" override a dataclass method (if I'm implementing a __repr__ I'm doing so deliberately :-)).
Agreed as well. If I make the effort of having a dataclass inherit from a base class, I probably don't want the base class' methods to be silently overriden by machine-generated methods. Of course, that can be worked around by using multiple inheritance, you just need to be careful and add a small amount of class definition boilerplate. I would expect dataclass parameters such as `repr` to be tri-state: * repr=None (the default): only provide a machine-generated implementation if none is already defined (either on a base class or in the dataclass namespace... ignoring runtime-provided defaults such as object.__repr__) * repr=False: never provide a machine-generated implementation * repr=True: always provide a machine-generated implementation, even overriding a previous user-defined implementation Regards Antoine.

On Fri, Dec 29, 2017 at 2:45 AM, Antoine Pitrou <solipsis@pitrou.net> wrote:
My thinking exactly.
I am not sure exactly what you mean by "worked around by using multiple inheritance". Do you mean you think the dataclass decorator should be made into a dataclass base class for a dataclass class, or that it should look at the MRO?
This is sensible to me. Thanks for your thoughts! ~>Ethan Smith

On Fri, 29 Dec 2017 11:12:11 -0800 Ethan Smith <ethan@ethanhs.me> wrote:
I mean you can write: class _BaseClass: def __repr__(self): # ... @dataclass class _DataclassMixin: # your attribute definitions here class FinalClass(_BaseClass, _BaseDataclass): pass Yes, it's tedious and verbose :-) Regards Antoine.

On 12/29/2017 02:23 AM, Ethan Smith wrote:
Accidental or not, the decorator should not be replacing methods defined by the class.
It is possible to determine whether an existing __repr__ is from 'object' or not, and only provide one if that is the case. I think that should be the default, with 'repr = True' for those cases where a new, auto-generated, __repr__ is desired. -- ~Ethan~

On Fri, Dec 29, 2017 at 11:37 AM, Ethan Furman <ethan@stoneleaf.us> wrote:
The only other thing you'd want to handle is to cover inheriting from another dataclass. e.g., if I have dataclass with attribute a, and subclass it to add attribute b, I'd want both in the repr. ~>Ethan Smith

On 12/29/2017 11:55 AM, Ethan Smith wrote:
On Fri, Dec 29, 2017 at 11:37 AM, Ethan Furman wrote:
Good point. So auto-generate a new __repr__ if: - one is not provided, and - existing __repr__ is either: - object.__repr__, or - a previous dataclass __repr__ And if the auto default doesn't work for one's use-case, use the keyword parameter to specify what you want. -- ~Ethan~

On Fri, Dec 29, 2017 at 12:30 PM, Ethan Furman <ethan@stoneleaf.us> wrote:
What does attrs do here? -n -- Nathaniel J. Smith -- https://vorpus.org

attrs just silently overwrites any user provided __repr__ unless you provide repr=False to attr.s. I think we can all agree that if nothing else, silently overwriting unconditionally is not what we want for dataclasses. On Fri, Dec 29, 2017 at 4:38 PM, Nathaniel Smith <njs@pobox.com> wrote:

On Fri, Dec 29, 2017 at 4:52 PM, Guido van Rossum <guido@python.org> wrote:
I still think it should overrides anything that's just inherited but nothing that's defined in the class being decorated.
Could you explain why you are of this opinion? Is it a concern about complexity of implementation?

No, I am concerned about the rule being too complex to explain, and about surprising effects when the base changes (action at a distance). I also don't necessarily think "we all agree" that what attrs does is wrong, but the rule I propose seems reasonable. On Dec 29, 2017 5:58 PM, "Ethan Smith" <ethan@ethanhs.me> wrote:

On Fri, Dec 29, 2017 at 12:30 PM, Ethan Furman wrote:
On Dec 29, 2017 5:43 PM, "Nathaniel Smith" wrote:
What does attrs do here?
On 12/29/2017 04:52 PM, Ethan Smith wrote:
attrs just silently overwrites any user provided __repr__ unless you provide repr=False to attr.s.
On 12/29/2017 04:52 PM, Guido van Rossum wrote:
I still think it should overrides anything that's just inherited but nothing that's defined in the class being decorated.
I can certainly live with that. -- ~Ethan~

On Dec 29, 2017, at 4:52 PM, Guido van Rossum <guido@python.org> wrote:
I still think it should overrides anything that's just inherited but nothing that's defined in the class being decorated.
This has the virtue of being easy to explain, and it will help with debugging by honoring the code proximate to the decorator :-) For what it is worth, the functools.total_ordering class decorator does something similar -- though not exactly the same. A root comparison method is considered user-specified if it is different than the default method provided by object: def total_ordering(cls): """Class decorator that fills in missing ordering methods""" # Find user-defined comparisons (not those inherited from object). roots = {op for op in _convert if getattr(cls, op, None) is not getattr(object, op, None)} ... The @dataclass decorator has a much broader mandate and we have almost no experience with it, so it is hard to know what legitimate use cases will arise. Raymond

Ethan Furman writes:
-0.5 I'm with Guido here. Just use the simple rule that a new __repr__ is generated unless provided in the dataclass. The logic I use (Guido seems to be just arguing for "simple" for now) is that a dataclass is "usually" going to add fields, which you "normally" want exposed in the repr, and that means that an *inherited* __repr__ is going to be broken in some sense. The code author will disagree in "a few" cases, and in those cases they will use repr=False to override. I grant that there may be many reasons why one would be deriving dataclasses from dataclasses without adding fields that should be in the repr, so the quote marks above may be taken to be indications of my lack of imagination. ;-) Here's to 2018. It *has* to be better than 2017 -- there will be a Python feature release! Steve

On Sat, Dec 30, 2017 at 7:27 AM, Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
Just use the simple rule that a new __repr__ is generated unless provided in the dataclass.
are we only talking about __repr__ here ??? I interpretted Guido's proposal as being about all methods -- we _may_ want something special for __repr__, but I hope not. But +1 for Guido's proposal, not only because it's easy to explain, but because it more naturally follows the usual inheritance logic: The decorator's entire point is to auto-generate boilerplate code for you. Once it's done that it shouldn't, in the end, behave any differently than if you hand wrote that code. If you hand wrote the methods that the decorator creates for you, they would override any base class versions. So that's what it should do. And the fact that you can optionally tell it not to in some particular case keeps full flexibility. -CHB
I grant that there may be many reasons why one would be deriving
dataclasses from dataclasses
Will you get the "right" __repr__ now if you derive a datacalss from a dataclass? That would be a nice feature. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Mon, Jan 1, 2018 at 5:03 PM, Chris Barker <chris.barker@noaa.gov> wrote:
I interpreted this to be for all methods as well, which makes sense. Special casing just __repr__ doesn't make sense to me, but I will wait for Guido to clarify.
The __repr__ will be generated by the child dataclass unless the user overrides it. So I believe this is the "right" __repr__. ~>Ethan Smith

On Mon, Jan 1, 2018 at 7:50 PM, Ethan Smith <ethan@ethanhs.me> wrote:
what I was wondering is if the child will know about all the fields in the parent -- so it could make a full __repr__. -CHB
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Mon, Jan 1, 2018 at 8:50 PM, Ethan Smith <ethan@ethanhs.me> wrote:
Indeed, I just wrote __repr__ for simplicity. This should apply to all special methods. (Though there may be some complications for __eq__/__ne__ and for the ordering operators.) On Mon, Jan 1, 2018 at 9:44 PM, Chris Barker <chris.barker@noaa.gov> wrote:
Yes, there's a class variable (__dataclass_fields__) that identifies the parent fields. The PEP doesn't mention this or the fact that special methods (like __repr__ and __init__) can tell whether a base class is a dataclass. It probably should though. (@Eric) -- --Guido van Rossum (python.org/~guido)

On 1/2/2018 12:01 AM, Guido van Rossum wrote:
I think that's covered in this section: https://www.python.org/dev/peps/pep-0557/#inheritance Eric.

On Fri, Jan 5, 2018 at 5:08 AM, Eric V. Smith <eric@trueblade.com> wrote:
I was specifically talking about the name and contents of __dataclass_fields__, which are not documented by the PEP. I expect it's inevitable that people will be looking at this (since they can see it in the source code). Or do you recommend that people use dataclasses.fields() and catch ValueError? I notice that _isdataclass() exists but is private and I don't recall why. (Also now I'm curious what the "pseudo-fields" are that fields() ignores, but that's OT.) -- --Guido van Rossum (python.org/~guido)

On 1/5/2018 11:24 AM, Guido van Rossum wrote:
The expectation is to use dataclasses.fields(). Both it and __dataclass_fields__ contain the fields for this class and the parents. The only difference is the pseudo-fields. I can add some words describing .fields() returning which fields are present.
I notice that _isdataclass() exists but is private and I don't recall why.
I think the argument was that it's an anti-pattern, and if you really want to know, just call dataclasses.fields() and catch the TypeError. I have this in a helper file: def isdataclass(obj): """Returns True for dataclass classes and instances.""" try: dataclasses.fields(obj) return True except TypeError: return False (Also now I'm curious what
the "pseudo-fields" are that fields() ignores, but that's OT.)
ClassVar and InitVar "fields". dataclasses.fields() doesn't return them. Eric.

Hm. I don't know that people will conclude that checking for a dataclass is an anti-pattern. They'll probably just invent a myriad of different hacks like the one you showed. I recommend making it public. I still worry a bit about ClassVar and InitVar being potentially useful but I concede I have no use case so I'll drop it. On Fri, Jan 5, 2018 at 8:43 AM, Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 1/5/2018 12:58 PM, Guido van Rossum wrote:
I'm trying to track down the original discussion. We got bogged down on whether it worked for classes or instances or both, then we got tied up in naming it (surprise!), then it looks like we decided to just not include it since you could make those decisions for yourself. I think the discussion is buried in this thread: https://mail.python.org/pipermail/python-dev/2017-November/150966.html Which references: https://github.com/ericvsmith/dataclasses/issues/99 So, ignoring the naming issue, I think if we want to revive it, the question is: should isdataclass() return True on just instances, just classes, or both? And should it ever raise an exception, or just return False?
I still worry a bit about ClassVar and InitVar being potentially useful but I concede I have no use case so I'll drop it.
IIRC, we decided that we could add a parameter to dataclasses.fields() if we ever wanted to return pseudo-fields. But no one came up with a use case. Eric.

I'm normally no big fan of things that take either a class or an instance, but since fields() does this, I think is_dataclass() should to. And that's the name I'd choose. OK on the pseudo-fields. On Fri, Jan 5, 2018 at 11:06 AM, Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido)

On 1/5/2018 2:11 PM, Eric V. Smith wrote:
https://bugs.python.org/issue32499 I'm slowly reading through the rest of this thread and will respond this weekend. Eric.

Chris Barker writes:
are we only talking about __repr__ here ???
I am, because I haven't thought about the other methods, except to note I find it hard to imagine a use case *for me* that would require any of them. That sorta disqualifies me from comment. ;-) I assumed others were talking about all of the dataclass autogenerated methods, though.
And the fact that you can optionally tell it not to in some particular case keeps full flexibility.
AFAICS there is no question about that, just about *how* you indicate that you do or don't want autobogotification.
Will you get the "right" __repr__ now if you derive a datacalss from a dataclass? That would be a nice feature.
I hadn't thought about that: I wouldn't call it "nice", I'd say it's a sine-qua-back-to-the-drawing-board.

I like the Guido's proposal, i.e. if '__repr__' not in cls.__dict__: ... # generate the method etc. I didn't find an issue to track this. Maybe we should open one? -- Ivan

On 1/3/2018 1:17 PM, Eric V. Smith wrote:
I’ll open an issue after I have time to read this thread and comment on it.
https://bugs.python.org/issue32513 I need to think though how __eq__ and __ne__ work, as well as the ordering operators. My specific concern with __ne__ is that there's one flag to control their generation, but python will use "not __eq__" if you don't provide __ne__. I need to think through what happens if the user only provides __eq__: does dataclasses do nothing, does it add __ne__, and how does this interact with a base class that does provide __ne__. Eric.

On Sun, Jan 7, 2018 at 9:09 AM, Eric V. Smith <eric@trueblade.com> wrote:
Maybe dataclasses should only ever provide __eq__ and always assume Python's default for __ne__ kicks in? If that's not acceptable (maybe there are cases where a user did write an explicit __ne__ that needs to be overridden) I would recommend the following rule: - If there's an __eq__, don't do anything (regardless of whether there's an __ne__) - If there no __eq__ but there is an __ne__, generate __eq__ but don't generate __ne__ - If neither exists, generate both -- --Guido van Rossum (python.org/~guido)

On 1/7/2018 12:25 PM, Guido van Rossum wrote:
I've added my proposal on issue 32513: https://bugs.python.org/issue32513#msg310392 It's long, so I won't repeat it here. The only really confusing part is __hash__ and its interaction with __eq__. Eric.
participants (11)
-
Antoine Pitrou
-
Chris Barker
-
Eric V. Smith
-
Ethan Furman
-
Ethan Smith
-
Guido van Rossum
-
Guido van Rossum
-
Ivan Levkivskyi
-
Nathaniel Smith
-
Raymond Hettinger
-
Stephen J. Turnbull