I'm just curious:  what is the downside of calling super with kwargs?  Usually when I define a class, the first thing I write is

def __init__(self, **kwargs):

just in case I want to use the class in cooperative inheritance.  I always thought it couldn't hurt?

I might be alone in this interpretation, but I imagine that there are three fundamental kinds of inheritance patterns for methods, which I defined in my ipromise package (https://pypi.org/project/ipromise/): implementing an abstract method, overriding, and augmenting.  If I had to choose, I would say that __init__ should be an "augmenting" pattern.

Therefore, it seems weird for __init__ not to call super.  Even if you want Y to override some behavior in X, what happens if Z inherits from Y and W?  Now, Y.__init__'s decision not to call super would mean that W.__init__ would not be called.  That seems like a bug.  Instead, I would rather put the behavior that Y wants to override in a separate method, say X.f, which is called in X.__init__.  Now, if Y.f overrides X.f, everything is okay.  Even if Z inherits from Y and W, the override still works, and W.__init__ still gets called, angels sing, etc.
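
A minimal sketch of both situations, using the X, Y, Z, W and f names from the paragraph above. YBroken and ZBroken are hypothetical names added only to show the failure mode, and the attribute names are assumptions:

```python
class X:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)   # keep the cooperative chain going
        self.value = self.f()        # overridable behavior lives in a hook

    def f(self):
        return "X.f"


class W:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.w_initialized = True


class Y(X):
    # Y overrides the hook instead of __init__, so the chain is untouched.
    def f(self):
        return "Y.f"


class Z(Y, W):
    pass


class YBroken(X):
    # Hypothetical: overrides __init__ and never calls super().__init__().
    def __init__(self, **kwargs):
        self.value = "YBroken"


class ZBroken(YBroken, W):
    pass


z = Z()
zb = ZBroken()
print(z.value, z.w_initialized)      # the override works and W.__init__ ran
print(hasattr(zb, "w_initialized"))  # W.__init__ was silently skipped
```

The MRO of Z is Z, Y, X, W, object, so X.__init__'s super() call is what reaches W.__init__. In ZBroken the chain stops at YBroken.__init__.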

Long story short, am I wrong to interpret __init__ as a "must augment" method and always call super().__init__(**kwargs)?

On Wed, Apr 15, 2020 at 1:32 PM Andrew Barnert <abarnert@yahoo.com> wrote:
On Apr 15, 2020, at 04:26, Ricky Teachey <ricky@teachey.org> wrote:

For simple situations you can call super in the __post_init__ method and things will work fine:
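
A minimal sketch of that simple case, assuming a hypothetical parent X whose __init__ takes no arguments (the field and attribute names are illustrative only):

```python
from dataclasses import dataclass


class X:
    def __init__(self):
        self.ready = True  # stand-in for whatever setup the parent does


@dataclass
class Y(X):
    a: int = 0

    def __post_init__(self):
        # The generated __init__ never calls X.__init__, so do it here.
        super().__init__()


y = Y(a=1)
print(y.a, y.ready)
```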

But not for the OP's case: he wanted to pass extra parameters in -- and the dataclass' __init__ won't accept extra arguments.
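
A small sketch of the failure, assuming the parent X takes a mandatory x parameter as discussed in this thread (the field a is hypothetical):

```python
from dataclasses import dataclass


class X:
    def __init__(self, x, **kwargs):
        super().__init__(**kwargs)
        self.x = x


@dataclass
class Y(X):
    a: int = 0


# The generated __init__ only knows about the declared field `a`,
# so there is no way to hand `x` through to X.__init__.
try:
    Y(a=1, x=2)
except TypeError as exc:
    error = exc
    print("rejected:", exc)
else:
    error = None
```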

Can’t you just create InitVar attributes for the extra args you want to pass through in that case?

InitVar fields for all the desired parent class init parameters can often solve the problem.
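
A minimal sketch of the InitVar approach, again assuming a parent X with a mandatory x parameter. The InitVar pseudo-field is not stored by the dataclass; it is only handed to __post_init__:

```python
from dataclasses import InitVar, dataclass


class X:
    def __init__(self, x, **kwargs):
        super().__init__(**kwargs)
        self.x = x


@dataclass
class Y(X):
    a: int
    x: InitVar[int]  # accepted by the generated __init__, passed to __post_init__

    def __post_init__(self, x):
        # Forward the parent's parameter explicitly.
        super().__init__(x=x)


y = Y(a=1, x=2)
print(y.a, y.x)
```

Each parent parameter needs its own InitVar line, which is the manual bookkeeping the next paragraph complains about.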

But it can be painful to have to manually provide every parameter explicitly when normally (when not using a dataclass) you'd just add *args and **kwargs to the init signature and call super().__init__(*args, **kwargs).
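
For comparison, a sketch of the ordinary (non-dataclass) pass-through pattern mentioned above, with the same assumed parent X:

```python
class X:
    def __init__(self, x, **kwargs):
        super().__init__(**kwargs)
        self.x = x


class Y(X):
    def __init__(self, a, *args, **kwargs):
        # Y never names the parent's parameters; it just forwards them.
        super().__init__(*args, **kwargs)
        self.a = a


y = Y(1, x=2)
print(y.a, y.x)
```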

To handle that case, couldn’t we just add InitVarStar and InitVarStarStar fields? If present, any extra positional and/or keyword args get captured for the __post_init__ to pass along just like normal InitVars.

I think that could definitely be useful, but not as useful as you seem to think it would be.

It becomes more painful the more parameters the parent has, parameters which the dataclass may not even care about. It not only makes the class definition long, it also adds these additional parameters to the init signature, which is icky for introspection and discoverability. Lots of "What the heck is this parameter doing here?" head scratching for future me (because I forget everything).

I think that’s backward. The signature is there for the user of the dataclass, not the implementer. And the user had better care about that x argument, because it’s a mandatory parameter of the X class, so if they don’t pass one, they’re going to get an exception from inside some class they never heard of. So having x show up in the signature would be helpful for introspection and discovery, not harmful. It makes your users ask “What the heck is the x parameter doing here?” but that’s a question that they’d better have an answer to or they can’t construct a Y instance. (And notice that X doesn’t take or pass along *args, so if Y claims to take *args as well as **kwargs, that’s even more misleading, because passing any extra positional args to the constructor will also raise.) And that’s as true for tools as for human readers: an IDE auto-completing the parameters of Y(…) should be prompting you for an x; a static analyzer should be catching that you forgot to pass an x; etc.
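
A sketch of what introspection reports in each style, using inspect.signature. Explicit and Opaque are hypothetical names for the two variants of Y, and X is the assumed parent with a mandatory x:

```python
import inspect
from dataclasses import InitVar, dataclass


class X:
    def __init__(self, x, **kwargs):
        super().__init__(**kwargs)
        self.x = x


@dataclass
class Explicit(X):
    a: int
    x: InitVar[int]

    def __post_init__(self, x):
        super().__init__(x=x)


class Opaque(X):
    def __init__(self, a, **kwargs):
        super().__init__(**kwargs)
        self.a = a


explicit_params = list(inspect.signature(Explicit).parameters)
opaque_params = list(inspect.signature(Opaque).parameters)
print(explicit_params)  # the required x is visible to users and tools
print(opaque_params)    # x is hidden behind **kwargs

# Forgetting x with the opaque signature fails inside X.__init__,
# a class the caller may never have looked at.
try:
    Opaque(a=1)
except TypeError as exc:
    opaque_error = exc
else:
    opaque_error = None
```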

There are cases where you need *args and/or **kwargs in a constructor. But I don’t think they make sense for a dataclass. For example, you’re not going to write a functools.partial replacement or a generic RPC proxy object as a dataclass.

But there are cases where you _want_ them, just out of… call it enlightened laziness. And those cases do seem to apply to dataclass at least as much as normal classes. And that’s why I think it could be worth having these new fields. The OP’s toy example looks like part of a design where the X is one of a bag of mixins from some library that you compose up however you want in your app.

You got it.

It would be more discoverable and readable if the final composed Y class knew which parameters its mixins demanded—but it may not be worth the effort to put that together (either manually, or programmatically at class def time). If your Y class is being exposed as part of a library (or RPC or bridge or whatever), or it forms part of the connection between two key components in an air traffic control system, then you probably do want to put in that effort. If it’s an internal class that only gets constructed in one place, in a tool for trolling each other with bad music in the office stereo that only you and three colleagues will ever run, why do the extra work to get earlier checking that you don’t need? The fact that Python leaves that kind of choice up to you to decide (because you’re the only one who knows), and so do most pythonic libraries like the one you got that X mixin out of… that’s a big part of why you wrote that script in Python in the first place. And if dataclasses get in the way of that, it’s a problem, and probably worth fixing.