[I'm sort of loose with the terms field, parameter, and argument here. Forgive me: I think it's still understandable. Also, I'm not specifying types here; I'm using Any everywhere. Use your imagination and substitute real types if it helps you.]

There have been many requests to add keyword-only fields to dataclasses. These fields would result in __init__ parameters that are keyword-only. As long as I'm doing this, I'd like to add positional-only fields as well.

Basically, I want to add a flag to each field, stating whether the field results in a normal parameter, a positional-only parameter, or a keyword-only parameter to __init__. Then when I'm generating __init__, I'll examine those flags and put the positional-only ones first, followed by the normal ones, followed by the keyword-only ones.

The trick becomes: how do you specify what type of parameter each field represents?

First, here's what attrs does. There's a parameter to their attr.ib() function (the moral equivalent of dataclasses.field()) named kw_only, which if set, marks the field as being keyword-only. From https://www.attrs.org/en/stable/examples.html#keyword-only-attributes :

    >>> @attr.s
    ... class A:
    ...     a = attr.ib(kw_only=True)
    ...
    >>> A()
    Traceback (most recent call last):
      ...
    TypeError: A() missing 1 required keyword-only argument: 'a'
    >>> A(a=1)
    A(a=1)

There's also a parameter to attr.s, also named kw_only, which if true marks every field as being keyword-only:

    >>> @attr.s(kw_only=True)
    ... class A:
    ...     a = attr.ib()
    ...     b = attr.ib()
    ...
    >>> A(1, 2)
    Traceback (most recent call last):
      ...
    TypeError: __init__() takes 1 positional argument but 3 were given
    >>> A(a=1, b=2)
    A(a=1, b=2)

In dataclasses, these examples become:

    >>> @dataclasses.dataclass
    ... class A:
    ...     a: Any = field(kw_only=True)


    >>> @dataclasses.dataclass(kw_only=True)
    ... class A:
    ...     a: Any
    ...     b: Any

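To make the intended effect concrete, here is a hand-written class whose __init__ is what I'd expect the class-level kw_only=True form to generate. This is only a sketch of the expected result, not output from dataclasses itself; the class is written out by hand so it runs without the proposed feature.

```python
from typing import Any


class A:
    """Hand-written stand-in for the proposed @dataclass(kw_only=True) class."""

    def __init__(self, *, a: Any, b: Any) -> None:
        # The bare * makes every parameter after it keyword-only.
        self.a = a
        self.b = b


obj = A(a=1, b=2)  # accepted: both fields passed by keyword

try:
    A(1, 2)  # rejected: the fields cannot be passed positionally
except TypeError as exc:
    rejected = str(exc)
```

The per-field form, field(kw_only=True), would produce the same kind of signature, just with only the flagged fields placed after the bare *.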
Aside from the name 'kw_only', which we can bikeshed about, I think these features are good, and I'd like to implement them as shown here.

But I'd like to do two other things: make it easier to use, and support positional-only fields. Since the second one is easier, let's tackle it first.

I'd do the same thing as kw_only, but name it something like pos_only. Again, we can argue about the name. Like kw_only, you can either specify individual fields as positional-only, or declare that every field is positional-only. It would be an error to specify both kw_only and pos_only.

As far as making it simpler: I dislike needing to use field(kw_only=True), although it would certainly work. The problem is that if you have 1 normal parameter, and 10 keyword-only ones, you'd be forced to say:

    @dataclasses.dataclass
    class A:
        a: Any
        b: Any = field(kw_only=True, default=0)
        c: Any = field(kw_only=True, default='foo')
        e: Any = field(kw_only=True, default=0.0)
        f: Any = field(kw_only=True)
        g: Any = field(kw_only=True, default=())
        h: Any = field(kw_only=True, default='bar')
        i: Any = field(kw_only=True, default=3+4j)
        j: Any = field(kw_only=True, default=10)
        k: Any = field(kw_only=True)

That's way too verbose for me.

Ideally, I'd like something like this example:

    @dataclasses.dataclass
    class A:
        a: Any
        # pragma: KW_ONLY
        b: Any
        # pragma: POS_ONLY
        c: Any

And then b would become a keyword-only field and c would be positional-only. But we need some way of telling dataclasses.dataclass what's going on, since obviously pragmas are out.

I propose the following. I'll add 2 (or 3, keep reading) singletons to the dataclasses module: KW_ONLY and POS_ONLY. When scanning the __annotations__ that define fields, fields with these types would be ignored, except for assigning the kw_only/pos_only/normal flag to fields declared after these singletons are used.
So you'd get:

    @dataclasses.dataclass
    class A:
        a: Any
        _: dataclasses.KW_ONLY
        b: Any
        __: dataclasses.POS_ONLY
        c: Any

This would generate:

    def __init__(self, c, /, a, *, b):

The names of the KW_ONLY and POS_ONLY fields don't matter, since they're discarded. But as you see above, they still need to be unique. I think _ is a fine name, and since KW_ONLY will be used much more than POS_ONLY, '_: dataclasses.KW_ONLY' would be the pythonic way of saying "the following fields are keyword-only".

I do think I'll add a third singleton to specify that subsequent fields are "normal" fields, neither keyword-only nor positional-only. I don't know that we have a name for such a thing, so let's call it NORMAL_ARG here and bikeshed it later. Then you could say:

    @dataclasses.dataclass
    class A:
        a: Any
        _: dataclasses.KW_ONLY
        b: Any
        __: dataclasses.POS_ONLY
        c: Any
        ___: dataclasses.NORMAL_ARG
        d: Any

Then a and d are "normal" fields, while b is keyword-only and c is positional-only. This would generate:

    def __init__(self, c, /, a, d, *, b):

I normally wouldn't propose adding NORMAL_ARG, but since the order of fields matters (for repr, comparisons, etc.) I figure it might be desirable to have their order there be different from the order they're declared. I could be talked out of NORMAL_ARG, and normal fields would then just always have to go first (although you could play games with inheritance to change that).

My "complex" example above would become:

    @dataclasses.dataclass
    class A:
        a: Any
        _: dataclasses.KW_ONLY
        b: Any = 0
        c: Any = 'foo'
        e: Any = 0.0
        f: Any
        g: Any = ()
        h: Any = 'bar'
        i: Any = 3+4j
        j: Any = 10
        k: Any

Which I think is a lot better.

There are a few additional quirks involving inheritance, but the behavior would follow naturally from how dataclasses already does inheritance. I can address that later when I work on the docs for this.

Remember, the only point of all of these hoops is to add a flag to each field saying what type of __init__ argument it becomes: positional-only, normal, or keyword-only.
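The scanning and reordering described above can be sketched in a few lines. This is a rough illustration under stated assumptions, not how dataclasses would actually implement it: the _Marker class, the module-level KW_ONLY / POS_ONLY / NORMAL_ARG objects, and the init_signature() helper are all hypothetical stand-ins defined here, and inheritance is ignored. Only inspect.Parameter and inspect.Signature are real stdlib APIs.

```python
import inspect
from typing import Any


class _Marker:
    """Hypothetical stand-in for the proposed sentinel singletons."""

    def __init__(self, name: str) -> None:
        self.name = name

    def __repr__(self) -> str:
        return self.name


KW_ONLY = _Marker("KW_ONLY")
POS_ONLY = _Marker("POS_ONLY")
NORMAL_ARG = _Marker("NORMAL_ARG")

# Map each marker to the parameter kind its fields should become.
_KINDS = {
    POS_ONLY: inspect.Parameter.POSITIONAL_ONLY,
    NORMAL_ARG: inspect.Parameter.POSITIONAL_OR_KEYWORD,
    KW_ONLY: inspect.Parameter.KEYWORD_ONLY,
}


def init_signature(cls: type) -> inspect.Signature:
    """Build the __init__ signature implied by the markers in cls (no inheritance)."""
    current = NORMAL_ARG  # fields before any marker are "normal"
    groups = {POS_ONLY: [], NORMAL_ARG: [], KW_ONLY: []}
    for name, annotation in cls.__annotations__.items():
        if isinstance(annotation, _Marker):
            current = annotation  # marker fields only switch the flag
            continue
        default = cls.__dict__.get(name, inspect.Parameter.empty)
        groups[current].append(
            inspect.Parameter(name, _KINDS[current], default=default)
        )
    # self must be positional-only if any positional-only field follows it.
    self_kind = (
        inspect.Parameter.POSITIONAL_ONLY
        if groups[POS_ONLY]
        else inspect.Parameter.POSITIONAL_OR_KEYWORD
    )
    params = [inspect.Parameter("self", self_kind)]
    # Positional-only first, then normal, then keyword-only.
    params += groups[POS_ONLY] + groups[NORMAL_ARG] + groups[KW_ONLY]
    return inspect.Signature(params)


class A:
    a: Any
    _: KW_ONLY
    b: Any
    __: POS_ONLY
    c: Any
    ___: NORMAL_ARG
    d: Any


print(init_signature(A))  # (self, c, /, a, d, *, b)
```

Declaration order is preserved within each group, which is the part that would still matter for repr and comparisons.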
So, what do you think? Is this a horrible idea? Should it be a PEP, or just a 'simple' feature addition to dataclasses? I'm worried that if I have to do a full-blown PEP I won't get to this for 3.10.

I should mention another idea that showed up on python-ideas, at https://mail.python.org/archives/list/python-ideas@python.org/message/WBL4X4... . It would allow you to specify the flag via code like:

    @dataclasses.dataclass
    class Parent:
        with dataclasses.positional():
            a: int
        c: bool = False
        with dataclasses.keyword():
            e: list

I'm not crazy about it, and it looks like it would require stack inspection to get it to work, but I mention it here for completeness.

One last thought: mypy and other type checkers would need to be taught about all of this.

-- Eric