dataclasses keyword-only fields, take 2

[I'm sort of loose with the terms field, parameter, and argument here. Forgive me: I think it's still understandable. Also I'm not specifying types here, I'm using Any everywhere. Use your imagination and substitute real types if it helps you.]

Here's version 2 of my proposal:

There have been many requests to add keyword-only fields to dataclasses. These fields would result in __init__ parameters that are keyword-only.

In a previous proposal, I suggested also including positional arguments for dataclasses. That proposal is at https://mail.python.org/archives/list/python-ideas@python.org/message/I3RKK4... . After some discussion, I think it's clear that positional arguments aren't going to work well with dataclasses. The deal breaker for me is that the generated repr would either not work with eval(), or it would contain fields without names (since they're positional). There are additional concerns mentioned in that thread. Accordingly, I'm going to drop positional arguments from this proposal.

Basically, I want to add a flag to each field, stating whether the field results in a normal parameter or a keyword-only parameter to __init__. Then when I'm generating __init__, I'll examine those flags and put the normal arguments first, followed by the keyword-only ones.

The trick becomes: how do you specify what type of parameter each field represents?

What attrs does
---------------

First, here's what attrs does. There's a parameter to their attr.ib() function (the moral equivalent of dataclasses.field()) named kw_only, which if set, marks the field as being keyword-only. From https://www.attrs.org/en/stable/examples.html#keyword-only-attributes :
There's also a parameter to attr.s (the equivalent of dataclasses.dataclass), also named kw_only, which if true marks every field as being keyword-only:
dataclasses proposal
--------------------

I propose to adopt both of these methods (dataclass(kw_only=True) and field(kw_only=True)) in dataclasses. The above example would become:
But, I'd also like to make this a little easier to use, especially in the case where you're defining a dataclass that has some normal fields and some keyword-only fields. Using the attrs approach, you'd need to declare the keyword-only fields using the "=field(kw_only=True)" syntax, which I think is needlessly verbose, especially when you have many keyword-only fields.

The problem is that if you have 1 normal parameter and 10 keyword-only ones, you'd be forced to say:

    @dataclasses.dataclass
    class LotsOfFields:
        a: Any
        b: Any = field(kw_only=True, default=0)
        c: Any = field(kw_only=True, default='foo')
        d: Any = field(kw_only=True)
        e: Any = field(kw_only=True, default=0.0)
        f: Any = field(kw_only=True)
        g: Any = field(kw_only=True, default=())
        h: Any = field(kw_only=True, default='bar')
        i: Any = field(kw_only=True, default=3+4j)
        j: Any = field(kw_only=True, default=10)
        k: Any = field(kw_only=True)

That's way too verbose for me.

Ideally, I'd like something like this example:

    @dataclasses.dataclass
    class A:
        a: Any
        # pragma: KW_ONLY
        b: Any

And then b would become a keyword-only field, while a is a normal field. But we need some way of telling dataclasses.dataclass what's going on, since obviously pragmas are out.

I propose the following. I'll add a singleton to the dataclasses module: KW_ONLY. When scanning the __annotations__ that define the fields, a field with this type would be ignored, except for assigning the kw_only flag to fields declared after this singleton is used. So you'd get:

    @dataclasses.dataclass
    class B:
        a: Any
        _: dataclasses.KW_ONLY
        b: Any

This would generate:

    def __init__(self, a, *, b):

This example is equivalent to:

    @dataclasses.dataclass
    class B:
        a: Any
        b: Any = field(kw_only=True)

The name of the KW_ONLY field doesn't matter, since it's discarded. I think _ is a fine name, and '_: dataclasses.KW_ONLY' would be the pythonic way of saying "the following fields are keyword-only".
My LotsOfFields example above would become:

    @dataclasses.dataclass
    class LotsOfFields:
        a: Any
        _: dataclasses.KW_ONLY
        b: Any = 0
        c: Any = 'foo'
        d: Any
        e: Any = 0.0
        f: Any
        g: Any = ()
        h: Any = 'bar'
        i: Any = 3+4j
        j: Any = 10
        k: Any

Which I think is a lot clearer.

The generated __init__ would look like:

    def __init__(self, a, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k):

The idea is that all normal argument fields would appear first in the class definition, then all keyword argument fields. This is the same requirement as in a function definition. There would be no switching back and forth between the two types of fields: once you use KW_ONLY, all subsequent fields are keyword-only. A field of type KW_ONLY can appear only once in a particular dataclass (but see the discussion below about inheritance).

Re-ordering args in __init__
----------------------------

If, using field(kw_only=True), you specify keyword-only fields before non-keyword-only fields, all of the keyword-only fields will be moved to the end of the __init__ argument list. Within the list of non-keyword-only arguments, all arguments will keep the same relative order as in the class definition. Ditto for within keyword-only arguments.

So:

    @dataclasses.dataclass
    class C:
        a: Any
        b: Any = field(kw_only=True)
        c: Any
        d: Any = field(kw_only=True)

Then the generated __init__ will look like:

    def __init__(self, a, c, *, b, d):

__init__ is the only place where this rearranging will take place. Everywhere else, and importantly in __repr__ and any dunder comparison methods, the order will be the same as it is now: in field declaration order.

This is the same behavior that attrs uses.

Inheritance
-----------

There are a few additional quirks involving inheritance, but the behavior would follow naturally from how dataclasses already handles fields via inheritance and the __init__ argument re-ordering discussed above.

Basically, all fields in a derived class are computed like they are today.
Then any __init__ argument re-ordering will take place, as discussed above.

Consider:

    @dataclasses.dataclass(kw_only=True)
    class D:
        a: Any

    @dataclasses.dataclass
    class E(D):
        b: Any

    @dataclasses.dataclass(kw_only=True)
    class F(E):
        c: Any

This will result in the __init__ signature of:

    def __init__(self, b, *, a, c):

However, the repr() will still produce the fields in order a, b, c. Comparisons will also use the same order.

Conclusion
----------

Remember, the only point of all of these hoops is to add a flag to each field saying what type of __init__ argument it becomes: normal or keyword-only. Any of the 3 methods discussed above (kw_only flag to @dataclass(), kw_only flag to field(), or the KW_ONLY marker) all have the same result: setting the kw_only flag on one or more fields.

The value of that flag, on a per-field basis, is used to re-order __init__ arguments, and is used in generating the __init__ signature. It's not used anywhere else.

I expect the two most common use cases to be the kw_only flag to @dataclass() and the KW_ONLY marker. I would expect the usage of the kw_only flag on field() to be rare, but since it's the underlying mechanism and it's needed for more complex field layouts, it is included in this proposal.

So, what do you think? Is this a horrible idea? Should it be a PEP, or just a 'simple' feature addition to dataclasses? I'm worried that if I have to do a full blown PEP I won't get to this for 3.10.

mypy and other type checkers would need to be taught about all of this.

--
Eric

Good proposal! I have a few questions. On Mon, Mar 15, 2021 at 2:22 PM Eric V. Smith <eric@trueblade.com> wrote:
Can you be specific and show what the repr() would be? E.g. if I create C(1, 2, b=3, d=4) the repr() would be C(a=1, b=3, c=2, d=4), right?
> This is the same behavior that attrs uses.
Nevertheless I made several typos trying to make the examples in my sentence above correct. Perhaps we could instead disallow mixing kw-only and regular args? Do you know why attrs does it this way?
This can be simulated by flattening the inheritance tree and adding explicit field(kw_only=True) to all fields of classes using kw_only=True in the class decorator as well as all fields affected by _: KW_ONLY, right? So the above would behave like this:

    @dataclasses.dataclass
    class F:
        a: Any = field(kw_only=True)
        b: Any
        c: Any = field(kw_only=True)

which IIUC indeed gives the same __init__ signature and repr().
I don't think it is very controversial, do you? Then again maybe you should ask a SC member if they would object.

> mypy and other type checkers would need to be taught about all of this.
Yeah, that's true. But the type checkers have bigger fish to fry (e.g. pattern matching). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 3/15/2021 7:45 PM, Guido van Rossum wrote: this issue. My inclination would be to allow it as I described here, but suggest against it in the documentation.
Correct. I was just trying to show that you can get in to the same mess with alternating normal and kwarg fields (and the subsequent re-ordering of them in __init__) just by using inheritance, and only using dataclass(kw_only=True). That is, if you don't use field(kw_only=True) or KW_ONLY, you still can have the re-ordering issue via inheritance. I should probably show these examples using multiple inheritance of possibly unrelated base classes, instead of showing a chain of single inheritance.
Well, the version with positional arguments seemed pretty controversial! But I agree that this particular version of it is benign and gives many people (including me!) what they want. I'll reach out to some SC members once I get this firmed up.
I think this wouldn't be too difficult for type checkers to get right, I just wanted to note that they're impacted. Eric
-- Eric V. Smith

And now I have a question for you, Guido.

I'm looking at the code and I see the additions for __match_args__. Is there any bad interaction between this proposal and the match statement? I assume __match_args__ would be the re-ordered arguments to __init__, but I want to make sure.

So this:

    @dataclasses.dataclass
    class C:
        a: Any
        b: Any = field(kw_only=True)
        c: Any
        d: Any = field(kw_only=True)

Which generates:

    def __init__(self, a, c, *, b, d):

Would have __match_args__ equal to ('a', 'c', 'b', 'd'), right? Even though the repr would have fields in order a, b, c, d.

Eric

On 3/15/2021 7:45 PM, Guido van Rossum wrote:
-- Eric V. Smith

That's a really good question. I think that `__match_args__` should *only* contain the non-kw-only args, so that you would have to write `case C(a, c, b=b, d=d)` to match on all four attributes (but typically you'd probably just write `case C(a, c)` -- that's about the same as `case C(a, c, b=_, d=_)` except it doesn't even care whether b and d are set at all).

On Tue, Mar 16, 2021 at 1:36 PM Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 3/16/2021 7:01 AM, Simão Afonso @ Powertools Tech wrote:
I'd like to avoid field() as much as possible. I think it's just too easy to miss what it's doing, since it has many arguments. And I'd like to make it easy to have a style that encourages all non-keyword-only fields to come first. Eric
-- Eric V. Smith

On Tue, Mar 16, 2021 at 9:23 AM Eric V. Smith <eric@trueblade.com> wrote:
Here's another option I just thought of, that might have some strengths:

    @dataclasses.dataclass
    class LotsOfFields:
        # first member assumed to be kw_or_posit unless marked otherwise
        a: Any
        # start kw only here because `mark= dataclasses.KW_ONLY`
        b: Any = field(default=0, mark=dataclasses.KW_ONLY)
        # all members following above line continue to be kw_only
        c: Any = 'foo'
        d: Any
        e: Any = 0.0
        f: Any
        g: Any = ()
        h: Any = 'bar'
        i: Any = 3+4j
        j: Any = 10
        k: Any

I changed the kw argument name from `kw_only` to `mark`.

The idea is that every supplied member (above, members c through k) is assumed to have the same `mark` as the most recently explicitly designated field (above, member b). Or, if there has not yet been an explicitly designated `mark` for a member, the first field (above, member a) is assumed to be `dataclasses.KW_OR_POSIT` (or whatever other singleton name makes sense).

In this way we only have to use the field() call once, we can go back and forth, we eliminate a line like `_: dataclasses.KW_ONLY` that to some people might be a bit inscrutable, and we leave the door open to support positional arguments in the future (perhaps through something like `mark= dataclasses.KW_OR_POSIT`) if it becomes desirable for some reason.

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Tue, Mar 16, 2021 at 10:03 AM Eric V. Smith <eric@trueblade.com> wrote:
I think I prefer it, too. But what about using something like `mark=dataclasses.KW_ONLY` rather than `kw_only=True` (in the field constructor)? That way it isn't an on/off switch, and it leaves the door open for positional arguments. Is that door to be totally closed? I'm not here to argue in favor of it necessarily, just asking whether we really want to close the door more tightly against it than we have to.
--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Tue, Mar 16, 2021 at 11:59 AM Eric V. Smith <eric@trueblade.com> wrote:
Here I am only trying to help by pointing out a few things that might be worth considering. However I am FULLY willing to be told "Rick: we've been designing language features a long time and, trust me, none of this is a big deal." :)

It doesn't prohibit it, but choosing `kw_only` as the argument name (in both dataclasses.dataclass and dataclasses.field) semantically limits it to only being used as an on/off switch for a kw-only setting. Later, if it became desirable to add positional only, you'd need to add another separate on/off setting for it (say, `posit_only`) and then you'd end up in situations where the flag combos could be in conflict, and you have to check for that conflict:

    kw_only=True, posit_only=True

The above would be an invalid combination. So it doesn't prohibit it but it makes it a bit harder to add it later.

And is there another possible "setting".... possibly a bit exotic, maybe something that would end up being specific to dataclasses... that would also exclude OR require one or both of either/or kw_only and posit_only...? I am not sure, but maybe there is. And in that case you'd end up with a third on/off switch, and some other set of possibly ponderous checking:

    kw_only=True, exotic_setting=True
    posit_only=True, exotic_setting=True

---
Ricky.

"I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 3/16/2021 12:16 PM, Ricky Teachey wrote: this proposal that I've modified it to become this one.
That's what my previous proposal suggested, and I still think it's okay.

I think "kw_only=True" is clearer than "mark=dataclasses.KW_ONLY". I don't think field(kw_only=True) will be used very often, and I think field(posit_only=True) would be used almost never. So I'm going to optimize the API for the more common usage.

And I don't mind checking for invalid combinations. There's already plenty of that in dataclasses, just check out the charts at the top of dataclasses.py and look for "raise". There's also lots of this checking already in Python:

    open('foo', 'b', encoding='utf-8')

Gives:

    ValueError: binary mode doesn't take an encoding argument

Eric
-- Eric V. Smith
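The kind of invalid-combination checking mentioned above can be observed without touching any files. This is a rough sketch: the helper `error_message` and the class `Bad` are illustrative names, and mode 'rb' is used here in place of the bare 'b' from the message.

```python
import dataclasses


def error_message(func):
    """Call func and return the ValueError message, or None if none is raised."""
    try:
        func()
    except ValueError as exc:
        return str(exc)
    return None


# open() validates its arguments before it opens anything, so 'foo' need not exist.
open_error = error_message(lambda: open('foo', 'rb', encoding='utf-8'))


def make_bad_dataclass():
    # order=True requires eq=True; dataclass() rejects this combination.
    @dataclasses.dataclass(eq=False, order=True)
    class Bad:
        x: int


dataclass_error = error_message(make_bad_dataclass)

print(open_error)
print(dataclass_error)
```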

On 2021-03-16 06:20, Eric V. Smith wrote:
From my perspective it's quite the contrary. `field` is an actual function call and its arguments may affect its behavior in the way that arguments generally affect function calls. This thing with a dummy attribute holding a magic value whose sequential position in the class body influences *different* attributes based on their *relative* sequential position. . . I find that much more confusing and strange. I think Simão's version where you give a class-level default for keyword-only-ness and then override it with field() arguments where necessary is much cleaner. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On 3/16/21 1:19 PM, Brendan Barnwell wrote:
On 2021-03-16 06:20, Eric V. Smith wrote:
I agree with Eric. The class body is basically a vertical representation of the horizontal `__init__` header. As such, `KW_ONLY` and (maybe in the future) POS_ONLY match very nicely. -- ~Ethan~

Why don’t we also add dataclasses.POS_ONLY as well? Anything before it is a positional argument.

    @dataclasses.dataclass
    class LotsOfFields:
        x: Any
        y: Any
        z: Any
        _: dataclasses.POS_ONLY
        a: Any
        __: dataclasses.KW_ONLY
        b: Any = 0
        c: Any = 'foo'
        d: Any
        e: Any = 0.0
        f: Any
        g: Any = ()
        h: Any = 'bar'
        i: Any = 3+4j
        j: Any = 10
        k: Any

The generated __init__ would look like:

    def __init__(self, x, y, z, /, a, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k):

Similar to dataclasses.KW_ONLY, you can use dataclasses.POS_ONLY only once. It should always come before dataclasses.KW_ONLY. Anything else should raise an error. If dataclasses.POS_ONLY came right after the class declaration (no instance attribute preceding it), an error is generated similar to what we have right now.

Inheritance is built up (from super class to sub class) from three groups of arguments: pos_only arguments together, then mixed arguments, and then kw_only arguments. repr should work the same way. We would like to reconstruct the object from repr. We don’t want too much change from current behavior.

Let's create a class that inherits from the above class..

    @dataclasses.dataclass
    class LessFields(LotsOfFields):
        x2: Any
        _: dataclasses.POS_ONLY
        a2: Any
        __: dataclasses.KW_ONLY
        b2: Any

The generated __init__ would look like:

    def __init__(self, x, y, z, x2, /, a, a2, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k, b2):

Another class..

    @dataclasses.dataclass
    class EvenLessFields(LessFields):
        a3: Any
        _: dataclasses.KW_ONLY
        b3: Any = 9

The generated __init__ would look like:

    def __init__(self, x, y, z, x2, /, a, a2, a3, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k, b2, b3=9):

All arguments are ordered from the topmost super class down to the subclass, each within its respective group of arguments.

Abdulla

This was discussed on this thread: https://mail.python.org/archives/list/python-ideas@python.org/message/I3RKK4...

Ultimately the thing that doomed it for me was the repr that would be required for eval(repr(dataclass_instance)) to work. You'd have to drop the field name, combined with re-ordering the parameters. It just seemed ugly, and it's not a battle I'm willing to fight while trying to get keyword-only fields into 3.10. But I believe this current proposal is a subset of that one, and it could be added in the future if demand is high.

Also, note that attrs doesn't support this, and I don't think we need to be blazing a trail here.

Eric

On 3/16/2021 12:17 PM, Abdulla Al Kathiri wrote:
-- Eric V. Smith
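The eval(repr(...)) concern can be illustrated with a hand-written class. This is only a sketch: dataclasses has no positional-only support, so `PosOnly` is a hypothetical stand-in whose `__repr__` imitates the all-fields-named style of the generated repr.

```python
class PosOnly:
    """Hand-written stand-in for a dataclass with a positional-only field x."""

    def __init__(self, x, /, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Dataclass-style repr: every field is written as name=value.
        return f'PosOnly(x={self.x!r}, y={self.y!r})'


p = PosOnly(1, 2)

try:
    # The repr passes x by keyword, which a positional-only parameter rejects.
    eval(repr(p), {'PosOnly': PosOnly})
    round_trips = True
except TypeError:
    round_trips = False

print(repr(p))
print(round_trips)
```

Making the round trip work would mean emitting `PosOnly(1, y=2)` instead, which is the unnamed-field repr objected to above.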

A suggestion for ease of extension in the future: replace `field` parameter `kw_only` with `init_type`.

As `field` is expected to be used less than `_: dataclasses.KW_ONLY` and `dataclass(kw_only=True)`, it can afford to be more verbose. On top of that, if you want to add positional-only fields in the future, your current proposal for keyword-only fields would mean you'd have to add another parameter to `field` (e.g. `pos_only`).

My suggestion is to instead make the parameter to `field` become `init_type`, which accepts an `enum` member (called `FieldInitParamType`, with members `normal` (or `positional_or_keyword`) and `keyword_only` (and in the future, `positional_only`)). This makes the parameter specifically about the type of the parameter for the data-class's `__init__`, and allows for any number of parameter types in the future without having to add new parameters to `field`.

Laurie

On 3/17/2021 8:07 AM, Laurie O wrote:
I think kw_only=True is clearer as to the intent when reading the code. And it doesn't preclude adding pos_only=True. It's not hard to add parameters to field(). I also think it's extremely unlikely we'll ever add positional arguments, or any other kind of argument (not sure what that would be, but anything's possible in the future!). I sort of like having "init" in the parameter name, but then again there are no other member functions added by dataclasses that deal with fields as parameters, so there's really nothing else that kw_only could refer to. I guess to be more explicit it would be "init_arg_type". And there's some value in doing what attrs has done, although I'll grant that I've veered from what attrs has done before (and even did so in this very proposal with dataclasses.KW_ONLY). But they have a lot of experience with it, and I've looked for and not seen any complaints or confusion. As to using actual Enums, it's been avoided because of the expense of importing the enum module. Although I'll admit I haven't seen timings of that recently, so maybe that's not a valid argument any more. Something to think about: how would positional arguments work with pattern matching? Eric

I've created https://bugs.python.org/issue43532 for this. PR is https://github.com/python/cpython/pull/24909. I still need to add a few more tests and update the documentation. Eric On 3/15/2021 5:18 PM, Eric V. Smith wrote:
-- Eric V. Smith

Good proposal! I have a few questions. On Mon, Mar 15, 2021 at 2:22 PM Eric V. Smith <eric@trueblade.com> wrote:
Can you be specific and show what the repr() would be? E.g. if I create C(1, 2, b=3, d=4) the repr() be C(a=1, b=3, c=2, d=4), right?
This is the same behavior that attrs uses.
Nevertheless I made several typos trying to make the examples in my sentence above correct. Perhaps we could instead disallow mixing kw-only and regular args? Do you know why attrs does it this way?
This can be simulated by flattening the inheritance tree and adding explicit field(kw_only=True) to all fields of classes using kw_only=True in the class decorator as well as all fields affected by _: KW_ONLY, right? So the above would behave like this: @dataclasses.dataclass class F: a: Any = field(kw_only=True) b: Any c: Any = field(kw_only=True) which IIUC indeed gives the same __init__ signature and repr().
I don't think it is very controversial, do you? Then again maybe you should ask a SC member if they would object. mypy and other type checkers would need to be taught about all of this.
Yeah, that's true. But the type checkers have bigger fish to fry (e.g. pattern matching). -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 3/15/2021 7:45 PM, Guido van Rossum wrote: this issue. My inclination would be to allow it as I described here, but suggest against it in the documentation.
Correct. I was just trying to show that you can get in to the same mess with alternating normal and kwarg fields (and the subsequent re-ordering of them in __init__) just by using inheritance, and only using dataclass(kw_only=True). That is, if you don't use field(kw_only=True) or KW_ONLY, you still can have the re-ordering issue via inheritance. I should probably show these examples using multiple inheritance of possibly unrelated base classes, instead of showing a chain of single inheritance.
Well, the version with positional arguments seemed pretty controversial! But I agree that this particular version of it is benign and gives many people (including me!) what they want. I'll reach out to some SC members once I get this firmed up.
I think this wouldn't be too difficult for type checkers to get right, I just wanted to note that they're impacted. Eric
-- Eric V. Smith

And now I have a question for you, Guido. I'm looking at the code and I see the additions for __match_args__. Is there any bad interaction between this proposal and the match statement? I assume __match_args__ be the re-ordered arguments to __init__, but I want to make sure. So this: @dataclasses.dataclass class C: a: Any b: Any = field(kw_only=True) c: Any d: Any = field(kw_only=True) Which generates: def __init__(self, a, c, *, b, d): Would have __match_args__ equal to ('a', 'c', 'b', 'd'), right? Even though the repr would have fields in order a, b, c, d. Eric On 3/15/2021 7:45 PM, Guido van Rossum wrote:
-- Eric V. Smith

That's a really good question. I think that `__match_args__` should *only* contain the non-kw-only args, so that you would have to write `case C(a, c, b=b, d=d)` to match on all four attributes (but typically you'd probably just write `case C(a, c)` -- that's about the same as `case C(a, c, b=_, d=_)` except it doesn't even care whether b and d are set at all. On Tue, Mar 16, 2021 at 1:36 PM Eric V. Smith <eric@trueblade.com> wrote:
-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

On 3/16/2021 7:01 AM, Simão Afonso @ Powertools Tech wrote:
I'd like to avoid field() as much as possible. I think it's just too easy to miss what it's doing, since it has many arguments. And I'd like to make it easy to have a style that encourages all non-keyword-only fields to come first. Eric
-- Eric V. Smith

On Tue, Mar 16, 2021 at 9:23 AM Eric V. Smith <eric@trueblade.com> wrote:
Here's another option I just thought of, that might have some strengths: @dataclasses.dataclass class LotsOfFields: # first member assumed to be kw_or_posit unless marked otherwise a: Any # start kw only here because `mark= dataclasses.KW_ONLY` b: Any = field(default=0, mark=dataclasses.KW_ONLY) # all members following above line continue to be kw_only c: Any = 'foo' d: Any e: Any = 0.0 f: Any g: Any = () h: Any = 'bar' i: Any = 3+4j j: Any = 10 k: Any I changed the kw argument name from `kw_only` to `mark`. The idea is that every supplied member (above, members c through k) is assumed to have the same `mark` as the most recently explicitly designated field (above, member b). Or, if there has not yet been an explicitly designated `mark` for a member, the first field (above, member a) is assumed to be `dataclasses.KW_OR_POSIT` (or whatever other singleton name makes sense). In this way we only have to use the field() call once, we can go back and forth, we eliminate a line like `_: dataclasses.KW_OPTIONAL` that to some people might be a bit inscrutable, and we leave the door open to support positional arguments in the future (perhaps through something like `mark= dataclasses.KW_OR_POSIT`) if it becomes desirable for some reason. --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Tue, Mar 16, 2021 at 10:03 AM Eric V. Smith <eric@trueblade.com> wrote:
I think I prefer it, too. But what about using something like `mark=dataclasses.KW_ONLY` rather than `kw_only=True` (in the field constructor)? So it isn't an on/off switch and as to leave open the door for positional arguments? Is that door to be totally closed? I'm not here to argue in favor of it necessarily, just asking whether we really want to close the door more tightly against it than we have to.
--- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On Tue, Mar 16, 2021 at 11:59 AM Eric V. Smith <eric@trueblade.com> wrote:
Here I am only trying to help by pointing out a few things that might be worth considering. However I am FULLY willing to be told "Rick: we've been designing language features a long time and, trust me, none of this a big deal." :) It doesn't prohibit it, but choosing `kw_only` as the argument name (in both dataclasses.dataclass and dataclasses.field) semantically limits it to only being used as an on/off switch for a kw-only setting. Later, if it became desirable to add positional only, you'd need to add another separate on/off setting for it (say, `posit_only`) and then you'd end up in situations where the flag combos could be in conflict, and you have to check for that conflict: kw_only=True, posit_only=True The above would be an invalid combination. So it doesn't prohibit it but it makes it a bit harder to add it later. And is there another possible "setting".... possibly a bit exotic, maybe something that would end up being specific to dataclasses... that would also exclude OR require one or both of either/or kw_only and posit_only...? I am not sure, but maybe there is. And in that case you'd end up with a third on/off switch, and some other set of possibly ponderous checking: kw_only=True, exotic_setting=True posit_only=True, exotic_setting=True --- Ricky. "I've never met a Kentucky man who wasn't either thinking about going home or actually going home." - Happy Chandler

On 3/16/2021 12:16 PM, Ricky Teachey wrote: this proposal that I've modified it to become this one.
That's what my previous proposal suggested, and I still think it's okay. I think "kw_only=True" is clearer than "mark=dataclasses.KW_ONLY". I don't think field(kw_only=True) will be used very often, and I think field(posit_only=True) would be used almost never. So I'm going to optimize the API for the more common usage. And I don't mind checking for invalid combinations. There's already plenty of that in dataclasses, just check out the charts at the top of dataclasses.py and look for "raise". There's also lots of this checking already in Python: open('foo', 'b', encoding='utf-8') Gives: ValueError: binary mode doesn't take an encoding argument Eric
-- Eric V. Smith

On 2021-03-16 06:20, Eric V. Smith wrote:
From my perspective it's quite the contrary. `field` is an actual function call and its arguments may affect its behavior in the way that arguments generally affect function calls. This thing with a dummy attribute holding a magic value whose sequential position in the class body influences *different* attributes based on their *relative* sequential position. . . I find that much more confusing and strange. I think Simão's version where you give a class-level default for keyword-only-ness and then override it with field() arguments where necessary is much cleaner. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

On 3/16/21 1:19 PM, Brendan Barnwell wrote:
On 2021-03-16 06:20, Eric V. Smith wrote:
I agree with Eric. The class body is basically a vertical representation of the horizontal `__init__` header. As such, `KW_ONLY` and (maybe in the future) POS_ONLY match very nicely. -- ~Ethan~

Why don’t we also add dataclasses.POS_ONLY as well? Anything before it is positional argument. @dataclasses.dataclass class LotsOfFields: x: Any y: Any z: Any _:dataclasses.POS_ONLY a: Any __: dataclasses.KW_ONLY b: Any = 0 c: Any = 'foo' d: Any e: Any = 0.0 f: Any g: Any = () h: Any = 'bar' i: Any = 3+4j j: Any = 10 k: Any The generated __init__ would look like: def __init__(self, x, y, z, /, a, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k): Similar to dataclasses.KW_ONLY, you can use dataclasses.POS_ONLY only once. It should always come before dataclasses.KW_ONLY. Anything else, it should raise an error. If dataclasses.POS_ONLY came right after the class declaration (no instance attribute preceding it), an error is generated similar to what we have right now. Inheritance is build up (from super class to sub class) of three groups of arguments: pos_only arguments together, then mixed arguments, and then kw_only arguments. repr should work the same way. We would like to reconstruct the object from repr. We don’t want too much change from current behavior. Lets create a class that inherits from the above class.. @dataclasses.dataclass class LessFields(LotsOfFields): x2: Any _: dataclasses.POS_ONLY a2: Any __:dataclasses.KW_ONLY b2: Any The generated __init__ would look like: def __init__(self, x, y, z, x2, /, a, a2, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k, b2): Another class.. @dataclasses.dataclass Class EvenLessFields(LessFields): a3: Any _: dataclasses.KW_ONLY b3: Any = 9 The generated __init__ would look like: def __init__(self, x, y, z, x2, /, a, a2, a3, *, b=0, c='foo', d, e=0.0, f, g=(), h='bar', i=3+4j, j=10, k, b2, b3=9): All arguments are ordered from the super super class until the subclass all in their respective path or group of arguments. Abdulla

This was discussed on this thread: https://mail.python.org/archives/list/python-ideas@python.org/message/I3RKK4...

Ultimately the thing that doomed it for me was the repr that would be required for eval(repr(dataclass_instance)) to work. You'd have to drop the field name, combined with re-ordering the parameters. It just seemed ugly, and it's not a battle I'm willing to fight while trying to get keyword-only in to 3.10. But I believe this current proposal is a subset of that one, and it could be added in the future if demand is high.

Also, note that attrs doesn't support this, and I don't think we need to be blazing a trail here.

Eric

On 3/16/2021 12:17 PM, Abdulla Al Kathiri wrote:
-- Eric V. Smith
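
The repr concern can be reproduced today without any new dataclasses feature. In this sketch the `__init__` is written by hand with a positional-only parameter (an assumption standing in for a positional-only field), while the repr is the one dataclasses normally generates; `Point` and `eval_repr_error` are illustrative names.

```python
import dataclasses
from typing import Any


@dataclasses.dataclass(init=False)
class Point:
    x: Any
    a: Any

    # Hand-written __init__: x is positional-only, a is a normal parameter.
    def __init__(self, x, /, a):
        self.x = x
        self.a = a


def eval_repr_error(obj):
    """Return the TypeError raised by eval(repr(obj)), or None if it succeeds."""
    try:
        eval(repr(obj), {type(obj).__name__: type(obj)})
    except TypeError as exc:
        return exc
    return None


p = Point(1, a=2)
print(repr(p))             # the generated repr names every field, including x
print(eval_repr_error(p))  # a TypeError: x cannot be passed by keyword
```

For the round trip to work, the repr would have to emit `x` without its name, which is the unnamed-field output the message objects to.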

A suggestion for ease of extension in the future: replace the `field` parameter `kw_only` with `init_type`. As `field` is expected to be used less than `_: dataclasses.KW_ONLY` and `dataclass(kw_only=True)`, it can afford to be more verbose. On top of that, if you want to add positional-only fields in the future, your current proposal for keyword-only fields would mean you'd have to add another parameter to `field` (e.g. `pos_only`).

My suggestion is to instead name the parameter to `field` `init_type` and have it accept an `enum` member (the enum being called `FieldInitParamType`, with members `normal` (or `positional_or_keyword`) and `keyword_only`, and in the future `positional_only`). This makes the parameter specifically about the type of the parameter for the data-class's `__init__`, and allows for any number of parameter types in the future without having to add new parameters to `field`.

Laurie
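
A rough sketch of how the enum-based flag could look. `field` has no `init_type` parameter, so the value is carried in the existing `metadata` mapping purely for illustration; `FieldInitParamType`, `init_type_of` and `init_parameter_order` are hypothetical names, and the `__init__` that dataclasses actually generates for `Example` is unaffected.

```python
import dataclasses
import enum
from typing import Any


class FieldInitParamType(enum.Enum):
    # Hypothetical enum from the suggestion; not part of dataclasses.
    normal = enum.auto()
    keyword_only = enum.auto()


def init_type_of(f):
    """Read the hypothetical init_type from a field's metadata (default: normal)."""
    return f.metadata.get('init_type', FieldInitParamType.normal)


def init_parameter_order(cls):
    """Split field names into (normal, keyword-only), the order __init__ would use."""
    fields = dataclasses.fields(cls)
    normal = [f.name for f in fields
              if init_type_of(f) is FieldInitParamType.normal]
    kw_only = [f.name for f in fields
               if init_type_of(f) is FieldInitParamType.keyword_only]
    return normal, kw_only


@dataclasses.dataclass
class Example:
    a: Any
    b: Any = dataclasses.field(
        default=0, metadata={'init_type': FieldInitParamType.keyword_only})
    c: Any = 'foo'


print(init_parameter_order(Example))
```

Adding a `positional_only` member later would only need a third list in `init_parameter_order`, with no new parameter on `field`.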

On 3/17/2021 8:07 AM, Laurie O wrote:
I think kw_only=True is clearer as to the intent when reading the code. And it doesn't preclude adding pos_only=True. It's not hard to add parameters to field(). I also think it's extremely unlikely we'll ever add positional arguments, or any other kind of argument (not sure what that would be, but anything's possible in the future!).

I sort of like having "init" in the parameter name, but then again there are no other member functions added by dataclasses that deal with fields as parameters, so there's really nothing else that kw_only could refer to. I guess to be more explicit it would be "init_arg_type".

And there's some value in doing what attrs has done, although I'll grant that I've veered from what attrs has done before (and even did so in this very proposal with dataclasses.KW_ONLY). But they have a lot of experience with it, and I've looked for and not seen any complaints or confusion.

As to using actual Enums, it's been avoided because of the expense of importing the enum module. Although I'll admit I haven't seen timings of that recently, so maybe that's not a valid argument any more.

Something to think about: how would positional arguments work with pattern matching?

Eric

I've created https://bugs.python.org/issue43532 for this. PR is https://github.com/python/cpython/pull/24909. I still need to add a few more tests and update the documentation. Eric On 3/15/2021 5:18 PM, Eric V. Smith wrote:
-- Eric V. Smith
participants (8)
- Abdulla Al Kathiri
- Brendan Barnwell
- Eric V. Smith
- Ethan Furman
- Guido van Rossum
- Laurie O
- Ricky Teachey
- Simão Afonso @ Powertools Tech