Is static typing still optional?

The make_dataclass() factory function in the dataclasses module currently requires type declarations. It would be nice if the type declarations were optional.

With typing (currently works):

    Point = NamedTuple('Point', [('x', float), ('y', float), ('z', float)])
    Point = make_dataclass('Point', [('x', float), ('y', float), ('z', float)])

Without typing (only the first currently works):

    Point = namedtuple('Point', ['x', 'y', 'z'])      # underlying store is a tuple
    Point = make_dataclass('Point', ['x', 'y', 'z'])  # underlying store is an instance dict

This proposal would make it easy to cleanly switch between the immutable tuple-based container and the instance-dict-based, optionally frozen container. The proposal would make it possible for instructors to teach dataclasses without having to teach typing as a prerequisite. And it would make dataclasses usable for projects that have elected not to use static typing.

Raymond
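As a rough illustration of the two typed spellings listed above, the sketch below builds the same three-field record both ways and compares the results. It assumes the dataclasses module as it later shipped in Python 3.7; the names PointT and PointD are made up here only to keep the two classes apart.

```python
from dataclasses import make_dataclass, fields
from typing import NamedTuple

# Hypothetical names: PointT is tuple-backed, PointD is instance-dict-backed.
PointT = NamedTuple('PointT', [('x', float), ('y', float), ('z', float)])
PointD = make_dataclass('PointD', [('x', float), ('y', float), ('z', float)])

p_tuple = PointT(1.0, 2.0, 3.0)
p_data = PointD(1.0, 2.0, 3.0)

# Both expose the same field names in declaration order.
tuple_names = list(PointT._fields)
dataclass_names = [f.name for f in fields(PointD)]

# The dataclass instance is mutable unless frozen=True is requested;
# the tuple-based instance rejects attribute assignment.
p_data.x = 10.0
try:
    p_tuple.x = 10.0
    tuple_is_mutable = True
except AttributeError:
    tuple_is_mutable = False
```

The types passed in are recorded but not enforced at runtime in either form.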

Thanks Eric and Ivan. You're both very responsive. I appreciate the enormous efforts you're putting in to getting this right.

I suggest two other fix-ups:

1) Let make_dataclass() pass through keyword arguments to _process_class(), so that this will work:

    Point = make_dataclass('Point', ['x', 'y', 'z'], order=True)

2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)

Raymond
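For reference, a minimal sketch of what the first fix-up would allow, assuming keyword arguments such as order are forwarded the way they are in the dataclasses module released with Python 3.7. Typed fields are used so the example does not depend on the untyped form.

```python
from dataclasses import make_dataclass

# order=True asks for __lt__, __le__, __gt__ and __ge__ to be generated,
# comparing instances as if they were tuples of their fields.
Point = make_dataclass(
    'Point',
    [('x', float), ('y', float), ('z', float)],
    order=True,
)

low = Point(1.0, 2.0, 3.0)
high = Point(1.0, 2.0, 4.0)
points_sorted = sorted([high, low])
```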

On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
Thank you for your feedback. It's very helpful. I see a couple of options:

1a: Use a default type annotation if one is not supplied. typing.Any would presumably make the most sense.
1b: Use None if no type is supplied.
2: Rework the code to not require annotations at all.

I think I'd prefer 1a, since it's easy. However, typing is not currently imported by dataclasses.py. There's an argument that it really needs to be, and I should just bite the bullet and live with it. Possibly with Ivan's PEP 560 work my concern on importing typing goes away.

1b would be easy, but I don't like using non-types for annotations.

2 would be okay, but then that would be the only time __annotations__ wouldn't be set on a dataclass.
Agreed.
2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)
It's sufficiently confusing that I need to sit down when I have some free time and noodle this through. But it's still on my radar. Eric.

I see a couple of options: 1a: Use a default type annotation if one is not supplied. typing.Any would presumably make the most sense. 1b: Use None if no type is supplied. 2: Rework the code to not require annotations at all. I think I'd prefer 1a, since it's easy.

2) would be great :-) I find this bit of “typing creep” makes me nervous -- Typing should Never be required! I understand that the intent here is that the user could ignore typing and have it all still work. But I’d rather it was not still there under the hood. Just because a standardized way to do something is included in core Python doesn’t mean the standard library has to use it.

However, typing is not currently imported by dataclasses.py.

And there you have an actual reason besides my uneasiness :-)

- CHB

I'm not sure what Mail User Agent each of you is using, but it is quite impossible (here) to make out who is saying what in your latest messages. See plain text rendering here: https://mail.python.org/pipermail/python-dev/2017-December/151274.html Regards Antoine. On Fri, 15 Dec 2017 10:56:28 +0000 Steve Holden <steve@holdenweb.com> wrote:

On 12/15/2017 5:56 AM, Steve Holden wrote:
[Agreed with Antoine on the MUA and quoting being confusing.] The only reason typing isn't imported is performance. I hope that once PEP 560 is complete this will no longer be an issue, and dataclasses will always import typing. But of course typing will still not be needed for most uses of @dataclass or make_dataclass(). This is explained in the PEP. Eric.

Sorry about the email mangling -- I do a lot of my listserve work on the bus on an iPhone, with the built-in mail client -- and it REALLY sucks for doing interspersed email replying -- highly encouraging the dreaded top posting...

But anyway, I think both Steve and I were expressing concerns about "Typing Creep". Typing should always be optional in Python, and while this PEP does keep it optional, Steve's point was that the code in the standard library serves not only as a library, but as examples of how to write "robust" python code.

The rest of this note is me -- I'm not pretending to speak for Steve.

Reading the PEP, this text makes me uneasy:

"A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation."

And if I understand the rest of the PEP, while typing itself is optional, the use of type annotation is not -- it is exactly what's being used to generate the fields the user wants.

And the examples are all using typing -- granted, primarily the built-in types, but still:

    @dataclass
    class C:
        a: int       # 'a' has no default value
        b: int = 0   # assign a default value for 'b'

This sure LOOKS like typing is required. It also makes me nervous because, as I understand it, the types aren't actually used in the implementation (presumably they would be by mypy and the like?). So I think for folks that aren't using typing and a type checker in their development process, it would be pretty confusing what this means and what it actually does. Particularly folks that are coming from a background of a statically typed language.

Then I see:

"""
Field objects describe each defined field.
...
Its documented attributes are:

name: The name of the field.
type: The type of the field.
...
"""

So again, typing looks to be pretty baked in to the whole concept.

and then:

"""
One place where dataclass actually inspects the type of a field is to determine if a field is a class variable as defined in PEP 526.
"""

and

"""
The other place where dataclass inspects a type annotation is to determine if a field is an init-only variable. It does this by seeing if the type of a field is of type dataclasses.InitVar.
"""

"""
Data Classes will raise a TypeError if it detects a default parameter of type list, dict, or set.
"""

So: it seems that type hinting, while not required to use Data Classes, is very much baked into the implementation and examples.

As I said -- this makes me uneasy -- it's a very big step that essentially promotes the type hinting to a new place in Python -- you will not be able to use a standard library class without at least a little thought about types and typing.

I note this:

"""
This discussion started on python-ideas [9] and was moved to a GitHub repo [10] for further discussion. As part of this discussion, we made the decision to use PEP 526 syntax to drive the discovery of fields.
"""

I confess I only vaguely followed that discussion -- in fact, mostly I thought that the concept of Data Classes was a good one, and was glad to see SOMETHING get implemented, and didn't think I had much to contribute to the details of how it was done. So these issues may have already been raised and considered, so carry on.

But: NOTE: from PEP 526:

"Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention."

The Data Classes implementation is not making it mandatory by any means, but it is making it a more "standard" part of the language that cannot simply be ignored anymore. And it seems some features of dataclasses can only be accessed via actual typing, in addition to the requirement of type annotations.
If nothing else, the documentation should make it very clear that the typing aspects of Data Classes are indeed optional, and preferably give some untyped examples, something like:

    @dataclass
    class C:
        a: None      # 'a' has no default value
        b: None = 0  # assign a default value for 'b'

If, in fact, that would be the way to do it.

-Chris

On Fri, Dec 15, 2017 at 3:22 AM, Eric V. Smith <eric@trueblade.com> wrote:
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

One other note (see my last message). The PEP should include a summary of the discussion of the decision to use the type annotation syntax vs other options. I just looked through all the GitHub issues and found nothing, and started to look at the python-ideas list archive and got overwhelmed. So having that justification in the PEP would be good.

-CHB

On Fri, Dec 15, 2017 at 12:07 PM, Chris Barker <chris.barker@noaa.gov> wrote:

On 15 December 2017 at 20:07, Chris Barker <chris.barker@noaa.gov> wrote:
I actually don't have any problem with this. It looks natural to me, reads perfectly fine, and is a far better way of defining fields than many of the other approaches that I've seen in the past (that don't use annotations).

The one thing I would find surprising is that the actual type used is ignored.

    @dataclass
    class C:
        a: str = 0

AIUI this is valid, but it looks weird to me. There's an easy answer, though - just don't do that.
Well, being able to see the type the class author intended is a feature. I don't know I'd consider that as meaning typing is "baked in". It's useful but ignorable data.
Those are somewhat more explicit cases of directly using type annotations as declarations. But what alternative would you propose? It still seems fine to me.
Doesn't that mean that

    @dataclass
    class C:
        a: int = []

raises an error? The problem here is the same as that of mutable function default parameters - we don't want every instance of C to share a single list object as their default value for a. It's got nothing to do with the annotation (that's why I used the deliberately-inconsistent annotation of int here). I'm a strong +1 on making this an error, as it's likely to be an easy mistake to make, and quite hard to debug.
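A small sketch of the behaviour described here. The PEP text quoted earlier in the thread says TypeError; the exact exception type is treated as an assumption, so the example catches both TypeError and ValueError instead of asserting which one a given release raises. The class names are hypothetical; default_factory is the field() option for getting a fresh object per instance.

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class Shared:
        a: int = []   # rejected because of the default value, not the annotation
    rejected = False
except (TypeError, ValueError):
    rejected = True

@dataclass
class PerInstance:
    # default_factory is called once per instance, so nothing is shared.
    a: list = field(default_factory=list)

first, second = PerInstance(), PerInstance()
first.a.append(1)
```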
So: it seems that type hinting, while not required to use Data Classes, is very much baked into the implementation and examples.
Annotations and the annotation syntax are fundamental to the design. But that's core Python syntax. But I wouldn't describe types as being that significant to the design, it's more "if you supply them we'll make use of them". Don't forget, function parameter annotations were around long before typing. Variable annotations weren't, but they could have been - it's just that typing exposed a use case for them. Data classes could just as easily have been the motivating use case for PEP 526.
I will say that while I don't use typing or mypy at all in my code, I don't have any particular dislike of the idea of typing, or the syntax for declaring annotations. So I find it hard to understand your concerns here. My personal uneasiness is actually somewhat the opposite - I find it disconcerting that if I annotate a variable/parameter as having type int, nothing stops me assigning a string to it. But that's *precisely* what typing being optional means, so while it seems odd to my static typing instincts, it's entirely within the spirit of not forcing typing onto Python.
This does seem like a reasonable option to note. Something along the lines of "If you don't use type annotations in your code, and you want to avoid introducing them, using None as a placeholder for the type is sufficient". However, I suspect that using None as a "I don't really want to assign a type" value might well confuse mypy - I don't know. But using typing.Any (which is what mypy would expect) clearly doesn't meet the "avoid typing totally" requirement here.

Maybe (mis-)using string annotations, like

    @dataclass
    class C:
        a: 'variable'      # 'a' has no default value
        b: 'variable' = 0  # assign a default value for 'b'

would work? But honestly, this feels like jumping through hoops purely to avoid using int "because it means I've bought into the idea of typing".

I guess if you're that adamant about never wanting to use typing in your code, data classes would make you uncomfortable. But conversely, I don't see the value in making data classes clumsier than they need to be out of a purist principle to not use a perfectly valid Python syntax.

Paul
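To make the "useful but ignorable data" point concrete, here is a minimal sketch, assuming the dataclasses module as shipped in Python 3.7. The class name Mismatch is made up for illustration. The annotation is recorded on each field but never checked against the value.

```python
from dataclasses import dataclass, fields

@dataclass
class Mismatch:
    a: str = 0     # annotation and default disagree; no error is raised
    c: None = 0    # None works as a placeholder annotation at runtime

m = Mismatch()
m.a = 3.5          # still no runtime type check on assignment

# The annotation is stored verbatim on each Field object.
recorded = {f.name: f.type for f in fields(Mismatch)}
```

Whether a static checker accepts None as a placeholder is a separate question that this sketch does not answer.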

On Sun, Dec 17, 2017 at 8:22 AM, Guido van Rossum <guido@python.org> wrote:
Mypy definitely won't like that use of annotation, but documentation systems might. For example, in a hover tooltip in an IDE/editor, it's probably more helpful to see the descriptive message than "int" or "float" for the attribute.

What about data that isn't built-in scalars? Does this look right to people (and will mypy be happy with it)?

    @dataclass
    class C:
        a:numpy.ndarray = numpy.random.random((3,3))
        b:MyCustomClass = MyCustomClass("foo", 37.2, 1+2j)

I don't think those look terrible, but I think this looks better:

    @dataclass
    class C:
        a:Infer = np.random.random((3,3))
        b:Infer = MyCustomClass("foo", 37.2, 1+2j)

Where the name 'Infer' (or some other spelling) was a name defined in the `dataclasses` module. In this case, I don't want to use `typing.Any` since I really do want "the type of thing the default value has."

-- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

Good, bad, or neutral, this discussion makes my point: using typing annotation as a necessary part of a standard library module is injecting typing into "ordinary" python in a new way. It is no longer going to appear to be completely optional, and only of concern to those that choose to use it (and mypy or similar).

And I do think it is really bad UI to have something like:

    @dataclass
    class C:
        a: int = 1
        b: float = 1.0

be the recommended way (shown in all the examples, and really almost the only way) to define a dataclass, when the type will in fact be completely ignored by the implementation. Newbies are going to be confused by this -- they really are.

Anyway, clearly I personally don't think this is a very good idea, but I see that annotations are a natural and easy way to express the fields without adding any new syntax. But most importantly, I don't think this should become standard without consideration of the impact and a deliberate decision to do so.

A note: I don't know who everyone is that was engaged in the GitHub discussion working out the details, but at least a few core folks are very engaged in the introduction of type hinting to Python in general -- so I think a certain perspective may have been over-represented.

Are there other options?? Plain old:

    @dataclass
    class C:
        a = 1
        b = 1.0

would work, though then there would be no way to express fields without defaults:

    @dataclass
    class C:
        a = 1
        b = None

almost -- but then is there "no default", or is the default None?

Would it be impossible to use the annotation syntax, but with the type optional:

    @dataclass
    class C:
        a : = 1  # field with default value
        b :      # field with no default

This is not legal python now, but are there barriers to it being legal other than not wanting to make yet more changes (i.e. hard/impossible to unambiguously parse, etc.)?

Maybe this can all be addressed by more "Untyped" examples in the docs.

-CHB

On Sun, Dec 17, 2017 at 8:54 AM, David Mertz <mertz@gnosis.cx> wrote:

@David
What you propose as `Infer` annotation was proposed some time ago (not only for dataclasses, there are other use cases). The discussion is here https://github.com/python/typing/issues/276

@Chris
People are still allowed not to use dataclasses if they really don't like type hints :-) Seriously however, annotations are just syntax. In this sense PEP 526 is more like PEP 3107, and less like PEP 484. People are still free to write:

    @dataclass
    class C:
        x: "first coordinate"
        y: "second coordinate"
        plus: "I don't like types"

or

    @dataclass
    class C:
        x: ...
        y: ...

I don't see a big difference between hypothesis (the testing lib) using annotations for its own purposes and the situation with dataclasses. It is true that the syntax was chosen to simplify support in static type checkers (partially because users were often asking for such a feature), but not more than this. If you don't use type checkers, there is no problem in using one of the above forms.

If you have ideas about how to improve the dataclass docs, this can be discussed in the issue https://bugs.python.org/issue32216
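Both spellings can be checked directly. A minimal sketch, assuming the dataclasses module as shipped in Python 3.7; the class names Described and Untyped are hypothetical, and a default is added to one field in each class only to show that defaults still work.

```python
from dataclasses import dataclass

@dataclass
class Described:
    # Arbitrary strings as annotations: they only mark the names as fields.
    x: "first coordinate"
    y: "second coordinate"
    plus: "I don't like types" = 0

@dataclass
class Untyped:
    # The Ellipsis literal as a "no type given" placeholder.
    x: ...
    y: ... = 0

d = Described(1, 2)
u = Untyped(1)
```

Static type checkers are not expected to accept either form; the sketch only covers runtime behaviour.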
... the type will in fact be completely ignored by the implementation. Newbies are going to be confused by this -- they really are.
This is not different from

    def f(x: int):
        pass

    f("What")  # OK

that exists starting from Python 3.0. Although I agree this is confusing, the way forward could be just explaining this better in the docs.

If you want my personal opinion about the current situation about type hints _in general_, then I can say that I have seen many cases where people use type hints where they are not needed (for example in 10 line scripts or in highly polymorphic functions), so I agree that some community style guidance (like PEP 8) may be helpful. I had started such a project at the end of last year (it was called pep-555, but I didn't have time to work on this and this number is already taken).

-- Ivan

I'm really surprised no one seems to get my point here. TL;DR: My point is that having type annotation syntax required for something in the stdlib is a significant step toward "normalizing" type hinting in Python. Whether that's a good idea or not is a judgement call, but it IS a big step. @Chris
Well, yes, of course, but this is not like PEP 3107, as it introduces a requirement for annotations (maybe not *type* annotations per se) in the std lib. Again, that may be the best way to go -- but it should be done deliberately. @dataclass
Ah! I had no idea you could use ellipses to indicate no type. That actually helps a lot. We really should have that prominent in the docs. And in the dataclass docs, not just the type hinting docs -- again, people will want to use these that may not have any interest in nor prior knowledge of type hints.
The big difference is that hypothesis is not in the standard library. Also, I didn't know about hypothesis until just now, but their very first example in the quick start does not use annotation syntax, so it's not as baked in as it is with dataclasses.
If you have ideas about how to improve the dataclass docs, this can be discussed in the issue https://bugs.python.org/issue32216
I'll try to find time to contribute there -- though maybe better to have the doc draft in GitHub?
Again the difference is that EVERY introduction to defining python functions doesn't use the type hint. And even more to the point, you CAN define a function without any annotations. But frankly, I think as type hinting becomes more common, we're going to see a lot of confusion :-( If you want my personal opinion about the current situation about type
It's going to get worse before it gets better :-(

    @dataclass
    class C:
        x = field()
that does require that `field` be imported, so not as nice. I kinda like the ellipses better. but good to have a way.

-Chris

On Dec 18, 2017, at 21:41, Chris Barker <chris.barker@noaa.gov> wrote:
TL;DR: My point is that having type annotation syntax required for something in the stdlib is a significant step toward "normalizing" type hinting in Python. Whether that's a good idea or not is a judgement call, but it IS a big step.
This is something we’re discussing for importlib.resources: https://bugs.python.org/issue32248#msg308495 In the standalone version, we’re using annotations for the Python 3 bits. It would make our lives easier if we kept them for the stdlib version (applying diffs and keeping them in sync would be easier). Brett says in the follow up: "As for the type hints, I thought it was lifted such that new code could include it but we wouldn't be taking PRs to add them to pre-existing code?” So, what’s the deal? -Barry

On 12/18/2017 9:41 PM, Chris Barker wrote:
I get your point, I'm just not concerned about it. I also don't think it's surprising that you can put misleading information (including non-types) in type annotations. All of the documentation and discussions are quite clear that type information is ignored at runtime.

It _is_ true that @dataclass does actually inspect the type at runtime, but those uses are very rare. And if you do need them, the actual type T used by ClassVar[T] and InitVar[T] is still ignored.

Data Classes is also not the first use of type annotations in the stdlib: https://docs.python.org/3/library/typing.html#typing.NamedTuple

When I say that "typing is optional", I mean importing the typing module, not that annotations are optional.

Eric.
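The two runtime inspections mentioned here can be seen in a short sketch, assuming the dataclasses module as shipped in Python 3.7. The class Account and its attribute names are invented for illustration. Only the ClassVar and InitVar wrappers matter; the parameter inside the brackets is not checked.

```python
from dataclasses import dataclass, fields, InitVar
from typing import ClassVar

@dataclass
class Account:
    # ClassVar: left as a plain class attribute, not turned into a field.
    # The int parameter is ignored, so a str value is accepted.
    registry_size: ClassVar[int] = "not an int"
    # InitVar: accepted by __init__ and handed to __post_init__,
    # but not stored as a field on the instance.
    bonus: InitVar[int]
    balance: int = 0

    def __post_init__(self, bonus):
        self.balance += bonus

acct = Account(bonus=5, balance=10)
field_names = [f.name for f in fields(Account)]
```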

On 19 December 2017 at 07:49, Eric V. Smith <eric@trueblade.com> wrote:
Data Classes is also not the first use of type annotations in the stdlib: https://docs.python.org/3/library/typing.html#typing.NamedTuple
Also, the fact that no-one raised this issue during the whole time the PEP was being discussed (at least as far as I recollect) and that Guido (who of all of us should be most aware of what is and isn't acceptable use of annotations in the stdlib) approved the PEP, suggests to me that this isn't that big a deal. The only thing that has surprised me in this discussion is that the actual type used in the annotation makes no difference. And once someone reminded me that types are never enforced at runtime (you can call f(x: int) with f('haha')) that seemed fine. Paul

On Tue, Dec 19, 2017 at 10:53 AM, Paul Moore <p.f.moore@gmail.com> wrote:
If anything, this makes things more difficult for the learner. The fact that annotations are formally undefined as to anything but syntax is sensible but can be misleading (as the example above clearly shows). In the typing module it's logical to see annotations, I guess. But I really hope they aren't sprinkled around willy-nilly. Sooner or later there will be significant demand for annotated libraries, even though CPython will perform exactly as it does with non-annotated code. I can see the value of annotations in other environments and for different purposes, but it would be a pity if this were to unnecessarily complicate the stdlib. regards Steve

On 2017-12-19 02:53, Paul Moore wrote:
Hi, I asked about this in the first posting of the PEP and agree with Chris.

https://mail.python.org/pipermail/python-dev/2017-September/149406.html

There is definitely a passive bias towards using types with dataclasses in that Eric (the author) doesn't appear to want an example without them in the pep/docs. It seems that typing proponents are sufficiently enamored with them that they can't imagine anyone else feeling differently, haha.

Personally, I wouldn't use types with Python unless I was leading a large project with a large team of folks with different levels of experience. That's where types shine, and those folks might be better served by Java or Kotlin.

So we're hearing that "types are optional" while the docs may imply the opposite. Liked the ellipsis since None is often used as a sentinel value and an extra import is a drag.

-Mike

On 12/20/2017 6:57 PM, Mike Miller wrote:
I'm not sure what such an example would look like. Do you mean without annotations? Or do you mean without specifying the "correct" type, like:

    @dataclass
    class C:
        x: int = 'hello world'

? Or something else? Can you provide an example of what you'd like to see?
It seems that typing proponents are sufficiently enamored with them that they can't imagine anyone else feeling differently, haha.
I've never used typing or mypy, so you're not talking about me. I do like the conciseness that annotations bring to dataclasses, though. If you buy that (and you might not), then I don't see the point of not using a correct type annotation. Eric.

On 12/20/2017 8:13 PM, Eric V. Smith wrote:
Re-reading my post you referenced, is it just an example using typing.Any? I'm okay with that in the docs, I just didn't want to focus on it in the PEP. I want the PEP to only have the one reference to typing, for typing.ClassVar. I figure the people reading the PEP can extrapolate to all of the possible uses for annotations that they don't need to see a typing.Any example. Eric.

On Wed, Dec 20, 2017 at 5:29 PM, Eric V. Smith <eric@trueblade.com> wrote:
IIUC, there is no way to make a dataclass without annotations, yes? That is, using annotations to determine the fields is the one and only way the decorator works. So it's impossible to give an example without annotations, yes?
It may be a good idea to have an example like that in the docs (but probably not the PEP) to make it clear that the type is not used in any way at run time. But I don't think that anyone is suggesting that would be a recommended practice.

I suggest that it be clear in the docs, and ideally in the PEP, that the dataclass decorator is using the *annotation* syntax, and that the only relevant part it uses is that an annotation exists, but the value of the annotation is essentially (completely?) ignored.

So we should have examples like:

    @dataclass
    class C:
        a: ...      # field with no default
        b: ... = 0  # field with a default value

Then maybe:

    @dataclass
    class C:
        a: "the a parameter"                     # field with no default
        b: "another, different parameter" = 0.0  # field with a default

Then the docs can go on to say that if the user wants to specify a type for use with a static type checking pre-processor, they can do it like so:

    @dataclass
    class C:
        a: int          # integer field with no default
        b: float = 0.0  # float field with a default

And the types will be recognized by type checkers such as mypy.

And I think the non-typed examples should go first in the docs.

This is completely analogous to how all the other parts of python are taught. Would anyone suggest that the very first example of a function definition that a newbie sees would be:

    def func(a: int, b:float = 0.0):
        body_of_function

Then, _maybe_ way down on the page, you mention that oh, by the way, those types are completely ignored by Python. And not even give any examples without types?
Re-reading my post you referenced, is it just an example using typing.Any?
I actually think that is exactly the wrong point -- typing.Any is still using type hinting -- it's an explicit way to say, "any type will do", but it's only relevant if you are using a type checker. We really need examples for folks that don't know or care about type hinting at all. typing.Any is for use by people that are explicitly adding type hinting, and should be discussed in type hinting documentation. people reading the PEP can
no they don't, but they DO need to see examples without type hints at all.

-Chris

On 12/21/2017 1:46 AM, Chris Barker wrote:
Correct. Well, you will be able to use make_dataclass() without type information after I fix bpo-32278, but most users won't be using that.
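On a release that includes the fix mentioned here (assumed to be Python 3.7 and later), bare field names are accepted by make_dataclass(). A minimal sketch; what gets stored as the placeholder annotation is deliberately not asserted, since that is an implementation detail.

```python
from dataclasses import make_dataclass, fields, asdict

# Field names only: no types supplied by the caller.
Point = make_dataclass('Point', ['x', 'y', 'z'])

p = Point(1, 2, 3)
names = [f.name for f in fields(Point)]
as_mapping = asdict(p)
```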
I think the PEP is very clear about this: "The dataclass decorator examines the class to find fields. A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation. With two exceptions described below, none of the Data Class machinery examines the type specified in the annotation." I agree the docs should also be clear about this.
I'll leave this for others to decide. The docs, and how approachable they are to various audiences, isn't my area of expertise.
I'm not opposed to this in the documentation. Maybe we should decide on a convention on what to use to convey "don't care". I've seen typing.Any, None, ellipsis, strings, etc. all used. Eric.

On 12/21/2017 4:22 AM, Eric V. Smith wrote:
On 12/21/2017 1:46 AM, Chris Barker wrote:
This seems clear enough. It could come after describing what a dataclass *is*.
I agree the docs should also be clear about this.
Modulo some bike-shedding, the above seems pretty good to me.
I'll leave this for others to decide. The docs, and how approachable they are to various audiences, isn't my area of expertise.
-- Terry Jan Reedy

On 12/21/17 6:25 AM, Sven R. Kunze wrote:
Because you can't know the order that x and y are defined in this example:

    class C:
        x: int
        y = 0

'x' is not in C.__dict__, and 'y' is not in C.__annotations__.

Someone will suggest a metaclass, but that has its own problems. Mainly, interfering with other metaclasses.

Eric.
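The ordering problem can be observed on a plain class, without @dataclass involved. In this sketch the dunder entries that Python adds to the class namespace are filtered out; the helper variable names are made up.

```python
class C:
    x: int
    y = 0

# Names that were annotated, in annotation order.
annotated = list(C.__annotations__)
# Names that were assigned in the class body (ignoring dunder entries).
assigned = [name for name in vars(C) if not name.startswith('__')]
```

Each mapping preserves its own insertion order, but neither records whether x came before or after y in the class body.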

On 12/21/2017 9:23 AM, Eric V. Smith wrote:
I think the understanding problem with this feature arises from two factors: using annotations to define possibly un-initialized slots is non-obvious; a new use of annotations for something other than static typing is a bit of a reversal of the recent pronouncement 'annotations should only be used for static typing'. Therefore, getting the permanent doc 'right' is important.

The following naively plausible alternative does not work and cannot sensibly be made to work because the bare 'x' in the class scope, as opposed to a similar error within a method, causes NameError before the class is created.

    @dataclass
    class C:
        x
        y = 0

I think the doc should explicitly say that uninitialized fields require annotation with something (anything, not necessarily a type) simply to avoid NameError during class creation. It may not be obvious to some readers why x:'anything' does not also raise NameError, but that was a different PEP, and the dataclass doc could here link to wherever name:annotation in bodies is explained.

-- Terry Jan Reedy
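A short sketch of the NameError described here, next to the annotated form that avoids it. The names Broken, Works, and undeclared_field are hypothetical, and the Ellipsis literal stands in for "anything" as the annotation.

```python
from dataclasses import dataclass

try:
    @dataclass
    class Broken:
        undeclared_field   # bare name: evaluated at once, so NameError
        y = 0
    error = None
except NameError as exc:
    error = exc

@dataclass
class Works:
    undeclared_field: ...  # annotation only: the name itself is not looked up
    y: ... = 0

w = Works(1)
```

The error is raised while the class body runs, so the decorator never gets a chance to see the class.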

On Thu, Dec 21, 2017 at 7:55 PM, Terry Reedy <tjreedy@udel.edu> wrote:
Solely because, annotations being optional, the interpreter is not allowed to infer from its presence that an annotated name should be allocated an entry in __dict__, and clearly the value associated with it would be problematical. I think the understanding problem with this feature arises from two
Indeed. So annotations are optional, except where they aren't?
This contortion is why I feel a better solution would be desirable. Alas I do not have one to hand. regards Steve

On Thu, Dec 21, 2017 at 11:55 AM, Terry Reedy <tjreedy@udel.edu> wrote: I think the understanding problem with this feature arises from two
you know, that may be where part of my confusion came from -- all the talk lately has been about "type hints" and "type annotations" -- the idea of "arbitrary annotations" has been lost.
Therefore, getting the permanent doc 'right' is important.
yup.
would this be possible?

    @dataclass
    class C:
        x:
        y: = 0

That is -- the colon indicates an annotation, but in this case, it's a "nothing" annotation. It's a syntax error now, but would it be possible to change that? Or would the parsing be ambiguous? particularly in other contexts.

of course, then we'd need something to store as a "nothing" annotation -- empty string? None? (but None might mean something) create yet another type for "nothing_annotation"

Hmm, I may have talked myself out of it....

-CHB

On Thu, Dec 21, 2017 at 3:10 PM MRAB <python@mrabarnett.plus.com> wrote:
pass does not currently parse in that context. Otherwise I was thinking the same thing. But we already have ... which does - so I'd suggest that for people who are averse to importing anything from typing and using the also quite readable Any. (ie: document this as the expected practice with both having the same meaning)
While I consider the annotation to be a good feature of data classes, it seems worth documenting that people not running a type analyzer should avoid declaring a type. A worse thing than no-type being specified is a wrong type being specified. That appearing in a library will break people who need their code to pass the analyzer, and pytype, mypy, et al. could be forced to implement a typeshed.pypi of sorts containing blacklists of known bad annotations in public libraries and/or actually correct type specification overrides for them.
As for problems with order, if we were to accept
    @dataclass
    class Spam:
        beans = True
        ham: bool
style instead, would it be objectionable to require keyword arguments only for dataclass __init__ methods? That'd get rid of the need to care about order. (but would annoy people with small 2-3 element data classes... so I'm assuming this idea is already rejected) -gps

On Thu, Dec 21, 2017 at 3:36 PM, Gregory P. Smith <greg@krypto.org> wrote:
I don't think they do, actually - I haven't been following the typing discussions, but someone in this thread said that ... means "use the type of the default" or something like that.
+1 !
and the wrong type could be very common -- folks using "int", when float would do just fine, or "list" when any iterable would do, the list goes on and on. Typing is actually pretty complex in Python -- it's hard to get right, and if you aren't actually running a type checker, you'd never know. One challenge here is that annotations, per se, aren't only for typing. But it would be nice if a type checker could see whatever "non-type" is recommended for dataclasses as "type not specified". Does an ellipsis spell that? or None? or anything that doesn't have to be imported from typing :-) As for problems with order, if we were to accept
wouldn't that make the "ham: bool" legal -- i.e. no default? -CHB

(subject for this sub-thread updated) On Thu, Dec 21, 2017 at 4:08 PM Chris Barker <chris.barker@noaa.gov> wrote:
indeed, they may not. though if that is the definition, is it reasonable to say that type analyzers recognize the potential recursive meaning when the _default_ is ... and treat that as Any? another option that crossed my mind was "a: 10" without using =. but that really abuses __annotations__ by sticking the default value in there, which the @dataclass decorator would presumably immediately need to undo and fix up before returning the class. but I don't find assigning a value without an = sign to be pythonic so please let's not do that! :)
Yeah, that is true. int vs float vs Number, etc. It suggests we shouldn't worry about this problem at all for the pep 557 dataclasses implementation. Type analyzers by that definition are going to need to deal with incorrect annotations in data classes as a result no matter what, so they'll deal with that regardless of how we say dataclasses should work. -gps

On 2017-12-22 00:19, Gregory P. Smith wrote:
If you allowed "a: 10" (an int value), then you might also allow "a: 'foo'" (a string value), but wouldn't that be interpreted as a type called "foo"? If you can't have a string value, then you shouldn't have an int value either. [snip]

On 12/21/2017 7:55 PM, MRAB wrote:
As far as dataclasses are concerned, both of these are allowed, and since neither is ClassVar or InitVar, they're ignored. Type checkers would object to the int, and I assume also the string unless there was a type foo defined. See https://www.python.org/dev/peps/pep-0484/#the-problem-of-forward-declaration... and typing.get_type_hints(). It's a bug that dataclasses currently does not inspect string annotations to see if they're actually ClassVar or InitVar declarations. PEP 563 makes it critical (and not merely important) to look at the string annotations. Whether that involves typing.get_type_hints(), I haven't yet decided. I'm waiting for PEPs 563 and 560 to be implemented before taking another look at it. Eric.
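Eric's point that the decorator accepts such annotations without inspecting them can be illustrated with a small sketch. The class name `Odd` is made up for this example, and the behaviour shown is that of the released `dataclasses` module rather than the pre-release code under discussion.

```python
from dataclasses import dataclass, fields

@dataclass
class Odd:
    a: 10        # an int used as the annotation, not as a default value
    b: 'foo'     # a string annotation; type checkers read it as a forward reference

# The decorator stores each annotation unchanged and never validates it.
stored = {f.name: f.type for f in fields(Odd)}

# Neither field has a default, so both are required by the generated __init__.
obj = Odd(a="not ten", b=3.5)
```

Nothing at runtime ties the value of `a` to the annotation `10`; only a static type checker would complain.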

On 21 December 2017 at 11:22, Terry Reedy <tjreedy@udel.edu> wrote:
For me, the three options for "don't care" have a bit different meaning: * typing.Any: this class is supposed to be used with static type checkers, but this field is too dynamic * ... (ellipsis): this class may or may not be used with static type checkers, use the inferred type in the latter case * "field docstring": this class should not be used with static type checkers Assuming this, the second option would be the "real" "don't care". If this makes sense, then we can go the way proposed in https://github.com/python/typing/issues/276 and make ellipsis semantics "official" in PEP 484. (pending Guido's approval) -- Ivan
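As a rough illustration of the three spellings Ivan lists, the sketch below uses each one on a field of the same class (the class name `Sample` and the field names are invented for this example). At runtime the decorator treats all three identically; the differences Ivan describes exist only for static type checkers.

```python
from dataclasses import dataclass, fields
from typing import Any

@dataclass
class Sample:
    a: Any = 1                   # typing.Any: "too dynamic to type"
    b: ... = 2                   # the Ellipsis literal: "infer it if you can"
    c: "free-form note" = 3      # an arbitrary string used as documentation

names = [f.name for f in fields(Sample)]
annotations = [f.type for f in fields(Sample)]
```

All three fields are discovered in declaration order, and `Sample()` builds an instance from the defaults.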

On Thu, Dec 21, 2017, 03:37 Ivan Levkivskyi, <levkivskyi@gmail.com> wrote:
I vote for option 2 as well. I think it's worth reminding people that if they don't like the fact dataclasses (ab)use type hints for their succinct syntax that you can always use attrs instead to avoid type hints. Otherwise whichever approach we agree to from Ivan's suggestions will take care of this. As for those who feel dataclasses will force them to teach type hints and they simply don't want to, maybe we could help land protocols and then maybe you can use dataclasses as an opportunity to explicitly teach duck typing? But I think the key point I want to make is Guido chose dataclasses to support using the type hints syntax specifically over how attrs does things, so I don't see this thread trying to work around that going anywhere at this point since I haven't seen a solid alternative be proposed after all of this debating. -brett

On Fri, Dec 22, 2017 at 8:49 AM, Brett Cannon <brett@python.org> wrote:
sure -- but this doesn't really address the issue, the whole reason this is even a discussion is because dataclasses is going into the standard library. Third party packages can do whatever they want, of course. And the concern is that people (in particular newbies) will get confused / the wrong impression / other-negative-response by the (semi) use of typing in a standard library module.
As for those who feel dataclasses will force them to teach type hints and they simply don't want to, maybe we could help land protocols
Could you please clarify what this is about ???
And the PEP has been approved. So the actionable things are:
- Writing good docs
- Converging on a "recommended" way to do non-typed dataclass fields.
And that should be decided in order to write the docs (and probably should be in the PEP). -CHB

On Fri, Dec 22, 2017 at 11:40 AM Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Dec 22, 2017 at 8:49 AM, Brett Cannon <brett@python.org>
But I think the key point I want to make is Guido chose dataclasses to
My preference for this is "just use Any" for anyone not concerned about the type. But if we wanted to make it more opaque so that people need not realize that they are actually type annotations, I suggest adding an alias for Any in the dataclasses module (dataclasses.Data = typing.Any)
    from dataclasses import dataclass, Data
    @dataclass
    class Swallow:
        weight_in_oz: Data = 5
        laden: Data = False
        species: Data = SwallowSpecies.AFRICAN
the word "Data" is friendlier than "Any" in this context for people who don't need to care about the typing module. We could go further and have Data not be an alias for Any if desired (so that its repr wouldn't be confusing, not that anyone should be looking at its repr ever). -gps
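The proposal can be tried out without any change to the standard library. In the sketch below, `Data` is a local, hypothetical alias (the `dataclasses` module does not export such a name), and `SwallowSpecies` is a stand-in enum invented so that the example is self-contained.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any

# Hypothetical alias: defined here, not provided by the dataclasses module.
Data = Any

class SwallowSpecies(Enum):
    # Stand-in enum; the thread does not define it.
    AFRICAN = "african"
    EUROPEAN = "european"

@dataclass
class Swallow:
    weight_in_oz: Data = 5
    laden: Data = False
    species: Data = SwallowSpecies.AFRICAN

bird = Swallow(laden=True)
```

Because `Data` is simply `typing.Any` under another name, a type checker would accept any value for each field.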

On 22 December 2017 at 19:50, Gregory P. Smith <greg@krypto.org> wrote:
That sounds like a nice simple proposal. +1 from me. Documentation can say that variables should be annotated with "Data" to be recognised by the decorator, and if people are using type annotations an actual type can be used in place of "Data" (which acts the same as typing.Any). That seems to me to describe the feature in a suitably type-hinting-neutral way, while still making it clear how data classes interact with type annotations. Paul

On 23 Dec. 2017 9:37 am, "David Mertz" <mertz@gnosis.cx> wrote: The name Data seems very intuitive to me without suggesting type declaration as Any does (but it can still be treated as a synonym by actual type checkers) Type checkers would also be free to interpret it as "infer the type from the default value", rather than necessarily treating it as Any. I still wonder about the "fields *must* be annotated" constraint though. I can understand a constraint that the style be *consistent* (i.e. all fields as annotations, or all fields as field instances), since that's needed to determine the field order, but I don't see the problem with the "no annotations" style otherwise. Cheers, Nick.

On Sat, Dec 23, 2017 at 5:54 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
IIUC, without annotations, there is no way to set a field with no default. And supporting both approaches violates "only one way to do it" in, I think, a confusing manner -- particularly if you can't mix and match them. Also, could using class attributes without annotations make a mess when subclassing? -- no I haven't thought that out yet. -CHB
Cheers, Nick.

On 12/26/17 1:49 PM, Chris Barker wrote:
I have not been following the design of dataclasses, and maybe I'm misunderstanding the state of the work. My impression is that attrs was a thing, and lots of people loved it, so we wanted something like it in the stdlib. Data Classes is that thing, but it is a new thing being designed from scratch. There are still big questions about how it should work, but it is already a part of 3.7. Often when people propose new things, we say, "Put it on PyPI first, and let's see how people like it." Why isn't that the path for Data Classes? Why are they already part of 3.7 when we have no practical experience with them? Wouldn't it be better to let the design mature with real experience? Especially since some of the questions being asked are about how it interrelates with another large new feature with little practical use yet (typing)? --Ned.

On Tue, 26 Dec 2017 at 21:00 Ned Batchelder <ned@nedbatchelder.com> wrote:
Yes.
I wouldn't characterize it as "big questions". For some people there's a question as to how to make them work without type hints, but otherwise how they function is settled.
The short answer: "Guido said so". :) The long answer (based on my understanding, which could be wrong :) : Guido liked the idea of an attrs-like thing in the stdlib, but not attrs itself as Guido was after a different API. Eric V. Smith volunteered to work on a solution, and so Guido, Hynek, and Eric got together and discussed things at PyCon US. A design was hashed out, Eric went away and implemented it, and that led to the current solution. The only thing left is some people don't like type hints and so they don't want a stdlib module that requires them to function (there's no issue with how they relate *to* type hints, just how to make dataclasses work *without* type hints). So right now we are trying to decide what should represent the "don't care" type hint.

Brett Cannon writes:
Recently a question has been raised about the decorator overriding methods defined in the class (especially __repr__). People feel that if the class defines a method, the decorator should not override it. The current API requires passing "repr=False" to the decorator.

On Fri, Dec 29, 2017 at 7:10 AM, Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
I think this is a reasonable question, though I'm not sure how "big" it is. Note that if the *base* class defines __repr__ the decorator should still override it (unless repr=False), since there's always object.__repr__ (and same for most other dunders). We should also (like we did with most questions big and small) look at what attrs does and why. Regarding whether this should live on PyPI first, in this case that would not be helpful, since attrs is already the category killer on PyPI. So we are IMO taking the best course possible given that we want something in the stdlib but not exactly attrs. -- --Guido van Rossum (python.org/~guido)
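The distinction Guido draws between a `__repr__` inherited from a base class and one defined in the decorated class itself can be sketched as follows. The class names are illustrative, and the behaviour shown is that of the released `dataclasses` module, where a `__repr__` defined in the class body is left alone; the pre-release code discussed here may have behaved differently.

```python
from dataclasses import dataclass

class Base:
    def __repr__(self):
        return "Base repr"

@dataclass
class Child(Base):
    # Only inherits __repr__, so the decorator generates a new one.
    x: int = 0

@dataclass
class Own:
    x: int = 0

    def __repr__(self):
        # Defined in the decorated class body, so the decorator keeps it.
        return "Own repr"

child_repr = repr(Child())
own_repr = repr(Own())
```

In other words, only a method found in the class's own namespace counts as "the class defines it"; anything inherited, including `object.__repr__`, is replaced.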

On 12/29/17 1:59 PM, Guido van Rossum wrote:
It always seemed to me that the reason to recommend putting something on PyPI first wasn't so that it would climb up some kind of leaderboard, but so that people could get real-world experience with it before freezing it into the stdlib. If we think people won't start using data classes from PyPI, why do we think it's important to get into the stdlib? It still seems to me like there are open questions about how data classes should work. Getting people using it will be a good way to get the best design before our hands are tied with backward compatibility in the stdlib. What is the rush to put a new design into the stdlib? Presumably it is better than attrs (or we would have simply adopted attrs). Having data classes on PyPI will be a good way to gauge acceptance. --Ned.

On 30 December 2017 at 11:48, Ned Batchelder <ned@nedbatchelder.com> wrote:
attrs has already proved the utility of the approach, and the differences between the two (such as they are) are mostly cosmetic (attrs even already has a release out that supports the annotation based syntax). The cosmetic differences matter for educational purposes (i.e. "data classes" with "fields", vs trying to explain that "attributes", "attrs", "attr.s", and "attr.ib" are all different things), but "available by default" matters even more on that front. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 22 December 2017 at 20:55, Brett Cannon <brett@python.org> wrote:
If anyone is curious this is PEP 544. It is actually already fully supported by mypy, so that one can play with it (you will need to also install typing_extensions, where Protocol class lives until the PEP is approved). -- Ivan
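For readers unfamiliar with PEP 544, here is a minimal sketch of how a protocol expresses duck typing alongside a dataclass. It assumes a Python version in which `Protocol` has moved into `typing` (3.8 or later); at the time of this message it lived in `typing_extensions`. The names `HasArea` and `Square` are invented for the example.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable

@runtime_checkable
class HasArea(Protocol):
    """Anything with an area() method satisfies this protocol."""
    def area(self) -> float: ...

@dataclass
class Square:
    side: float

    def area(self) -> float:
        return self.side ** 2

# Square never inherits from HasArea; it matches purely by structure.
matches = isinstance(Square(2.0), HasArea)
```

The dataclass conforms without declaring anything, which is the "explicit duck typing" angle Brett alludes to.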

On Thu, Dec 21, 2017 at 6:39 AM Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
I am a little nervous about using "..." for inferred types, because it could potentially cause confusion with other uses of ellipsis in typing. Ellipsis already has a special meaning for Tuple, so an annotation like MyClass[int, ...] could mean either a tuple subclass with integer elements or a two argument generic type where the second type is inferred. Actually, it's ambiguous even for Tuple. Ellipsis could also make a lot of sense for typing multi-dimensional arrays similar to how it's used in indexing to denote "any number of dimensions." Again, the semantics for "..." might differ from "an inferred size."
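The existing meaning of the ellipsis inside `Tuple` that Stephan refers to can be seen in a short sketch (the alias names are made up; `typing.get_args` requires Python 3.8 or later).

```python
from typing import Tuple, get_args

# A tuple of any length whose elements are all ints.
IntTuple = Tuple[int, ...]

# A tuple of exactly two elements: an int followed by a str.
Pair = Tuple[int, str]

int_tuple_args = get_args(IntTuple)
pair_args = get_args(Pair)
```

Here the ellipsis already means "repeat the preceding type", which is why reusing it for "inferred type" inside a subscript would be ambiguous.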

On Fri, Dec 22, 2017 at 10:10 AM, Stephan Hoyer <shoyer@gmail.com> wrote:
On Thu, Dec 21, 2017 at 6:39 AM Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
* ... (ellipsis): this class may or may not be used with static type
checkers, use the inferred type in the latter case
Isn't that what "make ellipsis semantics "official"" means -- i.e. making it clear how they are used in typing? The core problem is that generic annotations are used in dataclasses without the "type hints" use-case. But:
1) Python is moving to make (PEP 484) type hints be THE recommended usage for annotations
2) We want the annotations in dataclasses to be "proper" PEP 484 type hints if they are there.
The challenge is:
- Annotations require a value.
- Any value used might be interpreted by a static type checker.
So we need a way to spell "no type specified" that will not be mis-interpreted by type checkers, and is in the built in namespace, and will seem natural to users with no knowledge or interest in static typing. The ellipsis is tempting, because it's a literal that doesn't have any other obvious meaning in this context. But if it has an incompatible meaning in PEP 484, then we're stuck. Is there another obscure literal that would work?
- I assume None means "the None type" to type checkers, yes?
- empty string is one option -- or more to the point, any string -- so then it could be used as docs as well.
- or a not so obscure one that doesn't have another meaning to type checkers?
Would it be crazy to bring typing.Any into the builtin namespace?
    @dataclass
    class C:
        a: Any
        b: Any = 34
        c: int = 0
That reads pretty well to me.... And having Any available in the built in namespace may help in other cases where type hints are getting introduced into code that isn't really being properly type checked. I don't LOVE it -- to me, Any means "any type will do", or "I don't care what type this is" and what we really want is "no type specified" -- i.e. the same thing as plain old Python code without type hints. But practically speaking, it has the same effect, yes? -CHB

On 2017-12-22 12:15, Chris Barker wrote:
There is already an "any" function in the builtins. It looks fine but not sure how it will interact with type checkers. The "dataclasses.Data" idea mentioned in a sibling thread is a good alternative, though just wordy enough to make ... a shortcut. -Mike

On Fri, Dec 22, 2017 at 1:18 PM, MRAB <python@mrabarnett.plus.com> wrote:
The function is "any", the type is "Any", and "any" != "Any", although I wonder how many people will be caught out by that...
enough that it's a bad idea.... oh well. -CHB

On 12/21/2017 6:36 AM, Ivan Levkivskyi wrote:
In https://github.com/ericvsmith/dataclasses/issues/2#issuecomment-353918024, Guido has suggested using `object`, which has the benefit of not needing an import. And to me, it communicates the "don't care" aspect well enough. I do understand the difference if you're using a type checker (see for example https://stackoverflow.com/questions/39817081/typing-any-vs-object), but if you care about that, use typing.Any. Eric.

On Mon, Dec 18, 2017 at 11:49 PM, Eric V. Smith <eric@trueblade.com> wrote:
Sure -- but that's documentation of type annotations -- someone uninterested in typing, or completely unaware of it, will not be reading those docs.
Data Classes is also not the first use of type annotations in the stdlib: https://docs.python.org/3/library/typing.html#typing.NamedTuple
That's in the typing package, yes? collections.namedtuple is unchanged. So yes, obviously the entire typing package is about typing. This is something that has nothing to do with typing, but does use the typing syntax. It really is different. I haven't started teaching typing to newbies yet -- but I imagine I will have to some day -- and when I do, it will be in the context of: here is an optional feature that you can use along with a static type checker. And I can make it clear that the annotations only apply to the static type checker, and not run-time behavior. But using type annotations for something other than providing information to a static type checker, in an stdlib module, changes that introduction. And people don't read all the docs -- they read to the first example of how to use it, and away they go. And if that example is something like: @dataclass class C: a: int b: float = 0.0 There WILL be confusion. Paul Moore wrote:
That suggests to me that the people involved in discussing the PEP may not be representative of the bulk of Python users. There are a number of us that are uncomfortable with static typing in general, and the python-dev community has been criticised for doing too much, moving too fast, and complicating the language unnecessarily. The PEP's been accepted, so let's move forward, but please be aware of these issues with the documentation and examples. I'll try to contribute to that discussion as well. -CHB

On 19 Dec. 2017 7:00 am, "Chris Barker" <chris.barker@noaa.gov> wrote: Are there other options?? plain old: @dataclass class C: a = 1 b = 1.0 would work, though then there would be no way to express fields without defaults: The PEP already supports using "a = field(); b = field()" (etc) to declare untyped fields without a default value. This annotation free spelling may not be clearly covered in the current module docs, though. Cheers, Nick.

On 18 December 2017 at 20:38, Nick Coghlan <ncoghlan@gmail.com> wrote:
The PEP is not 100% clear on this, but it is currently not the case and this may be intentional (one obvious way to do it). I just tried and this does not work:
    @dataclass
    class C:
        x = field()
generates `__init__` etc. with no arguments. I think however that it is better to generate an error than silently ignore it. (Or if this is a bug in the implementation, it should be just fixed.) -- Ivan

On 12/18/2017 2:55 PM, Ivan Levkivskyi wrote:
Hmm, not sure why that doesn't generate an error. I think it's a bug that should be fixed. Or, we could make the same change we're making in make_dataclass(), where we'll use "typing.Any" (as a string) if the type is omitted. See https://bugs.python.org/issue32278.
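For comparison, the released `dataclasses` module took the "generate an error" route: an unannotated `field()` is rejected when the class is created. A minimal sketch that records the outcome rather than assuming the exact message:

```python
from dataclasses import dataclass, field

try:
    @dataclass
    class C:
        x = field()   # a Field object with no annotation
    outcome = "accepted"
except TypeError as exc:
    # Released versions raise here instead of silently ignoring x.
    outcome = f"TypeError: {exc}"
```

So the annotation-free spelling suggested earlier in the thread is not available; a field needs some annotation, even if it is only `object` or `typing.Any`.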

On 11 Dec. 2017 12:26 pm, "Eric V. Smith" <eric@trueblade.com> wrote: I see a couple of options: 1a: Use a default type annotation, if one is not supplied. typing.Any would presumably make the most sense. 1b: Use None if no type is supplied. 2: Rework the code to not require annotations at all. 1c: annotate with the string "typing.Any" (this may require a tweak to the rules for evaluating lazy annotations, though) Cheers, Nick.

On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
This is bpo-32278.
This is bpo-32279.
2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)
No time for this one yet. Soon! Eric.

On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
I've checked this in under bpo-32278.
And I've checked this in under bpo-32279.
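With both changes in place, Raymond's original examples work as written. A short sketch using the released `make_dataclass()`, with bare field names and a keyword argument passed through to the decorator:

```python
from dataclasses import make_dataclass, fields

# Field names only: no type has to be supplied.
# order=True is forwarded so that comparison methods are generated.
Point = make_dataclass('Point', ['x', 'y', 'z'], order=True)

p, q = Point(1, 2, 3), Point(1, 2, 4)
names = [f.name for f in fields(Point)]
```

Instances compare field by field in declaration order, much like the tuples produced by `namedtuple('Point', ['x', 'y', 'z'])`.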
2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)
I haven't looked at this yet. Eric.

On 1/6/2018 5:13 PM, Eric V. Smith wrote:
On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
...
I think the hashing logic explained in https://bugs.python.org/issue32513#msg310830 is correct. It uses hash=None as the default, so that frozen=True objects are hashable, which they would not be if hash=False were the default. If there's some case there that you disagree with, I'd be interested in hearing about it. That logic is what is currently scheduled to go in to 3.7 beta 1. I have not updated the PEP yet, mostly because it's so difficult to explain. What's the case where setting hash=False causes it to be hashable? I don't think that was ever the case, and I hope it's not the case now. Eric

Wouldn't it be simpler to make the options orthogonal? Frozen need not imply hashable. I would think if a user wants frozen and hashable, they could just write frozen=True and hashable=True. That would be more explicit and clear than just having frozen=True imply that hashability gets turned-on implicitly whether you want it or not.
If there's some case there that you disagree with, I'd be interested in hearing about it.
That logic is what is currently scheduled to go in to 3.7 beta 1. I have not updated the PEP yet, mostly because it's so difficult to explain.
That might be a strong hint that this part of the API needs to be simplified :-) "If the implementation is hard to explain, it's a bad idea." -- Zen If for some reason, dataclasses really do need tri-state logic, it may be better off with enum values (NOT_HASHABLE, VALUE_HASHABLE, IDENTITY_HASHABLE, HASHABLE_IF_FROZEN or some such) rather than with None, True, and False which don't communicate enough information to understand what the decorator is doing.
What's the case where setting hash=False causes it to be hashable? I don't think that was ever the case, and I hope it's not the case now.
Python 3.7.0a4+ (heads/master:631fd38dbf, Jan 28 2018, 16:20:11) [GCC 7.2.0] on darwin Type "copyright", "credits" or "license()" for more information.
hash(A(1)) 285969507
I'm hoping that this part of the API gets thought through before it gets set in stone. Since dataclasses code never got a chance to live in the wild (on PyPI or some such), it behooves us to think through all the usability issues. To me at least, the tri-state hashability was entirely unexpected and hard to debug -- I had to do a close reading of the source to figure-out what was happening. Raymond

I think this is a good candidate for fine-tuning during the beta period. Though honestly Python's own rules for when a class is hashable or not are the root cause for the complexity here -- since we decided to implicitly set __hash__ = None when you define __eq__, it's hardly surprising that dataclasses are having a hard time making natural rules. On Sun, Jan 28, 2018 at 5:07 PM, Raymond Hettinger < raymond.hettinger@gmail.com> wrote:
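The underlying language rule Guido mentions is easy to demonstrate without dataclasses at all. In this sketch (class names invented for illustration), defining `__eq__` without `__hash__` makes Python set `__hash__` to `None` on the class.

```python
class NoEq:
    """Defines neither __eq__ nor __hash__: inherits both from object."""

class WithEq:
    """Defines __eq__ only, so Python implicitly sets __hash__ = None."""
    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, WithEq) and self.value == other.value

inherits_hash = NoEq.__hash__ is object.__hash__
hash_disabled = WithEq.__hash__ is None

try:
    hash(WithEq(1))
    hash_result = "hashable"
except TypeError:
    hash_result = "unhashable"
```

Since `@dataclass` adds an `__eq__` by default, it has to decide what to do about `__hash__` in every combination of options, which is where the complexity comes from.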
-- --Guido van Rossum (python.org/~guido)

On 29 January 2018 at 12:08, Guido van Rossum <guido@python.org> wrote:
In Raymond's example, the problem is the opposite: data classes are currently interpreting "hash=False" as "Don't add a __hash__ implementation" rather than "Make this unhashable". That interpretation isn't equivalent due to object.__hash__ existing by default. (Reviewing Eric's table again, I believe this problem still exists in the 3.7b1 variant as well - I just missed it the first time I read that)
I'd say the major argument in favour of Raymond's suggestion (i.e. always requiring an explicit "hash=True" in the dataclass decorator call if you want the result to be hashable) is that even if we *do* come up with a completely consistent derivation rule that the decorator can follow, most *readers* aren't going to know that rule. It would become a Python gotcha question for tech interviews:
=============
Which of the following class definitions are hashable and what is their hash based on?:
    @dataclass
    class A:
        field: int
    @dataclass(eq=False)
    class B:
        field: int
    @dataclass(frozen=True)
    class C:
        field: int
    @dataclass(eq=False, frozen=True)
    class D:
        field: int
    @dataclass(eq=True, frozen=True)
    class E:
        field: int
    @dataclass(hash=True)
    class F:
        field: int
    @dataclass(frozen=True, hash=True)
    class G:
        field: int
    @dataclass(eq=True, frozen=True, hash=True)
    class H:
        field: int
=============
Currently the answers are:
- A: not hashable
- B: hashable (by identity) # Wat?
- C: hashable (by field hash)
- D: hashable (by identity) # Wat?
- E: hashable (by field hash)
- F: hashable (by field hash)
- G: hashable (by field hash)
- H: hashable (by field hash)
If we instead make the default "hash=False" (and interpret that as meaning "Inject __hash__=None"), then you end up with the following much simpler outcome that can be mapped directly to the decorator "hash" parameter:
- A: not hashable
- B: not hashable
- C: not hashable
- D: not hashable
- E: not hashable
- F: hashable (by field hash)
- G: hashable (by field hash)
- H: hashable (by field hash)
Inheritance of __hash__ could then be made explicitly opt-in by way of a "dataclasses.INHERIT" constant. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
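Part of the quiz can be checked against the released module, with one caveat: the `hash=` parameter discussed here was renamed `unsafe_hash=` before 3.7 was released, so the sketch below uses that spelling for class F. The helper `hash_kind` is a hypothetical function written for this example.

```python
from dataclasses import dataclass

@dataclass
class A:
    field: int

@dataclass(eq=False)
class B:
    field: int

@dataclass(frozen=True)
class C:
    field: int

@dataclass(eq=False, frozen=True)
class D:
    field: int

@dataclass(unsafe_hash=True)   # spelled hash=True in the thread
class F:
    field: int

def hash_kind(cls):
    """Hypothetical helper: classify how instances of cls are hashed."""
    if cls.__hash__ is None:
        return "not hashable"
    if cls.__hash__ is object.__hash__:
        return "identity"
    return "fields"

results = {cls.__name__: hash_kind(cls) for cls in (A, B, C, D, F)}
```

The classification matches the "currently the answers are" list for these five classes, identity-based hashing of B and D included.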

On 1/29/2018 1:55 AM, Yury Selivanov wrote:
I agree it's complicated. I think it would be a bad design to have to opt-in to hashability if using frozen=True. The point of hash=None (the default) is to try and get the simple cases right with the simplest possible interface. It's the intersection of "have simple defaults, but ways to override them" with "if the user provides some dunder methods, don't make them specify feature=False in order to use them" that complicated things. For example, maybe we no longer need eq=False now that specifying a __eq__ turns off dataclasses's __eq__ generation. Does dataclasses really need a way of using object identity for equality? Eric.

On Jan 28, 2018, at 11:52 PM, Eric V. Smith <eric@trueblade.com> wrote:
I think it would be a bad design to have to opt-in to hashability if using frozen=True.
I respect that you see it that way, but it doesn't make sense to me. You can have either one without the other. It seems to me that it is clearer and more explicit to just say what you want rather than having implicit logic guess at what you meant. Otherwise, when something goes wrong, it is difficult to debug. The tooltips for the dataclass decorator are essentially a checklist of features that can be turned on or off. That list of features is mostly easy-to-use except for hash=None which has three possible values, only one of which is self-evident. We haven't had much in the way of user testing, so it is a significant data point that one of your first users (me) was confounded by this API. I recommend putting various correct and incorrect examples in front of other users (preferably experienced Python programmers) and asking them to predict what the code does based on the source code. Raymond

On 1/29/2018 4:01 AM, Raymond Hettinger wrote:
I certainly respect your insights.
The tooltips for the dataclass decorator are essentially a checklist of features that can be turned on or off. That list of features is mostly easy-to-use except for hash=None which has three possible values, only one of which is self-evident.
Which is the one that's self-evident? I would think hash=False, correct? The problem is that for repr=, eq=, compare=, you're saying "do or don't add this/these methods, or if true, don't even add it if it's already defined". The same is true for hash=True/False, with the complication of the implicit __hash__ that's added by __eq__. In addition to "do or don't add __hash__", there needs to be a way of setting __hash__=None. The processing of hash=None is trying to guess what sort of __hash__ you want: not set it and just inherit it, generate it based on fields, or set it to None. And if it guesses wrong, based on the fairly simple hash=None rules, you can control it with other values of hash=. Maybe that's the problem. I'm open to ways to express these options. Again, I think losing "do the right thing most of the time without explicitly setting hash=" would be a shame, but not the end of the world. And changing it to "hashable=" isn't quite as simple as it seems, since there's more than one definition of hashable: identity-based or field-based.
We haven't had much in the way of user testing, so it is a significant data point that one of your first users (me) was confounded by this API. I recommend putting various correct and incorrect examples in front of other users (preferably experienced Python programmers) and asking them to predict what the code does based on the source code.
I agree it's sub-optimal, but it's a complex issue. What would the interface look like that allowed a programmer to know if an object was hashable based on object identity versus field values? Eric.

I don't think we're going to reach full agreement here, so I'm going to put my weight behind Eric's rules. I think the benefit of the complicated rules is that they almost always do what you want, so you almost never have to think about it. If it doesn't do what you want, setting hash=False or hash=True is much quicker than trying to understand the rules. But the rules *are* deterministic and reasonable. -- --Guido van Rossum (python.org/~guido)
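For readers who want to see the outcome of these rules, here is a minimal sketch against the dataclasses module as released in Python 3.7 and later. Note the assumption: the thread discusses a pre-release hash= parameter, while the released decorator spells the explicit override unsafe_hash=. The class names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass  # eq=True, frozen=False: __hash__ is set to None (unhashable)
class MutablePoint:
    x: int


@dataclass(frozen=True)  # eq=True, frozen=True: a field-based __hash__ is generated
class FrozenPoint:
    x: int


@dataclass(eq=False)  # no __eq__ is added, so __hash__ is inherited from object
class IdentityPoint:
    x: int


@dataclass(unsafe_hash=True)  # explicitly request a field-based __hash__
class ForcedHashPoint:
    x: int


mutable_is_unhashable = MutablePoint.__hash__ is None
frozen_hashes_match = hash(FrozenPoint(1)) == hash(FrozenPoint(1))
identity_hash_inherited = IdentityPoint.__hash__ is object.__hash__
forced_hashes_match = hash(ForcedHashPoint(1)) == hash(ForcedHashPoint(1))
```

The three default outcomes (set to None, generated from fields, inherited) correspond to the three things Eric says the automatic logic has to choose between.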

On 1/29/2018 3:42 AM, Ethan Furman wrote:
It means "don't add a __hash__ attribute, and rely on the base class value". But maybe it should mean "is not hashable". But in that case, how would we specify the "don't add __hash__" case? Note that "repr=False" means "don't add a __repr__", not "is not repr-able". And "init=False" means "don't add a __init__", not "is not init-able". Eric.

On 01/29/2018 12:57 AM, Eric V. Smith wrote:
On 1/29/2018 3:42 AM, Ethan Furman wrote:
On 01/28/2018 07:45 AM, Eric V. Smith wrote:
I thought `hash=False` means don't add a __hash__ method.
Note that "repr=False" means "don't add a __repr__", not "is not repr-able". And "init=False" means "don't add a __init__", not "is not init-able".
Yeah, like that. I get that the default for all (or at least most) of the boring stuff should be "just do it", but I don't think None is the proper place-holder for that. Why not make an `_default = object()` sentinel and use that for the default? At least for __hash__. Then we have:
    hash=False -> don't add one
    hash=None -> add `__hash__ = None` (is not hashable)
    hash=True -> add one (the default...
Okay, after writing that down, why don't we have the default value for anything automatically added be True? With True meaning the dataclass should have a custom whatever, and if the programmer did not provide one the decorator will -- it can even be a self-check: if the parameters in the decorator are at odds with the actual class contents (hash=None, but the class has a __hash__ method) then an exception could be raised. -- ~Ethan~
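The sentinel idea can be sketched in a few lines. This is only an illustration of the pattern Ethan describes, not code from dataclasses.py: the names `_DEFAULT` and `describe_hash_option` are hypothetical, and the returned strings simply restate the meanings proposed in the message above.

```python
_DEFAULT = object()  # hypothetical sentinel: "the caller did not pass hash= at all"


def describe_hash_option(hash=_DEFAULT):
    """Map each proposed hash= value to the action a decorator might take."""
    if hash is _DEFAULT:
        return "decide automatically"
    if hash is None:
        return "add __hash__ = None (not hashable)"
    if hash is True:
        return "add a generated __hash__"
    if hash is False:
        return "don't add __hash__"
    raise TypeError(f"unexpected hash= value: {hash!r}")


omitted = describe_hash_option()
explicit_none = describe_hash_option(hash=None)
```

Because the sentinel is a private object, passing None explicitly stays distinguishable from passing nothing, which is the point of the proposal.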

Thanks Eric and Ivan. You're both very responsive. I appreciate the enormous efforts you're putting in to getting this right. I suggest two other fix-ups: 1) Let make_dataclass() pass through keyword arguments to _process_class(), so that this will work: Point = make_dataclass('Point', ['x', 'y', 'z'], order=True) 2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-) Raymond
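For reference, both requests in this message (untyped field names and pass-through keyword arguments) work in the dataclasses module as released in Python 3.7 and later. A minimal sketch, using only the call from the message:

```python
from dataclasses import fields, make_dataclass

# Field names given as bare strings, with order=True passed through.
Point = make_dataclass('Point', ['x', 'y', 'z'], order=True)

p = Point(1, 2, 3)
q = Point(1, 2, 4)

field_names = [f.name for f in fields(Point)]
p_sorts_before_q = p < q  # ordering methods were generated because order=True
```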

On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
Agreed.
2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)
It's sufficiently confusing that I need to sit down when I have some free time and noodle this through. But it's still on my radar. Eric.

I see a couple of options: 1a: Use a default type annotation, if one is not supplied. typing.Any would presumably make the most sense. 1b: Use None if no type is supplied. 2: Rework the code to not require annotations at all. I think I'd prefer 1a, since it's easy. 2) would be great :-) I find this bit of “typing creep” makes me nervous -- Typing should Never be required! I understand that the intent here is that the user could ignore typing and have it all still work. But I’d rather it was not still there under the hood. Just because a standardized way to do something is included in core Python doesn’t mean the standard library has to use it. However, typing is not currently imported by dataclasses.py. And there you have an actual reason besides my uneasiness :-) - CHB

I'm not sure what Mail User Agent each of you is using, but it is quite impossible (here) to make out who is saying what in your latest messages. See plain text rendering here: https://mail.python.org/pipermail/python-dev/2017-December/151274.html Regards Antoine. On Fri, 15 Dec 2017 10:56:28 +0000 Steve Holden <steve@holdenweb.com> wrote:

On 12/15/2017 5:56 AM, Steve Holden wrote:
[Agreed with Antoine on the MUA and quoting being confusing.] The only reason typing isn't imported is performance. I hope that once PEP 560 is complete this will no longer be an issue, and dataclasses will always import typing. But of course typing will still not be needed for most uses of @dataclass or make_dataclass(). This is explained in the PEP. Eric.

Sorry about the email mangling -- I do a lot of my listserv work on the bus on an iPhone, with the built-in mail client -- and it REALLY sucks for doing interspersed email replying -- highly encouraging the dreaded top posting... But anyway, I think both Steve and I were expressing concerns about "Typing Creep". Typing should always be optional in Python, and while this PEP does keep it optional, Steve's point was that the code in the standard library serves not only as a library, but as examples of how to write "robust" python code. The rest of this note is me -- I'm not pretending to speak for Steve. Reading the PEP, this text makes me uneasy: "A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation." And if I understand the rest of the PEP, while typing itself is optional, the use of type annotations is not -- it is exactly what's being used to generate the fields the user wants. And the examples are all using typing -- granted, primarily the built-in types, but still:
    @dataclass
    class C:
        a: int  # 'a' has no default value
        b: int = 0  # assign a default value for 'b'
This sure LOOKS like typing is required. It also makes me nervous because, as I understand it, the types aren't actually used in the implementation (presumably they would be by mypy and the like?). So I think for folks that aren't using typing and a type checker in their development process, it would be pretty confusing what this means and what it actually does. Particularly folks that are coming from a background of a statically typed language. Then I see: """ Field objects describe each defined field. ... Its documented attributes are: name: The name of the field. type: The type of the field. ... """ So again, typing looks to be pretty baked in to the whole concept. and then: """ One place where dataclass actually inspects the type of a field is to determine if a field is a class variable as defined in PEP 526. """ and """ The other place where dataclass inspects a type annotation is to determine if a field is an init-only variable. It does this by seeing if the type of a field is of type dataclasses.InitVar. """ """ Data Classes will raise a TypeError if it detects a default parameter of type list, dict, or set. """ So: it seems that type hinting, while not required to use Data Classes, is very much baked into the implementation and examples. As I said -- this makes me uneasy -- It's a very big step that essentially promotes type hinting to a new place in Python -- you will not be able to use a standard library class without at least a little thought about types and typing. I note this: """ This discussion started on python-ideas [9] and was moved to a GitHub repo [10] for further discussion. As part of this discussion, we made the decision to use PEP 526 syntax to drive the discovery of fields. """ I confess I only vaguely followed that discussion -- in fact, mostly I thought that the concept of Data Classes was a good one, and was glad to see SOMETHING get implemented, and didn't think I had much to contribute to the details of how it was done. So these issues may have already been raised and considered, so carry on. But: NOTE: from PEP 526: "Python will remain a dynamically typed language, and the authors have no desire to ever make type hints mandatory, even by convention." The Data Classes implementation is not making it mandatory by any means, but it is making it a more "standard" part of the language that can not simply be ignored anymore. And it seems some features of dataclasses can only be accessed via actual typing, in addition to the requirement of type annotations.
If nothing else, the documentation should make it very clear that the typing aspects of Data Classes are indeed optional, and preferably give some untyped examples, something like:
    @dataclass
    class C:
        a: None  # 'a' has no default value
        b: None = 0  # assign a default value for 'b'
If, in fact, that would be the way to do it. -Chris On Fri, Dec 15, 2017 at 3:22 AM, Eric V. Smith <eric@trueblade.com> wrote:
-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov
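The two places where the PEP says the annotation is actually inspected (ClassVar and InitVar) can be seen in a short sketch against the released dataclasses module. The class `Account` and its attribute names are invented for illustration; only `dataclass`, `fields`, `InitVar` and `typing.ClassVar` are real APIs.

```python
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar


@dataclass
class Account:
    owner: str                               # ordinary field
    balance: float = 0.0                     # ordinary field with a default
    bank_name: ClassVar[str] = "Example Bank"  # class variable: not a field
    opening_bonus: InitVar[float] = 0.0      # init-only: handed to __post_init__

    def __post_init__(self, opening_bonus):
        # The init-only value is available here but is not reported as a field.
        self.balance += opening_bonus


acct = Account("alice", 10.0, opening_bonus=5.0)
field_names = [f.name for f in fields(Account)]
```

Everything else about the annotations (str, float) is stored on the Field objects but not checked at run time.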

One other note (see my last message). The PEP should include a summary of the discussion of the decision to use the type annotation syntax vs other options. I just looked through all the gitHub issues and found nothing, and started to look at the python-ideas list archive and got overwhelmed. So having that justification in the PEP would be good. -CHB On Fri, Dec 15, 2017 at 12:07 PM, Chris Barker <chris.barker@noaa.gov> wrote:

On 15 December 2017 at 20:07, Chris Barker <chris.barker@noaa.gov> wrote:
I actually don't have any problem with this. It looks natural to me, reads perfectly fine, and is a far better way of defining fields than many of the other approaches that I've seen in the past (that don't use annotations). The one thing I would find surprising is that the actual type used is ignored.
    @dataclass
    class C:
        a: str = 0
AIUI this is valid, but it looks weird to me. There's an easy answer, though - just don't do that.
Well, being able to see the type the class author intended is a feature. I don't know I'd consider that as meaning typing is "baked in". It's useful but ignorable data.
Those are somewhat more explicit cases of directly using type annotations as declarations. But what alternative would you propose? It still seems fine to me.
Doesn't that mean that
    @dataclass
    class C:
        a: int = []
raises an error? The problem here is the same as that of mutable function default parameters - we don't want every instance of C to share a single list object as their default value for a. It's got nothing to do with the annotation (that's why I used the deliberately-inconsistent annotation of int here). I'm a strong +1 on making this an error, as it's likely to be an easy mistake to make, and quite hard to debug.
So: it seems that type hinting, while not required to use Data Classes, is very much baked into the implementation and examples.
Annotations and the annotation syntax are fundamental to the design. But that's core Python syntax. But I wouldn't describe types as being that significant to the design, it's more "if you supply them we'll make use of them". Don't forget, function parameter annotations were around long before typing. Variable annotations weren't, but they could have been - it's just that typing exposed a use case for them. Data classes could just as easily have been the motivating use case for PEP 526.
I will say that while I don't use typing or mypy at all in my code, I don't have any particular dislike of the idea of typing, or the syntax for declaring annotations. So I find it hard to understand your concerns here. My personal uneasiness is actually somewhat the opposite - I find it disconcerting that if I annotate a variable/parameter as having type int, nothing stops me assigning a string to it. But that's *precisely* what typing being optional means, so while it seems odd to my static typing instincts, it's entirely within the spirit of not forcing typing onto Python.
This does seem like a reasonable option to note. Something along the lines of "If you don't use type annotations in your code, and you want to avoid introducing them, using None as a placeholder for the type is sufficient". However, I suspect that using None as a "I don't really want to assign a type" value might well confuse mypy - I don't know. But using typing.Any (which is what mypy would expect) clearly doesn't meet the "avoid typing totally" requirement here. Maybe (mis-)using string annotations, like
    @dataclass
    class C:
        a: 'variable'  # 'a' has no default value
        b: 'variable' = 0  # assign a default value for 'b'
would work? But honestly, this feels like jumping through hoops purely to avoid using int "because it means I've bought into the idea of typing". I guess if you're that adamant about never wanting to use typing in your code, data classes would make you uncomfortable. But conversely, I don't see the value in making data classes clumsier than they need to be out of a purist principle to not use a perfectly valid Python syntax. Paul
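Both of Paul's observations can be checked against the released module. One caveat worth flagging: the PEP text quoted earlier in the thread says TypeError, but in released CPython the mutable-default check raises ValueError, which is what this sketch catches. The class names D and E are added here only so the three cases can coexist in one snippet.

```python
from dataclasses import dataclass, field


@dataclass
class C:
    a: str = 0  # the annotation says str, the default is an int: accepted


c = C()

try:
    @dataclass
    class D:
        a: int = []  # mutable default, rejected regardless of the annotation
except ValueError as exc:
    mutable_default_error = exc
else:
    mutable_default_error = None


@dataclass
class E:
    a: list = field(default_factory=list)  # each instance gets its own list
```

The rejection depends on the default value, not on the annotation, which matches Paul's reading.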

On Sun, Dec 17, 2017 at 8:22 AM, Guido van Rossum <guido@python.org> wrote:
Mypy definitely won't like that use of annotation, but documentation systems might. For example, in a hover tooltip in an IDE/editor, it's probably more helpful to see the descriptive message than "int" or "float" for the attribute. What about data that isn't built-in scalars? Does this look right to people (and will mypy be happy with it)?
    @dataclass
    class C:
        a: numpy.ndarray = numpy.random.random((3,3))
        b: MyCustomClass = MyCustomClass("foo", 37.2, 1+2j)
I don't think those look terrible, but I think this looks better:
    @dataclass
    class C:
        a: Infer = np.random.random((3,3))
        b: Infer = MyCustomClass("foo", 37.2, 1+2j)
Where the name 'Infer' (or some other spelling) was a name defined in the `dataclasses` module. In this case, I don't want to use `typing.Any` since I really do want "the type of thing the default value has." -- Keeping medicines from the bloodstreams of the sick; food from the bellies of the hungry; books from the hands of the uneducated; technology from the underdeveloped; and putting advocates of freedom in prisons. Intellectual property is to the 21st century what the slave trade was to the 16th.

Good, Bad or Neutral, this discussion makes my point: Using typing annotation as a necessary part of a standard library module is injecting typing into "ordinary" python in a new way. It is no longer going to appear to be completely optional, and only of concern to those that choose to use it (and mypy or similar). And I do think it is really bad UI to have something like:
    @dataclass
    class C:
        a: int = 1
        b: float = 1.0
be the recommended way (shown in all the examples, and really almost the only way) to define a dataclass, when the type will in fact be completely ignored by the implementation. Newbies are going to be confused by this -- they really are. Anyway, clearly I personally don't think this is a very good idea, but I see that annotations are a natural and easy way to express the fields without adding any new syntax. But most importantly I don't think this should become standard without consideration of the impact and a deliberate decision to do so. A note: I don't know who everyone is that was engaged in the gitHub discussion working out the details, but at least a few core folks are very engaged in the introduction of type hinting to Python in general -- so I think a certain perspective may have been over-represented. Are there other options?? plain old:
    @dataclass
    class C:
        a = 1
        b = 1.0
would work, though then there would be no way to express fields without defaults:
    @dataclass
    class C:
        a = 1
        b = None
almost -- but then is there "no default", or is the default None? Would it be impossible to use the annotation syntax, but with the type optional:
    @dataclass
    class C:
        a : = 1  # field with default value
        b :  # field with no default
This is not legal python now, but are there barriers other than not wanting to make yet more changes to it being legal (i.e. hard/impossible to unambiguously parse, etc.)? Maybe this can all be addressed by more "Untyped" examples in the docs. -CHB On Sun, Dec 17, 2017 at 8:54 AM, David Mertz <mertz@gnosis.cx> wrote:

@David What you propose as an `Infer` annotation was proposed some time ago (not only for dataclasses, there are other use cases). The discussion is here https://github.com/python/typing/issues/276 @Chris People are still allowed not to use dataclasses if they really don't like type hints :-) Seriously however, annotations are just syntax. In this sense PEP 526 is more like PEP 3107, and less like PEP 484. People are still free to write:
    @dataclass
    class C:
        x: "first coordinate"
        y: "second coordinate"
        plus: "I don't like types"
or
    @dataclass
    class C:
        x: ...
        y: ...
I don't see a big difference between hypothesis (a testing lib) using annotations for its own purposes and the situation with dataclasses. It is true that the syntax was chosen to simplify support in static type checkers (partially because users were often asking for such a feature), but not more than this. If you don't use type checkers, there is no problem in using one of the above forms. If you have ideas about how to improve the dataclass docs, this can be discussed in the issue https://bugs.python.org/issue32216
... the type will in fact be completely ignored by the implementation. Newbies are going to be confused by this -- they really are.
This is not different from
    def f(x: int):
        pass
    f("What")  # OK
that exists starting from Python 3.0. Although I agree this is confusing, the way forward could be just explaining this better in the docs. If you want my personal opinion about the current situation about type hints _in general_, then I can say that I have seen many cases where people use type hints where they are not needed (for example in 10 line scripts or in highly polymorphic functions), so I agree that some community style guidance (like PEP 8) may be helpful. I had started such a project at the end of last year (it was called pep-555, but I didn't have time to work on this and this number is already taken). -- Ivan
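A runnable version of the forms Ivan lists, as a sketch against the released dataclasses module. The class names `Documented` and `Untyped` are invented here, and a default was added to one field of each purely to show that defaults still work with non-type annotations.

```python
from dataclasses import dataclass, fields


@dataclass
class Documented:
    x: "first coordinate"          # a string used as a description, not a type
    y: "second coordinate" = 0.0


@dataclass
class Untyped:
    x: ...                         # Ellipsis as a "don't care" annotation
    y: ... = 0.0


def f(x: int):
    return x


d = Documented(1.5)
u = Untyped("anything")
result = f("What")  # the int annotation is not enforced at run time

untyped_annotation = fields(Untyped)[0].type
```

In both classes the annotation is only recorded on the Field object; nothing compares it with the values passed in.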

I'm really surprised no one seems to get my point here. TL;DR: My point is that having type annotation syntax required for something in the stdlib is a significant step toward "normalizing" type hinting in Python. Whether that's a good idea or not is a judgement call, but it IS a big step. @Chris
Well, yes, of course, but this is not like PEP 3107, as it introduces a requirement for annotations (maybe not *type* annotations per se) in the std lib. Again, that may be the best way to go -- but it should be done deliberately. @dataclass
Ah! I had no idea you could use ellipses to indicate no type. That actually helps a lot. We really should have that prominent in the docs. And in the dataclass docs, not just the type hinting docs -- again, people will want to use these that may not have any interest in nor prior knowledge of type hints.
The big difference is that hypothesis is not in the standard library. Also, I didn't know about hypothesis until just now, but their very first example in the quick start does not use annotation syntax, so it's not as baked in as it is with dataclasses.
If you have ideas about how to improve the dataclass docs, this can be discussed in the issue https://bugs.python.org/issue32216
I'll try to find time to contribute there -- though maybe better to have the doc draft in gitHub?
Again the difference is that EVERY introduction to defining python functions doesn't use the type hint. And even more to the point, you CAN define a function without any annotations. But frankly, I think as type hinting becomes more common, we're going to see a lot of confusion :-( If you want my personal opinion about the current situation about type
It's going to get worse before it gets better :-(
    @dataclass
    class C:
        x = field()
that does require that `field` be imported, so not as nice. I kinda like the ellipses better. but good to have a way. -Chris
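One caution about the `x = field()` form: in the dataclasses module as released, a field() assigned to a name that has no annotation is rejected when the decorator runs, so it is not actually an annotation-free spelling. A small sketch (the class name D is added here for the working variant):

```python
from dataclasses import dataclass, field, fields

try:
    @dataclass
    class C:
        x = field()  # no annotation: the decorator refuses this
except TypeError as exc:
    missing_annotation_error = exc
else:
    missing_annotation_error = None


@dataclass
class D:
    x: ... = field(default=0)  # an annotation (here Ellipsis) makes it a field


d_field_names = [f.name for f in fields(D)]
```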

On Dec 18, 2017, at 21:41, Chris Barker <chris.barker@noaa.gov> wrote:
TL;DR: My point is that having type annotation syntax required for something in the stdlib is a significant step toward "normalizing" type hinting in Python. Whether that's a good idea or not is a judgement call, but it IS a big step.
This is something we’re discussing for importlib.resources: https://bugs.python.org/issue32248#msg308495 In the standalone version, we’re using annotations for the Python 3 bits. It would make our lives easier if we kept them for the stdlib version (applying diffs and keeping them in sync would be easier). Brett says in the follow up: "As for the type hints, I thought it was lifted such that new code could include it but we wouldn't be taking PRs to add them to pre-existing code?” So, what’s the deal? -Barry

On 12/18/2017 9:41 PM, Chris Barker wrote:
I get your point, I'm just not concerned about it. I also don't think it's surprising that you can put misleading information (including non-types) in type annotations. All of the documentation and discussions are quite clear that type information is ignored at runtime. It _is_ true that @dataclass does actually inspect the type at runtime, but those uses are very rare. And if you do need them, the actual type T used by ClassVar[T] and InitVar[T] is still ignored. Data Classes is also not the first use of type annotations in the stdlib: https://docs.python.org/3/library/typing.html#typing.NamedTuple When I say that "typing is optional", I mean importing the typing module, not that annotations are optional. Eric.
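The typing.NamedTuple precedent Eric links to uses the same annotation-driven field discovery, and likewise does not enforce the annotated types. A minimal sketch (the class name `Point` and its values are illustrative):

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: float
    y: float = 0.0


# The annotation says float, but nothing checks the value at run time.
p = Point("not a float")
point_fields = Point._fields
```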

On 19 December 2017 at 07:49, Eric V. Smith <eric@trueblade.com> wrote:
Data Classes is also not the first use of type annotations in the stdlib: https://docs.python.org/3/library/typing.html#typing.NamedTuple
Also, the fact that no-one raised this issue during the whole time the PEP was being discussed (at least as far as I recollect) and that Guido (who of all of us should be most aware of what is and isn't acceptable use of annotations in the stdlib) approved the PEP, suggests to me that this isn't that big a deal. The only thing that has surprised me in this discussion is that the actual type used in the annotation makes no difference. And once someone reminded me that types are never enforced at runtime (you can call f(x: int) with f('haha')) that seemed fine. Paul

On Tue, Dec 19, 2017 at 10:53 AM, Paul Moore <p.f.moore@gmail.com> wrote:
If anything, this makes things more difficult for the learner. The fact that annotations are formally undefined as to anything but syntax is sensible but can be misleading (as the example above clearly shows). In the typing module it's logical to see annotations, I guess. But I really hope they aren't sprinkled around willy-nilly. Sooner or later there will be significant demand for annotated libraries, even though CPython will perform exactly as it does with non-annotated code. I can see the value of annotations in other environments and for different purposes, but it would be a pity if this were to unnecessarily complicate the stdlib. regards Steve

On 2017-12-19 02:53, Paul Moore wrote:
Hi, I asked about this in the first posting of the PEP and agree with Chris. https://mail.python.org/pipermail/python-dev/2017-September/149406.html There is definitely a passive bias towards using types with dataclasses in that Eric (the author) doesn't appear to want an example without them in the pep/docs. It seems that typing proponents are sufficiently enamored with them that they can't imagine anyone else feeling differently, haha. Personally, I wouldn't use types with Python unless I was leading a large project with a large team of folks with different levels of experience. That's where types shine, and those folks might be better served by Java or Kotlin. So we're hearing that "types are optional" while the docs may imply the opposite. Liked the ellipsis since None is often used as a sentinel value and an extra import is a drag. -Mike

On 12/20/2017 6:57 PM, Mike Miller wrote:
I'm not sure what such an example would look like. Do you mean without annotations? Or do you mean without specifying the "correct" type, like:
    @dataclass
    class C:
        x: int = 'hello world'
? Or something else? Can you provide an example of what you'd like to see?
It seems that typing proponents are sufficiently enamored with them that they can't imagine anyone else feeling differently, haha.
I've never used typing or mypy, so you're not talking about me. I do like the conciseness that annotations bring to dataclasses, though. If you buy that (and you might not), then I don't see the point of not using a correct type annotation. Eric.

On 12/20/2017 8:13 PM, Eric V. Smith wrote:
Re-reading my post you referenced, is it just an example using typing.Any? I'm okay with that in the docs, I just didn't want to focus on it in the PEP. I want the PEP to only have the one reference to typing, for typing.ClassVar. I figure the people reading the PEP can extrapolate to all of the possible uses for annotations that they don't need to see a typing.Any example. Eric.

On Wed, Dec 20, 2017 at 5:29 PM, Eric V. Smith <eric@trueblade.com> wrote:
IIUC, there is no way to make a dataclass without annotations, yes? That is, using annotations to determine the fields is the one and only way the decorator works. So it's impossible to give an example without annotations, yes?
It may be a good idea to have an example like that in the docs (but probably not the PEP) to make it clear that the type is not used in any way at run time. But I don't think that anyone is suggesting that would be a recommended practice. I suggest that it be clear in the docs, and ideally in the PEP, that the dataclass decorator is using the *annotation* syntax, and that the only relevant part it uses is that an annotation exists, but the value of the annotation is essentially (completely?) ignored. So we should have examples like:
    @dataclass
    class C:
        a: ...  # field with no default
        b: ... = 0  # field with a default value
Then maybe:
    @dataclass
    class C:
        a: "the a parameter"  # field with no default
        b: "another, different parameter" = 0.0  # field with a default
Then the docs can go on to say that if the user wants to specify a type for use with a static type checking pre-processor, they can do it like so:
    @dataclass
    class C:
        a: int  # integer field with no default
        b: float = 0.0  # float field with a default
And the types will be recognized by type checkers such as mypy. And I think the non-typed examples should go first in the docs. This is completely analogous to how all the other parts of python are taught. Would anyone suggest that the very first example of a function definition that a newbie sees would be:
    def func(a: int, b: float = 0.0):
        body_of_function
Then, _maybe_ way down on the page, you mention that oh, by the way, those types are completely ignored by Python. And not even give any examples without types?
Re-reading my post you referenced, is it just an example using typing.Any?
I actually think that is exactly the wrong point -- typing.Any is still using type hinting -- it's an explicit way to say, "any type will do", but it's only relevant if you are using a type checker. We really need examples for folks that don't know or care about type hinting at all. typing.Any is for use by people that are explicitly adding type hinting, and should be discussed in type hinting documentation. people reading the PEP can
no they don't, but they DO need to see examples without type hints at all. -Chris

On 12/21/2017 1:46 AM, Chris Barker wrote:
Correct. Well, you will be able to use make_dataclass() without type information after I fix bpo-32278, but most users won't be using that.
I think the PEP is very clear about this: "The dataclass decorator examines the class to find fields. A field is defined as any variable identified in __annotations__. That is, a variable that has a type annotation. With two exceptions described below, none of the Data Class machinery examines the type specified in the annotation." I agree the docs should also be clear about this.
I'll leave this for others to decide. The docs, and how approachable they are to various audiences, isn't my area of expertise.
I'm not opposed to this in the documentation. Maybe we should decide on a convention on what to use to convey "don't care". I've seen typing.Any, None, ellipsis, strings, etc. all used. Eric.

On 12/21/2017 4:22 AM, Eric V. Smith wrote:
On 12/21/2017 1:46 AM, Chris Barker wrote:
This seems clear enough. It could come after describing what a dataclass *is*.
I agree the docs should also be clear about this.
Modulo some bike-shedding, the above seems pretty good to me.
I'll leave this for others to decide. The docs, and how approachable they are to various audiences, isn't my area of expertise.
-- Terry Jan Reedy

On 12/21/17 6:25 AM, Sven R. Kunze wrote:
Because you can't know the order that x and y are defined in this example:
    class C:
        x: int
        y = 0
'x' is not in C.__dict__, and 'y' is not in C.__annotations__. Someone will suggest a metaclass, but that has its own problems. Mainly, interfering with other metaclasses. Eric.
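Eric's point is easy to confirm on a plain class, with no decorator involved. The annotated-only name and the assigned-only name end up in two separate mappings, so their relative order in the class body cannot be recovered afterwards:

```python
class C:
    x: int
    y = 0


x_in_dict = 'x' in C.__dict__                # x was annotated but never assigned
y_in_annotations = 'y' in C.__annotations__  # y was assigned but never annotated
annotated_names = list(C.__annotations__)
```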

On 12/21/2017 9:23 AM, Eric V. Smith wrote:
I think the understanding problem with this feature arises from two factors: using annotations to define possibly un-initialized slots is non-obvious; a new use of annotations for something other than static typing is a bit of a reversal of the recent pronouncement 'annotations should only be used for static typing'. Therefore, getting the permanent doc 'right' is important. The following naively plausible alternative does not work and cannot sensibly be made to work because the bare 'x' in the class scope, as opposed to a similar error within a method, causes NameError before the class is created.
    @dataclass
    class C:
        x
        y = 0
I think the doc should explicitly say that uninitialized fields require annotation with something (anything, not necessarily a type) simply to avoid NameError during class creation. It may not be obvious to some readers why x:'anything' does not also raise NameError, but that was a different PEP, and the dataclass doc could here link to wherever name:annotation in bodies is explained. -- Terry Jan Reedy
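The NameError Terry describes can be reproduced without dataclasses at all, since it happens while the class body executes. In this sketch the name `undeclared_field` stands in for his bare `x` (chosen so it cannot collide with an existing global), and the class names are made up.

```python
try:
    class Broken:
        undeclared_field  # a bare name is looked up immediately: NameError
        y = 0
except NameError as exc:
    bare_name_error = exc
else:
    bare_name_error = None


class Fine:
    undeclared_field: 'anything'  # annotated name: recorded, never looked up
    y = 0


fine_annotations = dict(Fine.__annotations__)
```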

On Thu, Dec 21, 2017 at 7:55 PM, Terry Reedy <tjreedy@udel.edu> wrote:
Solely because, annotations being optional, the interpreter is not allowed to infer from its presence that an annotated name should be allocated an entry in __dict__, and clearly the value associated with it would be problematical. I think the understanding problem with this feature arises from two
Indeed. So annotations are optional, except where they aren't?
This contortion is why I feel a better solution would be desirable. Alas I do not have one to hand. regards Steve

On Thu, Dec 21, 2017 at 11:55 AM, Terry Reedy <tjreedy@udel.edu> wrote: I think the understanding problem with this feature arises from two
you know, that may be where part of my confusion came from -- all the talk lately has been about "type hints" and "type annotations" -- the idea of "arbitrary annotations" has been lost.
Therefore, getting the permanent doc 'right' is important.
yup.
would this be possible?

    @dataclass
    class C:
        x:
        y: = 0

That is -- the colon indicates an annotation, but in this case, it's a "nothing" annotation. It's a syntax error now, but would it be possible to change that? Or would the parsing be ambiguous? particularly in other contexts. of course, then we'd need something to store as a "nothing" annotation -- empty string? None? (but None might mean something) create yet another type for "nothing_annotation" Hmm, I may have talked myself out of it.... -CHB

-- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker@noaa.gov

On Thu, Dec 21, 2017 at 3:10 PM MRAB <python@mrabarnett.plus.com> wrote:
pass does not currently parse in that context. Otherwise I was thinking the same thing. But we already have ... which does - so I'd suggest that for people who are averse to importing anything from typing and using the also quite readable Any. (ie: document this as the expected practice with both having the same meaning)

While I consider the annotation to be a good feature of data classes, it seems worth documenting that people not running a type analyzer should avoid declaring a type. A worse thing than no-type being specified is a wrong type being specified. That appearing in a library will break people who need their code to pass the analyzer and pytype, mypy, et al. could be forced to implement a typeshed.pypi of sorts containing blacklists of known bad annotations in public libraries and/or actually correct type specification overrides for them.

As for problems with order, if we were to accept

    @dataclass
    class Spam:
        beans = True
        ham: bool

style instead, would it be objectionable to require keyword arguments only for dataclass __init__ methods? That'd get rid of the need to care about order. (but would annoy people with small 2-3 element data classes... so I'm assuming this idea is already rejected) -gps

On Thu, Dec 21, 2017 at 3:36 PM, Gregory P. Smith <greg@krypto.org> wrote:
I don't think they do, actually - I haven't been following the typing discussions, but someone in this thread said that ... means "use the type of the default" or something like that.
+1 !
and the wrong type could be very common -- folks using "int", when float would do just fine, or "list" when any iterable would do, the list goes on and on. Typing is actually pretty complex in Python -- it's hard to get right, and if you aren't actually running a type checker, you'd never know. One challenge here is that annotations, per se, aren't only for typing. But it would be nice if a type checker could see whatever "non-type" is recommended for dataclasses as "type not specified". Does an ellipsis spell that? or None? or anything that doesn't have to be imported from typing :-) As for problems with order, if we were to accept
wouldn't that make the "ham: bool" legal -- i.e. no default? -CHB
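For reference, the ordering constraint under discussion can be seen in the released module: with a positional __init__, a field without a default may not follow a field with one. The kw_only flag used in the second half was added later (Python 3.10) and is not part of this thread's proposal; it is shown only as one way the keyword-only idea eventually surfaced, and it is guarded by a version check.

```python
import sys
from dataclasses import dataclass

try:
    @dataclass
    class Spam:
        beans: bool = True
        ham: bool          # no default, but follows a field that has one
    ordering_error = None
except TypeError as exc:
    ordering_error = exc

print(ordering_error)

if sys.version_info >= (3, 10):
    # With keyword-only fields, positional order no longer matters.
    @dataclass(kw_only=True)
    class KwSpam:
        beans: bool = True
        ham: bool

    print(KwSpam(ham=False))
```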

(subject for this sub-thread updated) On Thu, Dec 21, 2017 at 4:08 PM Chris Barker <chris.barker@noaa.gov> wrote:
indeed, they may not. though if that is the definition, is it reasonable to say that type analyzers recognize the potential recursive meaning when the _default_ is ... and treat that as Any? another option that crossed my mind was "a: 10" without using =. but that really abuses __annotations__ by sticking the default value in there which the @dataclass decorator would presumably immediately need to undo and fix up before returning the class. but I don't find assigning a value without an = sign to be pythonic so please let's not do that! :)
Yeah, that is true. int vs float vs Number, etc. It suggests we shouldn't worry about this problem at all for the pep 557 dataclasses implementation. Type analyzers by that definition are going to need to deal with incorrect annotations in data classes as a result no matter what so they'll deal with that regardless of how we say dataclasses should work. -gps

On 2017-12-22 00:19, Gregory P. Smith wrote:
If you allowed "a: 10" (an int value), then you might also allow "a: 'foo'" (a string value), but wouldn't that be interpreted as a type called "foo"? If you can't have a string value, then you shouldn't have an int value either. [snip]

On 12/21/2017 7:55 PM, MRAB wrote:
As far as dataclasses are concerned, both of these are allowed, and since neither is ClassVar or InitVar, they're ignored. Type checkers would object to the int, and I assume also the string unless there was a type foo defined. See https://www.python.org/dev/peps/pep-0484/#the-problem-of-forward-declaration... and typing.get_type_hints(). It's a bug that dataclasses currently does not inspect string annotations to see if they're actually ClassVar or InitVar declarations. PEP 563 makes it critical (and not merely important) to look at the string annotations. Whether or not that involves typing.get_type_hints(), I haven't yet decided. I'm waiting for PEPs 563 and 560 to be implemented before taking another look at it. Eric.
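A short sketch of the behaviour described above, run against the released dataclasses module: arbitrary annotation values are stored without being checked, while ClassVar and InitVar are the two annotations the decorator treats specially. The class and field names are made up for illustration.

```python
from dataclasses import InitVar, dataclass, fields
from typing import ClassVar

@dataclass
class C:
    a: 10 = 0                 # not a type; dataclasses stores it without checking
    b: 'foo' = ''             # a string; also stored as-is
    count: ClassVar[int] = 0  # recognized: a class variable, not a field
    scale: InitVar[int] = 1   # recognized: passed to __post_init__, not stored as a field

    def __post_init__(self, scale):
        self.a *= scale

field_names = [f.name for f in fields(C)]
scaled = C(a=2, scale=3)

print(field_names)
print(scaled.a)
```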

On 21 December 2017 at 11:22, Terry Reedy <tjreedy@udel.edu> wrote:
For me, the three options for "don't care" have a bit different meaning:

* typing.Any: this class is supposed to be used with static type checkers, but this field is too dynamic
* ... (ellipsis): this class may or may not be used with static type checkers, use the inferred type in the latter case
* "field docstring": this class should not be used with static type checkers

Assuming this, the second option would be the "real" "don't care". If this makes sense, then we can go the way proposed in https://github.com/python/typing/issues/276 and make ellipsis semantics "official" in PEP 484. (pending Guido's approval) -- Ivan
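The three spellings can be written side by side. This sketch only shows that the dataclass decorator accepts all of them at runtime; how a static type checker treats the ellipsis or the string is exactly the open question in this thread and is not settled by the code. The class and field names are hypothetical.

```python
from dataclasses import dataclass, fields
from typing import Any

@dataclass
class Options:
    dynamic: Any = None                     # typing.Any
    inferred: ... = 0                       # ellipsis
    unchecked: "any descriptive text" = ""  # a string used like a field docstring

# dataclasses records whatever object was used as the annotation.
field_types = {f.name: f.type for f in fields(Options)}
print(field_types)
```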

On Thu, Dec 21, 2017, 03:37 Ivan Levkivskyi, <levkivskyi@gmail.com> wrote:
I vote for option 2 as well. I think it's worth reminding people that if they don't like the fact dataclasses (ab)use type hints for their succinct syntax that you can always use attrs instead to avoid type hints. Otherwise whichever approach we agree to from Ivan's suggestions will take care of this. As for those who feel dataclasses will force them to teach type hints and they simply don't want to, maybe we could help land protocols and then maybe you can use dataclasses as an opportunity to explicitly teach duck typing? But I think the key point I want to make is Guido chose dataclasses to support using the type hints syntax specifically over how attrs does things, so I don't see this thread trying to work around that going anywhere at this point since I haven't seen a solid alternative be proposed after all of this debating. -brett

On Fri, Dec 22, 2017 at 8:49 AM, Brett Cannon <brett@python.org> wrote:
sure -- but this doesn't really address the issue, the whole reason this is even a discussion is because dataclasses is going into the standard library. Third party packages can do whatever they want, of course. And the concern is that people (in particular newbies) will get confused / the wrong impression / other-negative-response by the (semi) use of typing in a standard library module.
As for those who feel dataclasses will force them to teach type hints and they simply don't want to, maybe we could help land protocols
Could you please clarify what this is about ???
And the PEP has been approved. So the actionable things are:

- Writing good docs
- Converging on a "recommended" way to do non-typed dataclass fields.

And that should be decided in order to write the docs, (and probably should be in the PEP). -CHB

On Fri, Dec 22, 2017 at 11:40 AM Chris Barker <chris.barker@noaa.gov> wrote:
On Fri, Dec 22, 2017 at 8:49 AM, Brett Cannon <brett@python.org>
But I think the key point I want to make is Guido chose dataclasses to
My preference for this is "just use Any" for anyone not concerned about the type. But if we wanted to make it more opaque so that people need not realize that they are actually type annotations, I suggest adding an alias for Any in the dataclasses module (dataclasses.Data = typing.Any)

    from dataclasses import dataclass, Data

    @dataclass
    class Swallow:
        weight_in_oz: Data = 5
        laden: Data = False
        species: Data = SwallowSpecies.AFRICAN

the word "Data" is friendlier than "Any" in this context for people who don't need to care about the typing module. We could go further and have Data not be an alias for Any if desired (so that its repr wouldn't be confusing, not that anyone should be looking at its repr ever). -gps
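The proposal can be tried out without changing the standard library. In this sketch, Data is defined locally as an alias for typing.Any; the dataclasses module itself does not export such a name, and SwallowSpecies is a hypothetical enum filled in so the example runs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any

# Hypothetical local alias standing in for the proposed dataclasses.Data.
Data = Any

class SwallowSpecies(Enum):
    # Made-up members, only so the example is self-contained.
    AFRICAN = "african"
    EUROPEAN = "european"

@dataclass
class Swallow:
    weight_in_oz: Data = 5
    laden: Data = False
    species: Data = SwallowSpecies.AFRICAN

print(Swallow())
print(Swallow(weight_in_oz=4, species=SwallowSpecies.EUROPEAN))
```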

On 22 December 2017 at 19:50, Gregory P. Smith <greg@krypto.org> wrote:
That sounds like a nice simple proposal. +1 from me. Documentation can say that variables should be annotated with "Data" to be recognised by the decorator, and if people are using type annotations an actual type can be used in place of "Data" (which acts the same as typing.Any). That seems to me to describe the feature in a suitably type-hinting-neutral way, while still making it clear how data classes interact with type annotations. Paul

On 23 Dec. 2017 9:37 am, "David Mertz" <mertz@gnosis.cx> wrote: The name Data seems very intuitive to me without suggesting type declaration as Any does (but it can still be treated as a synonym by actual type checkers) Type checkers would also be free to interpret it as "infer the type from the default value", rather than necessarily treating it as Any. I still wonder about the "fields *must* be annotated" constraint though. I can understand a constraint that the style be *consistent* (i.e. all fields as annotations, or all fields as field instances), since that's needed to determine the field order, but I don't see the problem with the "no annotations" style otherwise. Cheers, Nick.

On Sat, Dec 23, 2017 at 5:54 PM, Nick Coghlan <ncoghlan@gmail.com> wrote:
IIUC, without annotations, there is no way to set a field with no default. And supporting both approaches violates "only one way to do it" in, I think, a confusing manner -- particularly if you can't mix and match them. Also, could using class attributes without annotations make a mess when subclassing? -- no I haven't thought that out yet. -CHB
Cheers, Nick.

On 12/26/17 1:49 PM, Chris Barker wrote:
I have not been following the design of dataclasses, and maybe I'm misunderstanding the state of the work. My impression is that attrs was a thing, and lots of people loved it, so we wanted something like it in the stdlib. Data Classes is that thing, but it is a new thing being designed from scratch. There are still big questions about how it should work, but it is already a part of 3.7. Often when people propose new things, we say, "Put it on PyPI first, and let's see how people like it." Why isn't that the path for Data Classes? Why are they already part of 3.7 when we have no practical experience with them? Wouldn't it be better to let the design mature with real experience? Especially since some of the questions being asked are about how it interrelates with another large new feature with little practical use yet (typing)? --Ned.

On Tue, 26 Dec 2017 at 21:00 Ned Batchelder <ned@nedbatchelder.com> wrote:
Yes.
I wouldn't characterize it as "big questions". For some people there's a question as to how to make them work without type hints, but otherwise how they function is settled.
The short answer: "Guido said so". :) The long answer (based on my understanding, which could be wrong :) : Guido liked the idea of an attrs-like thing in the stdlib, but not attrs itself as Guido was after a different API. Eric V. Smith volunteered to work on a solution, and so Guido, Hynek, and Eric got together and discussed things at PyCon US. A design was hashed out, Eric went away and implemented it, and that led to the current solution. The only thing left is some people don't like type hints and so they don't want a stdlib module that requires them to function (there's no issue with how they relate *to* type hints, just how to make dataclasses work *without* type hints). So right now we are trying to decide what should represent the "don't care" type hint.

Brett Cannon writes:
Recently a question has been raised about the decorator overriding methods defined in the class (especially __repr__). People feel that if the class defines a method, the decorator should not override it. The current API requires passing "repr=False" to the decorator.

On Fri, Dec 29, 2017 at 7:10 AM, Stephen J. Turnbull < turnbull.stephen.fw@u.tsukuba.ac.jp> wrote:
I think this is a reasonable question, though I'm not sure how "big" it is. Note that if the *base* class defines __repr__ the decorator should still override it (unless repr=False), since there's always object.__repr__ (and same for most other dunders). We should also (like we did with most questions big and small) look at what attrs does and why. Regarding whether this should live on PyPI first, in this case that would not be helpful, since attrs is already the category killer on PyPI. So we are IMO taking the best course possible given that we want something in the stdlib but not exactly attrs. -- --Guido van Rossum (python.org/~guido)
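The rule sketched in this reply matches how the released module behaves, as far as it can be checked by running it: a __repr__ defined in the decorated class itself is left alone, while a __repr__ inherited from a base class is replaced by the generated one. The class names below are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Own:
    x: int

    def __repr__(self):
        # Defined in the decorated class itself, so the decorator keeps it.
        return "Own<custom>"

class PlainBase:
    def __repr__(self):
        return "PlainBase<custom>"

@dataclass
class Child(PlainBase):
    # No __repr__ in this class body, so a generated one replaces the inherited one.
    x: int

own_repr = repr(Own(1))
child_repr = repr(Child(1))
print(own_repr)
print(child_repr)
```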

On 12/29/17 1:59 PM, Guido van Rossum wrote:
It always seemed to me that the reason to recommend putting something on PyPI first wasn't so that it would climb up some kind of leaderboard, but so that people could get real-world experience with it before freezing it into the stdlib. If we think people won't start using data classes from PyPI, why do we think it's important to get into the stdlib? It still seems to me like there are open questions about how data classes should work. Getting people using it will be a good way to get the best design before our hands are tied with backward compatibility in the stdlib. What is the rush to put a new design into the stdlib? Presumably it is better than attrs (or we would have simply adopted attrs). Having data classes on PyPI will be a good way to gauge acceptance. --Ned.

On 30 December 2017 at 11:48, Ned Batchelder <ned@nedbatchelder.com> wrote:
attrs has already proved the utility of the approach, and the differences between the two (such as they are) are mostly cosmetic (attrs even already has a release out that supports the annotation based syntax). The cosmetic differences matter for educational purposes (i.e. "data classes" with "fields", vs trying to explain that "attributes", "attrs", "attr.s", and "attr.ib" are all different things), but "available by default" matters even more on that front. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

On 22 December 2017 at 20:55, Brett Cannon <brett@python.org> wrote:
If anyone is curious this is PEP 544. It is actually already fully supported by mypy, so that one can play with it (you will need to also install typing_extensions, where Protocol class lives until the PEP is approved). -- Ivan

On Thu, Dec 21, 2017 at 6:39 AM Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
I am a little nervous about using "..." for inferred types, because it could potentially cause confusion with other uses of ellipsis in typing. Ellipsis already has a special meaning for Tuple, so an annotation like MyClass[int, ...] could mean either a tuple subclass with integer elements or a two argument generic type where the second type is inferred. Actually, it's ambiguous even for Tuple. Ellipsis could also make a lot of sense for typing multi-dimensional arrays similar to how it's used in indexing to denote "any number of dimensions." Again, the semantics for "..." might differ from "an inferred size."
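For readers unfamiliar with the existing meaning mentioned above, here is a brief sketch: in a Tuple annotation the ellipsis already means "any number of elements of this type", which is unrelated to inference. typing.get_args requires Python 3.8 or later.

```python
from typing import Tuple, get_args

# A variable-length tuple of ints; here "..." means "repeated", not "inferred".
IntTuple = Tuple[int, ...]

ellipsis_args = get_args(IntTuple)
print(ellipsis_args)
```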

On Fri, Dec 22, 2017 at 10:10 AM, Stephan Hoyer <shoyer@gmail.com> wrote:
On Thu, Dec 21, 2017 at 6:39 AM Ivan Levkivskyi <levkivskyi@gmail.com> wrote:
* ... (ellipsis): this class may or may not be used with static type
checkers, use the inferred type in the latter case
Isn't that what "make ellipsis semantics "official"" means -- i.e. making it clear how they are used in typing? The core problem is that generic annotations are used in dataclasses without the "type hints" use-case. But:

1) Python is moving to make (PEP 484) type hints be THE recommended usage for annotations
2) We want the annotations in dataclasses to be "proper" PEP 484 type hints if they are there.

The challenge is:

- Annotations require a value.
- Any value used might be interpreted by a static type checker.

So we need a way to spell "no type specified" that will not be mis-interpreted by type checkers, and is in the built in namespace, and will seem natural to users with no knowledge or interest in static typing. The ellipsis is tempting, because it's a literal that doesn't have any other obvious meaning in this context. But if it has an incompatible meaning in PEP 484, then we're stuck. Is there another obscure literal that would work?

- I assume None means "the None type" to type checkers, yes?
- empty string is one option -- or more to the point, any string -- so then it could be used as docs as well.
- another obscure literal (or a not so obscure one that doesn't have another meaning to type checkers)

Would it be crazy to bring typing.Any into the builtin namespace?

    @dataclass
    class C:
        a: Any
        b: Any = 34
        c: int = 0

That reads pretty well to me.... And having Any available in the built in namespace may help in other cases where type hints are getting introduced into code that isn't really being properly type checked. I don't LOVE it -- to me, Any means "any type will do", or "I don't care what type this is" and what we really want is "no type specified" -- i.e. the same thing as plain old Python code without type hints. But practically speaking, it has the same effect, yes? -CHB

On 2017-12-22 12:15, Chris Barker wrote:
There is already an "any" function in the builtins. It looks fine but not sure how it will interact with type checkers. The "dataclasses.Data" idea mentioned in a sibling thread is a good alternative, though just wordy enough to make ... a shortcut. -Mike

On Fri, Dec 22, 2017 at 1:18 PM, MRAB <python@mrabarnett.plus.com> wrote:
The function is "any", the type is "Any", and "any" != "Any", although I wonder how many people will be caught out by that...
enough that it's a bad idea.... oh well. -CHB

On 12/21/2017 6:36 AM, Ivan Levkivskyi wrote:
In https://github.com/ericvsmith/dataclasses/issues/2#issuecomment-353918024, Guido has suggested using `object`, which has the benefit of not needing an import. And to me, it communicates the "don't care" aspect well enough. I do understand the difference if you're using a type checker (see for example https://stackoverflow.com/questions/39817081/typing-any-vs-object), but if you care about that, use typing.Any. Eric.
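A minimal sketch of the `object` spelling suggested above. It needs nothing beyond the dataclass import, and at runtime any value is accepted; the class name Point is only an example.

```python
from dataclasses import dataclass

@dataclass
class Point:
    x: object        # "don't care" spelled with a builtin, no typing import needed
    y: object = 0

p = Point("left", 2.5)
print(p)
```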

On Mon, Dec 18, 2017 at 11:49 PM, Eric V. Smith <eric@trueblade.com> wrote:
Sure -- but that's documentation of type annotations -- someone uninterested in typing, or completely unaware of it, will not be reading those docs.
Data Classes is also not the first use of type annotations in the stdlib: https://docs.python.org/3/library/typing.html#typing.NamedTuple
That's in the typing package, yes? collections.namedtuple is unchanged. So yes, obviously the entire typing package is about typing. This is something that has nothing to do with typing, but does use the typing syntax. It really is different. I haven't started teaching typing to newbies yet -- but I imagine I will have to some day -- and when I do, it will be in the context of: here is an optional feature that you can use along with a static type checker. And I can make it clear that the annotations only apply to the static type checker, and not run-time behavior. But using type annotations for something other than providing information to a static type checker, in a stdlib module, changes that introduction. And people don't read all the docs -- they read to the first example of how to use it, and away they go. And if that example is something like:

    @dataclass
    class C:
        a: int
        b: float = 0.0

There WILL be confusion.

Paul Moore wrote:
That suggests to me that the people involved in discussing the PEP may not be representative of the bulk of Python users. There are a number of us that are uncomfortable with static typing in general, and the python-dev community has been criticised for doing too much, moving too fast, and complicating the language unnecessarily. The PEP's been accepted, so let's move forward, but please be aware of these issues with the documentation and examples. I'll try to contribute to that discussion as well. -CHB
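One concrete form the confusion can take: a reader may assume the annotations in such an example are enforced. A quick sketch shows that the dataclass decorator does no runtime type checking, so values of any type are accepted.

```python
from dataclasses import dataclass

@dataclass
class C:
    a: int
    b: float = 0.0

# Neither value matches its annotation, yet no error is raised at runtime.
c = C(a="not an int", b=None)
print(c)
```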

On 19 Dec. 2017 7:00 am, "Chris Barker" <chris.barker@noaa.gov> wrote:
Are there other options?? plain old:

    @dataclass
    class C:
        a = 1
        b = 1.0

would work, though then there would be no way to express fields without defaults:

The PEP already supports using "a = field(); b = field()" (etc) to declare untyped fields without a default value. This annotation free spelling may not be clearly covered in the current module docs, though. Cheers, Nick.

On 18 December 2017 at 20:38, Nick Coghlan <ncoghlan@gmail.com> wrote:
The PEP is not 100% clear on this, but it is currently not the case and this may be intentional (one obvious way to do it), I just tried and this does not work:

    @dataclass
    class C:
        x = field()

generates `__init__` etc. with no arguments. I think however that it is better to generate an error than silently ignore it. (Or if this is a bug in the implementation, it should be just fixed.) -- Ivan

On 12/18/2017 2:55 PM, Ivan Levkivskyi wrote:
Hmm, not sure why that doesn't generate an error. I think it's a bug that should be fixed. Or, we could make the same change we're making in make_dataclass(), where we'll use "typing.Any" (as a string) if the type is omitted. See https://bugs.python.org/issue32278.
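For comparison with the behaviour reported above, this is what the released module does when run today: an un-annotated plain attribute is simply not a field, and an un-annotated field() is rejected with a TypeError rather than silently ignored. This is an observation of released behaviour, not a statement about the implementation at the time of this message.

```python
from dataclasses import dataclass, field, fields

@dataclass
class Plain:
    x = 0   # no annotation: an ordinary class attribute, not a dataclass field

plain_fields = fields(Plain)

try:
    @dataclass
    class C:
        x = field()   # a field() with no annotation
    field_error = None
except TypeError as exc:
    field_error = exc

print(plain_fields)
print(field_error)
```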

On 11 Dec. 2017 12:26 pm, "Eric V. Smith" <eric@trueblade.com> wrote:
I see a couple of options:

1a: Use a default type annotation, if one is not supplied. typing.Any would presumably make the most sense.
1b: Use None if no type is supplied.
2: Rework the code to not require annotations at all.

1c: annotate with the string "typing.Any" (this may require a tweak to the rules for evaluating lazy annotations, though) Cheers, Nick.

On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
This is bpo-32278.
This is bpo-32279.
2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)
No time for this one yet. Soon! Eric.

On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
I've checked this in under bpo-32278.
And I've checked this in under bpo-32279.
2) Change the default value for "hash" from "None" to "False". This might take a little effort because there is currently an oddity where setting hash=False causes it to be hashable. I'm pretty sure this wasn't intended ;-)
I haven't looked at this yet. Eric.
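With those two changes in place, the un-typed spelling requested at the start of the thread works in the released module. A short sketch follows; how the omitted type is recorded internally differs between Python versions, so the example does not rely on it.

```python
from dataclasses import fields, make_dataclass

# Field names only, no types; keyword arguments such as order= are passed through.
Point = make_dataclass('Point', ['x', 'y', 'z'], order=True)

p = Point(1, 2, 3)
point_field_names = [f.name for f in fields(Point)]

print(p)
print(point_field_names)
print(Point(1, 2, 3) < Point(1, 2, 4))
```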

On 1/6/2018 5:13 PM, Eric V. Smith wrote:
On 12/10/2017 5:00 PM, Raymond Hettinger wrote:
...
I think the hashing logic explained in https://bugs.python.org/issue32513#msg310830 is correct. It uses hash=None as the default, so that frozen=True objects are hashable, which they would not be if hash=False were the default. If there's some case there that you disagree with, I'd be interested in hearing about it. That logic is what is currently scheduled to go in to 3.7 beta 1. I have not updated the PEP yet, mostly because it's so difficult to explain. What's the case where setting hash=False causes it to be hashable? I don't think that was ever the case, and I hope it's not the case now. Eric

Wouldn't it be simpler to make the options orthogonal? Frozen need not imply hashable. I would think if a user wants frozen and hashable, they could just write frozen=True and hashable=True. That would be more explicit and clear than just having frozen=True imply that hashability gets turned-on implicitly whether you want it or not.
If there's some case there that you disagree with, I'd be interested in hearing about it.
That logic is what is currently scheduled to go in to 3.7 beta 1. I have not updated the PEP yet, mostly because it's so difficult to explain.
That might be a strong hint that this part of the API needs to be simplified :-) "If the implementation is hard to explain, it's a bad idea." -- Zen If for some reason, dataclasses really do need tri-state logic, it may be better off with enum values (NOT_HASHABLE, VALUE_HASHABLE, IDENTITY_HASHABLE, HASHABLE_IF_FROZEN or some such) rather than with None, True, and False which don't communicate enough information to understand what the decorator is doing.
What's the case where setting hash=False causes it to be hashable? I don't think that was ever the case, and I hope it's not the case now.
Python 3.7.0a4+ (heads/master:631fd38dbf, Jan 28 2018, 16:20:11) [GCC 7.2.0] on darwin Type "copyright", "credits" or "license()" for more information.
hash(A(1)) 285969507
I'm hoping that this part of the API gets thought through before it gets set in stone. Since dataclasses code never got a chance to live in the wild (on PyPI or some such), it behooves us to think through all the usability issues. To me at least, the tri-state hashability was entirely unexpected and hard to debug -- I had to do a close reading of the source to figure-out what was happening. Raymond

I think this is a good candidate for fine-tuning during the beta period. Though honestly Python's own rules for when a class is hashable or not are the root cause for the complexity here -- since we decided to implicitly set __hash__ = None when you define __eq__, it's hardly surprising that dataclasses are having a hard time making natural rules. On Sun, Jan 28, 2018 at 5:07 PM, Raymond Hettinger < raymond.hettinger@gmail.com> wrote:
-- --Guido van Rossum (python.org/~guido)
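The language rule referred to above can be seen without dataclasses at all. A minimal sketch with made-up class names: defining __eq__ in a class body without also defining __hash__ makes Python set __hash__ to None for that class.

```python
class NoEq:
    pass

class WithEq:
    def __eq__(self, other):
        return isinstance(other, WithEq)

# NoEq inherits identity-based hashing from object.
print(NoEq.__hash__ is object.__hash__)

# WithEq had __hash__ implicitly set to None, so its instances are unhashable.
print(WithEq.__hash__)

try:
    hash(WithEq())
    hash_error = None
except TypeError as exc:
    hash_error = exc
```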

On 29 January 2018 at 12:08, Guido van Rossum <guido@python.org> wrote:
In Raymond's example, the problem is the opposite: data classes are currently interpreting "hash=False" as "Don't add a __hash__ implementation" rather than "Make this unhashable". That interpretation isn't equivalent due to object.__hash__ existing by default. (Reviewing Eric's table again, I believe this problem still exists in the 3.7b1 variant as well - I just missed it the first time I read that)

I'd say the major argument in favour of Raymond's suggestion (i.e. always requiring an explicit "hash=True" in the dataclass decorator call if you want the result to be hashable) is that even if we *do* come up with a completely consistent derivation rule that the decorator can follow, most *readers* aren't going to know that rule. It would become a Python gotcha question for tech interviews:

=============
Which of the following class definitions are hashable and what is their hash based on?:

    @dataclass
    class A:
        field: int

    @dataclass(eq=False)
    class B:
        field: int

    @dataclass(frozen=True)
    class C:
        field: int

    @dataclass(eq=False, frozen=True)
    class D:
        field: int

    @dataclass(eq=True, frozen=True)
    class E:
        field: int

    @dataclass(hash=True)
    class F:
        field: int

    @dataclass(frozen=True, hash=True)
    class G:
        field: int

    @dataclass(eq=True, frozen=True, hash=True)
    class H:
        field: int

=============

Currently the answers are:

- A: not hashable
- B: hashable (by identity) # Wat?
- C: hashable (by field hash)
- D: hashable (by identity) # Wat?
- E: hashable (by field hash)
- F: hashable (by field hash)
- G: hashable (by field hash)
- H: hashable (by field hash)

If we instead make the default "hash=False" (and interpret that as meaning "Inject __hash__=None"), then you end up with the following much simpler outcome that can be mapped directly to the decorator "hash" parameter:

- A: not hashable
- B: not hashable
- C: not hashable
- D: not hashable
- E: not hashable
- F: hashable (by field hash)
- G: hashable (by field hash)
- H: hashable (by field hash)

Inheritance of __hash__ could then be made explicitly opt-in by way of a "dataclasses.INHERIT" constant.

Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
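The quiz can be checked mechanically. One caveat: in the released module the decorator parameter is spelled unsafe_hash rather than hash, so that spelling is used below; the describe helper is hypothetical and classifies a class only by looking at its __hash__ attribute. Run against a released Python, the results match the "currently" list above.

```python
from dataclasses import dataclass

def describe(cls):
    """Classify how instances of cls are hashed (illustrative helper)."""
    if cls.__hash__ is None:
        return "not hashable"
    if cls.__hash__ is object.__hash__:
        return "hashable (by identity)"
    return "hashable (by field hash)"

@dataclass
class A:
    field: int

@dataclass(eq=False)
class B:
    field: int

@dataclass(frozen=True)
class C:
    field: int

@dataclass(eq=False, frozen=True)
class D:
    field: int

@dataclass(eq=True, frozen=True)
class E:
    field: int

@dataclass(unsafe_hash=True)   # spelled hash=True in the thread
class F:
    field: int

@dataclass(frozen=True, unsafe_hash=True)
class G:
    field: int

@dataclass(eq=True, frozen=True, unsafe_hash=True)
class H:
    field: int

results = {cls.__name__: describe(cls) for cls in (A, B, C, D, E, F, G, H)}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```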

On 1/29/2018 1:55 AM, Yury Selivanov wrote:
I agree it's complicated. I think it would be a bad design to have to opt-in to hashability if using frozen=True. The point of hash=None (the default) is to try and get the simple cases right with the simplest possible interface. It's the intersection of "have simple defaults, but ways to override them" with "if the user provides some dunder methods, don't make them specify feature=False in order to use them" that complicated things. For example, maybe we no longer need eq=False now that specifying a __eq__ turns off dataclasses's __eq__ generation. Does dataclasses really need a way of using object identity for equality? Eric.
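The two behaviours contrasted above can be sketched against the released module: a user-written __eq__ is kept as-is, and eq=False leaves the class with object's identity-based comparison. The class names are made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class CaseInsensitive:
    name: str

    def __eq__(self, other):
        # User-supplied comparison; the decorator does not replace it.
        return self.name.lower() == other.name.lower()

@dataclass(eq=False)
class ByIdentity:
    name: str   # no __eq__ generated, so comparison falls back to identity

same_by_custom_eq = CaseInsensitive("Ada") == CaseInsensitive("ada")

first = ByIdentity("ada")
second = ByIdentity("ada")
same_by_identity = first == second

print(same_by_custom_eq)
print(same_by_identity)
```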

On Jan 28, 2018, at 11:52 PM, Eric V. Smith <eric@trueblade.com> wrote:
I think it would be a bad design to have to opt-in to hashability if using frozen=True.
I respect that you see it that way, but it doesn't make sense to me. You can have either one without the other. It seems to me that it is clearer and more explicit to just say what you want rather than having implicit logic guess at what you meant. Otherwise, when something goes wrong, it is difficult to debug. The tooltips for the dataclass decorator are essentially a checklist of features that can be turned on or off. That list of features is mostly easy-to-use except for hash=None which has three possible values, only one of which is self-evident. We haven't had much in the way of user testing, so it is a significant data point that one of your first users (me) was confounded by this API. I recommend putting various correct and incorrect examples in front of other users (preferably experienced Python programmers) and asking them to predict what the code does based on the source code. Raymond

On 1/29/2018 4:01 AM, Raymond Hettinger wrote:
I certainly respect your insights.
The tooltips for the dataclass decorator are essentially a checklist of features that can be turned on or off. That list of features is mostly easy-to-use except for hash=None which has three possible values, only one of which is self-evident.
Which is the one that's self-evident? I would think hash=False, correct? The problem is that for repr=, eq=, compare=, you're saying "do or don't add this/these methods, or if true, don't even add it if it's already defined". The same is true for hash=True/False, with the complication of the implicit __hash__ that's added by __eq__. In addition to "do or don't add __hash__", there needs to be a way of setting __hash__=None. The processing of hash=None is trying to guess what sort of __hash__ you want: not set it and just inherit it, generate it based on fields, or set it to None. And if it guesses wrong, based on the fairly simple hash=None rules, you can control it with other values of hash=. Maybe that's the problem. I'm open to ways to express these options. Again, I think losing "do the right thing most of the time without explicitly setting hash=" would be a shame, but not the end of the world. And changing it to "hashable=" isn't quite as simple as it seems, since there's more than one definition of hashable: identity-based or field-based.
We haven't had much in the way of user testing, so it is a significant data point that one of your first users (me) was confounded by this API. I recommend putting various correct and incorrect examples in front of other users (preferably experienced Python programmers) and asking them to predict what the code does based on the source code.
I agree it's sub-optimal, but it's a complex issue. What would the interface look like that allowed a programmer to know if an object was hashable based on object identity versus field values? Eric.
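The three outcomes described above (inherit __hash__, generate it from the fields, or set it to None) can be seen in a small sketch. One caveat: the module as released in Python 3.7 spells the explicit override `unsafe_hash=` rather than the `hash=` keyword discussed in this thread, so the sketch uses only `eq=` and `frozen=` and relies on the default rules. The class names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(eq=False)        # no generated __eq__, so __hash__ is left alone
class ByIdentity:
    x: int


@dataclass                  # eq=True, frozen=False: mutable and comparable
class Unhashable:
    x: int


@dataclass(frozen=True)     # eq=True, frozen=True: immutable and comparable
class ByFields:
    x: int


# Inherited from object, i.e. hashing by identity.
identity_based = ByIdentity.__hash__ is object.__hash__

# Set to None, i.e. instances cannot be hashed.
marked_unhashable = Unhashable.__hash__ is None

# Generated from the fields, so equal instances hash equal.
field_based = hash(ByFields(1)) == hash(ByFields(1))
```

All three flags come out True under the released rules, which is the "guess" the default is making on the user's behalf.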

I don't think we're going to reach full agreement here, so I'm going to put my weight behind Eric's rules. I think the benefit of the complicated rules is that they almost always do what you want, so you almost never have to think about it. If it doesn't do what you want, setting hash=False or hash=True is much quicker than trying to understand the rules. But the rules *are* deterministic and reasonable. -- --Guido van Rossum (python.org/~guido)

On 1/29/2018 3:42 AM, Ethan Furman wrote:
It means "don't add a __hash__ attribute, and rely on the base class value". But maybe it should mean "is not hashable". But in that case, how would we specify the "don't add __hash__" case? Note that "repr=False" means "don't add a __repr__", not "is not repr-able". And "init=False" means "don't add an __init__", not "is not init-able". Eric.
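The "don't add it" reading of feature=False can be checked directly for repr=. This is a minimal sketch against the released dataclasses module; the class names `Quiet` and `Loud` are made up for the example.

```python
from dataclasses import dataclass


@dataclass(repr=False)
class Quiet:
    x: int


@dataclass
class Loud:
    x: int


# repr=False means no __repr__ is placed on the class itself...
quiet_has_own_repr = "__repr__" in Quiet.__dict__
loud_has_own_repr = "__repr__" in Loud.__dict__

# ...but the class is still "repr-able" through the inherited object.__repr__.
quiet_text = repr(Quiet(1))
loud_text = repr(Loud(1))
```

`Quiet` falls back to the default `<... object at 0x...>` form, while `Loud` gets the generated field-based representation.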

On 01/29/2018 12:57 AM, Eric V. Smith wrote:
On 1/29/2018 3:42 AM, Ethan Furman wrote:
On 01/28/2018 07:45 AM, Eric V. Smith wrote:
I thought `hash=False` means don't add a __hash__ method.
Note that "repr=False" means "don't add a __repr__", not "is not repr-able". And "init=False" means "don't add an __init__", not "is not init-able".
Yeah, like that. I get that the default for all (or at least most) of the boring stuff should be "just do it", but I don't think None is the proper place-holder for that. Why not make a `_default = object()` sentinel and use that for the default? At least for __hash__. Then we have:

hash=False -> don't add one
hash=None -> add `__hash__ = None` (is not hashable)
hash=True -> add one (the default...

Okay, after writing that down, why don't we have the default value for anything automatically added be True? With True meaning the dataclass should have a custom whatever, and if the programmer did not provide one the decorator will -- it can even be a self-check: if the parameters in the decorator are at odds with the actual class contents (hash=None, but the class has a __hash__ method) then an exception could be raised. -- ~Ethan~
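The sentinel idea above can be sketched roughly as follows. This is only an illustration of the proposal, not how dataclasses works: `apply_hash_option`, its `field_names` argument, and the `hash_option` parameter name (used instead of `hash` to avoid shadowing the builtin) are all hypothetical.

```python
_default = object()  # sentinel meaning "the caller did not pass hash_option"


def apply_hash_option(cls, field_names, hash_option=_default):
    """Hypothetical helper; not part of the dataclasses module."""
    # A __hash__ written in the class body, if any.
    user_hash = cls.__dict__.get("__hash__")

    if hash_option is False:
        return cls  # don't add one; the base class value is inherited

    if hash_option is None:
        if user_hash is not None:
            # The self-check: the argument contradicts the class body.
            raise TypeError(
                f"{cls.__name__} defines __hash__ but hash_option=None"
            )
        cls.__hash__ = None  # mark the class as not hashable
        return cls

    # True or the sentinel: add a field-based hash unless one already exists.
    if user_hash is None:
        def __hash__(self):
            return hash(tuple(getattr(self, name) for name in field_names))
        cls.__hash__ = __hash__
    return cls


def new_point_class():
    # Returns a fresh plain class each time so the cases stay independent.
    class Point:
        def __init__(self, x, y):
            self.x, self.y = x, y
    return Point


Hashed = apply_hash_option(new_point_class(), ("x", "y"))
Unhashable = apply_hash_option(new_point_class(), ("x", "y"), hash_option=None)
Inherited = apply_hash_option(new_point_class(), ("x", "y"), hash_option=False)
```

The sentinel lets the helper tell "nothing was passed" apart from an explicit None, which is the distinction the proposal is after.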
participants (26)
- Antoine Pitrou
- Barry Warsaw
- Brett Cannon
- Chris Barker
- Chris Barker - NOAA Federal
- David Mertz
- Eric V. Smith
- Ethan Furman
- Greg Ewing
- Gregory P. Smith
- Guido van Rossum
- Ivan Levkivskyi
- Julien Salort
- Mike Miller
- MRAB
- Ned Batchelder
- Nick Coghlan
- Paul Moore
- Raymond Hettinger
- Rob Cliffe
- Stephan Hoyer
- Stephen J. Turnbull
- Steve Holden
- Sven R. Kunze
- Terry Reedy
- Yury Selivanov