[Python-Dev] Second post: PEP 557, Data Classes

Mon Nov 27 11:00:27 EST 2017

On 11/27/17 10:51 AM, Guido van Rossum wrote:
> Following up on this subthread (inline below).
>
> On Mon, Nov 27, 2017 at 2:56 AM, Eric V. Smith <eric at trueblade.com> wrote:
>
>     On 11/27/2017 1:04 AM, Nick Coghlan wrote:
>
>         On 27 November 2017 at 15:04, Greg Ewing
>         <greg.ewing at canterbury.ac.nz> wrote:
>
>             Nick Coghlan wrote:
>
>
>                 Perhaps the check could be:
>
>                    (type(lhs) == type(rhs) or fields(lhs) == fields(rhs))
>                    and all (individual fields match)
>
>
>
>             I think the types should *always* have to match, or at least
>             one should be a subclass of the other. Consider:
>
>             @dataclass
>             class Point3d:
>                  x: float
>                  y: float
>                  z: float
>
>             @dataclass
>             class Vector3d:
>                  x: float
>                  y: float
>                  z: float
>
>             Points and vectors are different things, and they should never
>             compare equal, even if they have the same field names and
>             values.
>
>
>         And I guess if folks actually want more permissive structure-based
>         matching, that's one of the features that collections.namedtuple
>         offers that data classes don't.
>
>
>     And in this case you could also do:
>     astuple(point) == astuple(vector)
>
>
> Didn't we at one point have something like
>
> isinstance(other, self.__class__) and fields(other) == fields(self) and
> <all individual fields match>
>
> (plus some optimization if the types are identical)?
>
> That feels ideal, because it means you can subclass Point just to add
> some methods and it will stay comparable, but if you add fields it will
> always be unequal.

I don't think we had that before, but it sounds right to me. I think it 
could be:

isinstance(other, self.__class__) and len(fields(other)) == 
len(fields(self)) and <all individual fields match>

By definition, a subclass starts with all of the same fields as its base
class, so if the lengths match, no new fields have been added. That
should be sufficiently cheap.
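
A minimal sketch of that check, for illustration only and not the PEP 557 implementation. The helper name subclass_aware_eq and the example classes are hypothetical. Returning NotImplemented (rather than False) when the isinstance test fails is an assumption made here so that Python can still try the reflected comparison:

```python
from dataclasses import dataclass, fields


def subclass_aware_eq(self, other):
    # Hypothetical helper: isinstance check, then field-count check,
    # then a field-by-field comparison.
    if not isinstance(other, self.__class__):
        return NotImplemented  # let Python try other.__eq__(self)
    if len(fields(other)) != len(fields(self)):
        return False  # the subclass added fields
    return all(getattr(self, f.name) == getattr(other, f.name)
               for f in fields(self))


@dataclass(eq=False)  # eq=False: keep the hand-written __eq__ below
class Point:
    x: float
    y: float
    __eq__ = subclass_aware_eq


@dataclass(eq=False)
class PointWithNorm(Point):
    # Adds a method only, so it stays comparable with Point.
    def norm_squared(self):
        return self.x ** 2 + self.y ** 2


@dataclass(eq=False)
class Point3(Point):
    # Adds a field, so it never compares equal to a plain Point.
    z: float = 0.0
```

With these definitions, Point(1.0, 2.0) and PointWithNorm(1.0, 2.0) compare equal in either order, while Point(1.0, 2.0) and Point3(1.0, 2.0) do not.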

Then the optimized version would be:

((self.__class__ is other.__class__) or (isinstance(other,
self.__class__) and len(fields(other)) == len(fields(self)))) and <all
individual fields match>

I'd probably further optimize len(fields(obj)), but that's the general idea.
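
As a rough sketch of the optimized form, read so that the identity test only skips the isinstance and length checks and the fields are still compared in both cases. The names optimized_eq, field_count and fields_match are hypothetical, and caching the field count with functools.lru_cache is just one assumed way to make len(fields(obj)) cheaper:

```python
from dataclasses import dataclass, fields
from functools import lru_cache


@lru_cache(maxsize=None)
def field_count(cls):
    # Hypothetical helper: compute len(fields(cls)) once per class.
    return len(fields(cls))


def fields_match(a, b):
    # Compare every field of a against the same-named attribute of b.
    return all(getattr(a, f.name) == getattr(b, f.name) for f in fields(a))


def optimized_eq(self, other):
    # Fast path: identical classes skip the isinstance and length checks.
    if self.__class__ is not other.__class__:
        if not isinstance(other, self.__class__):
            return NotImplemented  # assumption: defer to the other operand
        if field_count(type(other)) != field_count(type(self)):
            return False  # the subclass added fields
    return fields_match(self, other)


@dataclass(eq=False)
class Point3d:
    x: float
    y: float
    z: float
    __eq__ = optimized_eq


@dataclass(eq=False)
class Vector3d:
    x: float
    y: float
    z: float
    __eq__ = optimized_eq


@dataclass(eq=False)
class LabeledPoint3d(Point3d):
    label: str = ""
```

Here Point3d(1, 2, 3) and Vector3d(1, 2, 3) are unrelated types and never compare equal, and LabeledPoint3d(1, 2, 3) is unequal to Point3d(1, 2, 3) because it adds a field.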

Eric.