[Python-Dev] Second post: PEP 557, Data Classes

Eric V. Smith eric at trueblade.com
Mon Nov 27 07:23:48 EST 2017


On 11/27/2017 6:01 AM, Sebastian Rittau wrote:
> On 25.11.2017 22:06, Eric V. Smith wrote:

>> The major changes from the previous version are:
>>
>> - Add InitVar to specify initialize-only fields. 
> 
> This is the only feature that does not sit right with me. It looks very 
> obscure and "hacky". From what I understand, we are supposed to use the 
> field syntax to define constructor arguments. I'd argue that the name 
> "initialize-only fields" is a misnomer, which only hides the fact that 
> this has nothing to do with fields at all. Couldn't dataclasses just 
> pass *args and **kwargs to __post_init__()? Type checkers need to be 
> special-cased for InitVar anyway, couldn't they instead be special-cased 
> to look at __post_init__ argument types?

First off, I expect this feature to be used extremely rarely. I'm 
tempted to remove it, since it's infrequently needed and it could be 
added later. And as the PEP points out, you can get most of the way with 
an alternate classmethod constructor.
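As a rough sketch of that classmethod alternative (DatabaseType, its 
lookup() method, and the from_database name are hypothetical stand-ins 
for illustration, not anything taken from the PEP):

```python
from dataclasses import dataclass


class DatabaseType:
    """Hypothetical stand-in for a real database handle."""

    def lookup(self, key):
        # Pretend every key maps to 42.
        return 42


@dataclass
class C:
    x: int
    y: int

    @classmethod
    def from_database(cls, x: int, database: DatabaseType) -> "C":
        # The database is only consulted while building the instance;
        # it never becomes a field and never reaches __init__.
        return cls(x, database.lookup("y"))


c = C.from_database(1, DatabaseType())
```

The generated __init__ stays (x, y) here, so nothing init-only has to 
appear in the field list at all.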

I had something like your suggestion half coded up, except I inspected 
the args to __post_init__() and added them to __init__, avoiding the 
API-unfriendly *args and **kwargs.

So given:

@dataclass
class C:
     x: int
     y: int

     def __post_init__(self, database: DatabaseType): pass

the __init__ signature became:

def __init__(self, x: int, y: int, database: DatabaseType): ...

In the end, that seems like a lot of magic (but what about this isn't?), 
it required the inspect module to be imported, and I thought it made 
more sense for all of the init params to be near each other:

@dataclass
class C:
     x: int
     y: int
     database: InitVar[DatabaseType]

     def __post_init__(self, database): pass
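
A runnable version of the InitVar form, assuming the dataclasses module 
as it later shipped in Python 3.7 (DatabaseType and its lookup() method 
are hypothetical placeholders), might look like:

```python
from dataclasses import dataclass, fields, InitVar


class DatabaseType:
    """Hypothetical stand-in for a real database handle."""

    def lookup(self, key):
        return 10


@dataclass
class C:
    x: int
    y: int
    database: InitVar[DatabaseType]

    def __post_init__(self, database):
        # database is forwarded from __init__ but is not stored
        # on the instance; it only influences y here.
        self.y += database.lookup("offset")


c = C(1, 2, DatabaseType())

# Only the real fields are reported; the InitVar is not one of them.
field_names = [f.name for f in fields(C)]
```

The init-only parameter is accepted by __init__ and handed to 
__post_init__, but it does not show up in fields() or as an instance 
attribute.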

No matter what we do here, static type checkers are going to have to be 
aware of either the InitVars or the hoisting of params from 
__post_init__ to __init__.

One other thing about InitVar: it lets you control where the init-only 
parameter goes in the __init__ signature. This is especially important 
with default values:

@dataclass
class C:
     x: int
     database: InitVar[DatabaseType]
     y: int = 0

     def __post_init__(self, database): pass

In this case, if I were hoisting params from __post_init__ to __init__, 
the __init__ definition would be:

def __init__(self, x, y=0, database)

Which is an error, because a parameter without a default can't follow 
one with a default. I guess you could say the init-only parameters would 
go first in the __init__ definition, but then you have the same problem 
if any of them have default values.
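
The ordering point can be checked with a small sketch, again assuming 
the dataclasses module as shipped in Python 3.7 and a hypothetical 
DatabaseType placeholder. It inspects the generated signature and then 
tries to compile the hoisted form:

```python
import inspect
from dataclasses import dataclass, InitVar


class DatabaseType:
    """Hypothetical placeholder type."""


@dataclass
class C:
    x: int
    database: InitVar[DatabaseType]
    y: int = 0

    def __post_init__(self, database):
        pass


# The InitVar keeps its declared position, ahead of the defaulted y.
param_names = list(inspect.signature(C).parameters)

# The hoisted form puts a non-default parameter after a defaulted one,
# which Python rejects when the def is compiled.
hoisted = "def __init__(self, x, y=0, database): pass"
try:
    compile(hoisted, "<hoisted>", "exec")
    hoisted_is_valid = True
except SyntaxError:
    hoisted_is_valid = False
```

With InitVar the parameters come out as x, database, y, while the 
hoisted definition fails to compile.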

Eric.

