[Python-Dev] PEP 557: Data Classes

Michel Desmoulin desmoulinmichel at gmail.com
Mon Sep 11 02:38:06 EDT 2017



Le 10/09/2017 à 18:36, Eric V. Smith a écrit :
> On 9/10/2017 10:00 AM, Michel Desmoulin wrote:
>> The reaction is overwhelmingly positive everywhere: hacker news, reddit,
>> twitter.
> 
> Do you have a pointer to the Hacker News discussion? I missed it.

Err... I may have been over enthusiastic and created the hacker news
thread in my mind.

> 
>> People have been expecting something like that for a long time.
> 
> Me, too!
> 
>> 3 questions:
>>
>> - is providing validation/conversion hooks completely out of the
>> question of still open for debate ? I know it's to keep the
>> implementation simple but have a few callbacks run in the __init__ in a
>> foo loop is not that much complexity. You don't have to provide
>> validators, but having a validators parameters on field() would be a
>> huge time saver. Just a list of callables called when the value is first
>> set, potentially raising an exception that you don't even need to
>> process in any way. It returns the value converted, and voilà. We all do
>> that every day manually.
> 
> I don't particularly want to add validation specifically. I want to make
> it possible to add validation yourself, or via a library.
> 
> What I think I'll do is add a metadata parameter to fields(), defaulting
> to None. Then you could write a post-init hook that does whatever
> single- and multi-field validations you want (or whatever else you want
> to do). Although this plays poorly with "frozen" classes: it's always
> something! I'll think about it.
> 
> To make this most useful, I need to get the post-init hook to take an
> optional parameter so you can get data to it. I don't have a good way to
> do this, yet. Suggestions welcomed.

It doesn't really allow you to do anything you couldn't do as easily as
in __init__.

Alternatively, you could have a "on_set" hooks for field(), that just
take the field value, and return the field value. By default, it's an
identity function and is always called (minus implementation optimizations):

from functools improt reduce
self.foo = reduce((lambda data, next: next(*data)), on_set_hooks,
(field, val))

And people can do whatever they want: default values, factories,
transformers, converters/casters, validation, logging...

> 
> Although if the post-init hook takes a param that you can pass in at
> object creation time, I guess there's really no need for a per-field
> metadata parameter: you could use the field name as a key to look up
> whatever you wanted to know about the field.
> 
>> - I read Guido talking about some base class as alternative to the
>> generator version, but don't see it in the PEP. Is it still considered ?
> 
> I'm going to put some words in explaining why I don't want to use base
> classes (I don't think it buys you anything). Do you have a reason for
> preferring base classes?

Not preferring, but having it as an alternative. Mainly for 2 reasons:

1 - data classes allow one to type in classes very quickly, let's
harvest the benefit from that.

Typing a decorator in a shell is much less comfortable than using
inheritance. Same thing about IDE: all current ones have snippet with
auto-switch to the class parents on tab.

All in all, if you are doing exploratory programming, and thus
disposable code, which data classes are fantastic for, inheritance will
keep you in the flow.

2 - it will help sell the data classes

I train a lot of people to Python each year. I never have to explain
classes to people with any kind of programming background. I _always_
have to explain decorators.

People are not used to it, and even kind fear it for quite some time.

Inheritance however, is familiar, and will not only push people to use
data classes more, but also will let them do less mistakes: they know
the danger of parent ordering, but not the ones of decorators ordering.

> 
>> - any chance it becomes a built in later ? When classes have been
>> improved in Python 2, the object built-in was added. Imagine if we had
>> had to import it every time... Or maybe just plug it to object like
>> @object.dataclass.
> 
> Because of the choice of using module-level functions so as to not
> introduce conflicts in the object's namespace, it would be difficult to
> make this a builtin.
> 
> Although now that I think about it, maybe what are currently
> module-level functions should instead be methods on the "dataclass"
> decorator itself:
> 
> @dataclass
> class C:
>   i: int = dataclass.field(default=1, init=False)
>   j: str
> 
> c = C('hello')
> 
> dataclass.asdict(c)
> {'i': 1, 'j': 'hello'}
> 
> Then, "dataclass" would be the only name the module exports, making it
> easier to someday be a builtin. I'm not sure it's important enough for
> this to be a builtin, but this would make it easier. Thoughts? I'm
> usually not a fan of having attributes on a function: it's why
> itertools.chain.from_iterable() is hard to find.
> 
> Eric.


More information about the Python-Dev mailing list