Simple class initialization

0. Abstract
===========

A class initialization often begins with a long list of explicit attribute assignments in the __init__() method, repetitively copying arguments into instance data attributes. This article suggests some semi-automatic techniques to shorten and clarify this code section. Comments and responses are highly appreciated.

1. Credit
=========

The idea emerged from my question at stackoverflow.com. I would like to thank all those who answered and commented on that thread.

http://stackoverflow.com/questions/1389180

2. The problem
==============

Consider the following class:

    class Process:
        def __init__(self, pid, ppid, cmd, fd, reachable, user):

If the instance needs to hold these arguments internally, instance data attributes are created and the value of each argument is copied, using one of two mainstream notations:

a.
    self.pid=pid
    self.ppid=ppid
    self.cmd=cmd
    self._fd=fd
    self.reachable=reachable
    self.user=user

b.
    self.pid, self.ppid, self.cmd, self._fd, self.reachable, self.user = pid, ppid, cmd, fd, reachable, user

a. takes an unreasonable number of lines and has a repetitive form.
b. is long and prone to errors, especially when the argument list changes during development.

3. Solution outline
===================

1. Generally comply with the Zen of Python.
2. Explicit. The instance should not store any value unless told.
3. Short.
4. Backward compatible. Current class syntax should work with the new solution.
5. Readable and intuitive.
6. Flexible. There should be a way to store any given subset of the arguments.
7. Allow storage of "private" variables by adding a single or double underscore before the variable name.

4. Solutions
============

4.1 Decorator
-------------

Nadia Alramli suggested this in the aforementioned thread at stackoverflow:

    from functools import wraps
    import inspect

    def initializer(fun):
        # getfullargspec stands in for getargspec, which was removed in Python 3.11
        names, varargs, keywords, defaults = inspect.getfullargspec(fun)[:4]
        @wraps(fun)
        def wrapper(self, *args):
            for name, arg in zip(names[1:], args):
                setattr(self, name, arg)
            fun(self, *args)
        return wrapper

    class Process:
        @initializer
        def __init__(self, pid, ppid, cmd, fd, reachable, user):
            pass

Pros: Simple, short, explicit and intuitive. Easy to add to the standard library, fully backward-compatible.
Cons: Stores all arguments. Does not support private data attribute notation (underscore prefix).

See http://stackoverflow.com/questions/1389180/python-automatically-initialize-i...

4.2 Argument tagging
--------------------

Arguments that need to be stored within the instance could be marked with a special character, e.g. '~'. The character would be placed after the argument name for private variables:

    class Process:
        def __init__(self, ~pid, ~ppid, ~cmd, fd~, ~reachable, ~user):

Pros: Simple, short and explicit. Can store any subset of the arguments. Supports private variable notation.
Cons: Not intuitive. Changes the method signature and might be confusing.

4.3 Standard function
---------------------

A function will be called to store the arguments as data attributes.

    class Process:
        def __init__(self, pid, ppid, cmd, fd, reachable, user):
            acquire(pid, ppid, cmd, reachable, user)
            acquire(fd, prefix='_')

Possible keywords are acquire, store, absorb.

Pros: Explicit, clear and intuitive.
Cons: Long, especially if more than a single prefix is used.

4.4 Initialization list
-----------------------

The argument list would include the name of the instance data attribute, a separator, and the argument name.

    class Process:
        def __init__(self, pid:pid, ppid:ppid, cmd:cmd, _fd:fd, reachable:reachable, user:user):
            """ pid, ppid, cmd, reachable and user are stored as data
            attributes with the same name. fd is stored as _fd."""

Or:

    class Process:
        def __init__(self, :pid, :ppid, :cmd, _fd:fd, :reachable, :user):
            """Same, but data attributes with the same name as their
            arguments are stored without stating the name twice."""

This is a more developed form of argument tagging (4.2).

Pros: See 4.2.
Cons: Alters the method signature. Not intuitive.

Looking forward to comments,
Adam Matan
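
The two cons listed for 4.1 (it stores every argument and has no underscore-prefix support) can be worked around in a decorator without any new syntax. What follows is only a minimal sketch, not code from the Stack Overflow thread: the names `store_args`, `only` and `private` are hypothetical, and it assumes `inspect.signature` is available (Python 3.3 or later) to bind arguments the way a normal call would.

```python
import inspect
from functools import wraps


def store_args(only=None, private=()):
    """Hypothetical decorator factory: copy __init__ arguments onto self.

    only    -- names of the arguments to store (None means all of them)
    private -- names of the arguments to store with a leading underscore
    """
    def decorate(init):
        sig = inspect.signature(init)

        @wraps(init)
        def wrapper(self, *args, **kwargs):
            # Bind exactly as a normal call would, then fill in defaults.
            bound = sig.bind(self, *args, **kwargs)
            bound.apply_defaults()
            for name, value in list(bound.arguments.items())[1:]:  # skip self
                if only is not None and name not in only:
                    continue
                attr = '_' + name if name in private else name
                setattr(self, attr, value)
            return init(self, *args, **kwargs)
        return wrapper
    return decorate


class Process:
    @store_args(private=('fd',))
    def __init__(self, pid, ppid, cmd, fd, reachable=True, user=None):
        pass


p = Process(9, 101, 'cd /', 0, user='root')
print(p.pid, p._fd, p.reachable, p.user)
```

Because `@wraps` preserves the original function, `inspect.signature(Process.__init__)` still reports the real argument list, so the caller-facing signature is not hidden.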

I think that this solution damages the __init__() signature because the caller does not know which arguments should be passed. Furthermore, it is quite long, and does not allow introspection.

On Sat, Apr 16, 2011 at 3:08 PM, dag.odenhall@gmail.com <dag.odenhall@gmail.com> wrote:

dag.odenhall@gmail.com wrote:
[snippers]
I like the initialiser function myself:

8<-------------------------------------------------------------
def acquire(obj, kwargs):
    Missing = object()
    for kw, val in kwargs.items():
        name = '_'+kw
        attr = getattr(obj, name, Missing)
        if attr is Missing:
            name = kw
            attr = getattr(obj, name, Missing)
        if attr is not Missing:
            setattr(obj, name, val)

class Process:
    pid = None
    ppid = None
    cmd = None
    reachable = None
    user = None
    _fd = None
    def __init__(self, pid, ppid, cmd, fd, reachable, user):
        acquire(self, locals())
        print(self.pid)
        print(self.ppid)
        print(self.cmd)
        print(self.reachable)
        print(self.user)
        print(self._fd)

if __name__ == '__main__':
    p = Process(9, 101, 'cd /', 0, 'yes', 'root')
8<-------------------------------------------------------------

Don't think it needs to be in the stdlib, though.

~Ethan~

I like the initializer decorator. Here's a version that (1) handles keyword argument defaults, and (2) allows only a subset of arguments to be stored via decorator arguments:

    import inspect
    from functools import wraps

    def initializer(*selectedArgs):
        def wrap(fun):
            # getfullargspec stands in for getargspec, which was removed in Python 3.11
            names, varargs, varkwargs, defaults = inspect.getfullargspec(fun)[:4]
            defaults = defaults or ()  # None when no argument has a default
            @wraps(fun)
            def wrapper(self, *args, **kwargs):
                d = dict(zip(names[-len(defaults):],defaults))
                d.update(dict(zip(names[1:], args)))
                d.update(kwargs)
                for a in (selectedArgs if len(selectedArgs)>0 else d.keys()):
                    assert a in names,'Invalid parameter name: {}'.format(a)
                    assert a in d,'Missing required argument: {}'.format(a)
                    setattr(self, a, d[a])
                fun(self, *args, **kwargs)
            return wrapper
        return wrap

    class Process1:
        @initializer()
        def __init__(self, pid, ppid, cmd, fd, reachable=True, user=None):
            pass

    class Process2:
        @initializer('pid','ppid','user') # only store these 3; self.cmd will trigger an error
        def __init__(self, pid, ppid, cmd, fd, reachable=True, user=None):
            pass

Nathan

On Sat, Apr 16, 2011 at 12:25 PM, Ethan Furman <ethan@stoneleaf.us> wrote:

On 16 Apr 2011, at 12:50, Adam Matan wrote:
Following a discussion on c.l.python, I posted a recipe on ActiveState a while ago that attempted to deal with this issue:

http://code.activestate.com/recipes/551763-automatic-attribute-assignment/

From the docstring:

"""
autoassign(function) -> method
autoassign(*argnames) -> decorator
autoassign(exclude=argnames) -> decorator

allow a method to assign (some of) its arguments as attributes of
'self' automatically. E.g.

>>> class Foo(object):
...     @autoassign
...     def __init__(self, foo, bar): pass
...
>>> breakfast = Foo('spam', 'eggs')
>>> breakfast.foo, breakfast.bar
('spam', 'eggs')

To restrict autoassignment to 'bar' and 'baz', write:

    @autoassign('bar', 'baz')
    def method(self, foo, bar, baz): ...

To prevent 'foo' and 'baz' from being autoassigned, use:

    @autoassign(exclude=('foo', 'baz'))
    def method(self, foo, bar, baz): ...
"""

--
Arnaud

On Sat, Apr 16, 2011 at 4:50 AM, Adam Matan <adam@matan.name> wrote:
You could easily combine 4.1 and 4.2 by using function annotations (PEP 3107 - http://www.python.org/dev/peps/pep-3107/ ); this would eliminate the need to add any new syntax. Example:

    from wherever import initializer, Private as Priv, Public as Pub

    class Process:
        @initializer
        def __init__(self, pid: Pub, ppid: Pub, cmd: Pub, fd: Priv, reachable: Pub, user: Pub):
-1; I strongly disagree. This function would have to be rather magical since `self` isn't passed to it; it would have to mess with call stack frames to grab `self` from the caller's scope. I'm unsure whether that would be resilient in the face of methods which name `self` something else (e.g. `s`), and whether that would be easily portable to non-CPython implementations.
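
To make the objection concrete, here is a rough sketch of what a `self`-less acquire() has to do. It is an illustration under stated assumptions, not a proposed implementation: it takes argument names as strings (a function that receives only values cannot recover their names), it assumes the instance is the caller's first positional parameter, and it relies on `inspect.currentframe()`, which may return None on implementations without Python stack frame support.

```python
import inspect


def acquire(*names, prefix=''):
    """Hypothetical self-less acquire(): finds the instance via the caller's frame."""
    current = inspect.currentframe()
    if current is None:
        raise RuntimeError('this interpreter does not expose stack frames')
    frame = current.f_back                      # the frame of the calling method
    try:
        # Assumption: the caller's first positional parameter is the instance,
        # whatever it happens to be called ('self', 's', ...).
        instance = frame.f_locals[frame.f_code.co_varnames[0]]
        for name in names:
            setattr(instance, prefix + name, frame.f_locals[name])
    finally:
        del frame, current                      # avoid keeping frames alive


class Process:
    def __init__(s, pid, ppid, cmd, fd, reachable, user):   # 'self' renamed to 's'
        acquire('pid', 'ppid', 'cmd', 'reachable', 'user')
        acquire('fd', prefix='_')


p = Process(9, 101, 'cd /', 0, 'yes', 'root')
print(p.pid, p._fd)
```

The sketch copes with a renamed `self`, but only by guessing that the first parameter is the instance; calling acquire() from a plain function or a static method would silently attach attributes to the wrong object.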
Again, I think function annotations would be a better approach here. Example:

    class Process:
        @initializer
        def __init__(self, pid: 'pid', ppid: 'ppid', cmd: 'cmd', fd: '_fd', reachable: 'reachable', user: 'user'):

or allowing for more implicitness:

    class Process:
        @initializer
        def __init__(self, pid, ppid, cmd, fd: '_fd', reachable, user):

Cheers,
Chris
--
http://blog.rebertia.com
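
The "more implicit" variant could be implemented along the following lines. This is a sketch only: the `initializer` below is a hypothetical stand-in written for illustration, it treats a string annotation as the attribute name and stores unannotated arguments under their own name, and it assumes the module does not use `from __future__ import annotations` (which would turn the annotation into the source text `"'_fd'"`, quotes included).

```python
import inspect
from functools import wraps


def initializer(init):
    """Hypothetical decorator: a string annotation names the attribute to store."""
    sig = inspect.signature(init)
    params = list(sig.parameters.values())[1:]           # skip self
    # Map argument name -> attribute name; unannotated arguments keep their name.
    targets = {
        p.name: p.annotation if isinstance(p.annotation, str) else p.name
        for p in params
    }

    @wraps(init)
    def wrapper(self, *args, **kwargs):
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()
        for name, attr in targets.items():
            setattr(self, attr, bound.arguments[name])
        return init(self, *args, **kwargs)
    return wrapper


class Process:
    @initializer
    def __init__(self, pid, ppid, cmd, fd: '_fd', reachable=True, user=None):
        pass


p = Process(9, 101, 'cd /', 0, user='root')
print(p.pid, p._fd, p.reachable, p.user)
```

The mapping from argument to attribute is computed once, at decoration time, so the per-call cost is one `bind()` plus the `setattr` loop.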

For:

    def __init__(self, pid:pid, ppid:ppid, cmd:cmd, _fd:fd, reachable:reachable, user:user)

This either conflicts with parameter annotations or you've got the annotation on the wrong side (and annotations are expressions so this won't work). I had a similar idea to what Chris Rebert suggested:

    from somewhere import auto_init

    @auto_init
    def __init__(self, pid, ppid, cmd, fd:[auto_init.private], reachable:[auto_init.skip], user:[auto_init.name('user_name')]):
        blah

The annotation auto_init.private is equivalent to auto_init.name('_'+parameter_name). Note that I wrote fd:[auto_init.private] instead of auto_init.private. One of the strange aspects (to me) of parameter annotations is that they have no semantics, which opens them up to multiple conflicting uses. If we standardize on a convention that the annotation is a list (or tuple) of annotations, then this leads us to usage like foo:[auto_init.name('bar'),constraint.non_negative,etc].

--- Bruce
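
One way the list convention might look in practice is sketched below. Everything here is hypothetical: `auto_init`, its `private`, `skip` and `name` markers, and the helper class `_Name` are made up to illustrate the idea, and the decorator simply ignores any list entry it does not recognise so that other libraries could share the same annotation list.

```python
import inspect
from functools import wraps


class _Name:
    """Hypothetical marker: store the argument under a fixed attribute name."""
    def __init__(self, attr):
        self.attr = attr


def auto_init(init):
    """Hypothetical decorator that reads only its own markers from annotation lists."""
    sig = inspect.signature(init)

    def target(param):
        # Only list/tuple annotations are inspected; anything else is left alone.
        notes = param.annotation if isinstance(param.annotation, (list, tuple)) else ()
        attr = param.name
        for note in notes:
            if note is auto_init.skip:
                return None                      # do not store this argument
            if note is auto_init.private:
                attr = '_' + param.name
            elif isinstance(note, _Name):
                attr = note.attr
            # Any other entry belongs to some other consumer and is ignored.
        return attr

    targets = {p.name: target(p) for p in list(sig.parameters.values())[1:]}

    @wraps(init)
    def wrapper(self, *args, **kwargs):
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()
        for name, attr in targets.items():
            if attr is not None:
                setattr(self, attr, bound.arguments[name])
        return init(self, *args, **kwargs)
    return wrapper


auto_init.private = object()
auto_init.skip = object()
auto_init.name = _Name


class Process:
    @auto_init
    def __init__(self, pid, ppid, cmd, fd: [auto_init.private],
                 reachable: [auto_init.skip],
                 user: [auto_init.name('user_name'), 'note for another tool']):
        pass


p = Process(9, 101, 'cd /', 0, 'yes', 'root')
print(p.pid, p._fd, p.user_name, hasattr(p, 'reachable'))
```

The string `'note for another tool'` stands in for an annotation owned by a different library; the sketch passes over it, which is the composability the list convention is meant to buy.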

Bruce Leban wrote:
That's not a bug, that's a feature. It's been stated many times by Guido that it's far too early to standardize on a single meaning for annotations. (We may *never* standardize on a single meaning.) Instead, it is up to the library or decorator to impose whatever meaning makes sense for that particular library or decorator. -- Steven

On Sun, Apr 17, 2011 at 3:43 PM, Greg Ewing <greg.ewing@canterbury.ac.nz> wrote:
I have a use case: I use them to wrap C functions using ctypes in a semi-sane way.

Old way:

    RAND_bytes = libraries['ssl'].RAND_bytes
    RAND_bytes.restype = ctypes.c_int
    RAND_bytes.argtypes = [ctypes.c_char_p, ctypes.c_int]

My way:

    @C_function("ssl")
    def RAND_bytes(iv: c_char_p, iv_length: c_int) -> c_int:
        return RAND_bytes.c_function(iv, iv_length)

IIRC, this came up during the initial discussion and wasn't widely loved, but for me (I do a reasonable amount of glue work) it makes life a lot simpler.

Geremy Condra

Greg Ewing wrote:
The benefits of being BDFL :)

But seriously, check the PEP: it's not written by Guido, and this feature is not driven by whim.

http://www.python.org/dev/peps/pep-3107/

Annotations have at least one good use-case, and the semantics are perfectly defined: annotations are arbitrary expressions, and they get stored in the function object in a known place. What you do with those annotations is up to you. That's no different from other general processes in Python, like name binding, class attributes, and function calling, which have open semantics. ("Okay, I've created a variable. What do I do with it now?" That's entirely up to you, Python won't tell you what to do next.)

The only difference is that those other processes are so well-known and have existed in some cases since the dawn of time (Fortran), and so we take them for granted. Annotations in the Python sense are new, and nobody knows what to do with them yet (except for the oh-so-predictable idea of type testing -- boring!).

I think it's a brave and innovative move. If it's not successful, that says more about the conservativeness of Python programmers than the usefulness of the feature. I bet Perl coders would have found some way to make their code even more incomprehensible with it by now *grin*

More on type checking in Python here:

http://lambda-the-ultimate.org/node/1519

In particular note the links to Guido's essays thinking aloud, which eventually led to annotations.

--
Steven
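
The "known place" is the function's `__annotations__` dictionary. A small illustration (the function `scale` and its annotations are invented for this example) shows that each annotation is an ordinary expression whose value Python stores and otherwise ignores:

```python
def scale(value: 'any expression', factor: 2 + 3 = 1) -> [float, 'ratio']:
    # The annotations play no part in how the function runs.
    return value * factor


notes = scale.__annotations__
print(notes['value'])    # the string, stored as-is
print(notes['factor'])   # the evaluated expression
print(notes['return'])   # the list given after '->'
print(scale(2, 3))
```

Nothing checks `factor` against 5 or the result against `float`; any meaning has to come from code that reads the dictionary, such as a decorator.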

On Sun, Apr 17, 2011 at 10:46 AM, Bruce Leban <bruce@leapyear.org> wrote:
The idea is for annotations to be paired with decorators that define the semantics. Yes, that does mean that they aren't composable - decorators need to provide an alternate initialisation mechanism for cases where the annotations are already being used for something else. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

participants (11)
- Adam Matan
- Arnaud Delobelle
- Bruce Leban
- Chris Rebert
- dag.odenhall@gmail.com
- Ethan Furman
- geremy condra
- Greg Ewing
- Nathan Schneider
- Nick Coghlan
- Steven D'Aprano