[Tutor] What's the correct way to define/access methods of a member variable in a class pointing to an object?
Steven D'Aprano
steve at pearwood.info
Sat Sep 3 07:16:18 EDT 2016
On Sat, Sep 03, 2016 at 11:25:07AM +0530, Sharad Singla wrote:
> Hi Pythonistas
>
> What's the correct way to define/access methods of a member variable in a
> class pointing to an object?
Python recommends that you start with the simplest thing that will work
first, which is direct attribute access, and only do something more
complex if and when you need to. Python makes it easy to change the
implementation without changing the interface! See more comments below.
> For example, I have a class Foo that has a method foo_method:
>
> class Foo:
[snip implementation of class Foo]
> Now, in another class Bar, I'd like to store an object to this class (I do
> not want Bar to inherit Foo).
>
> What is the correct/recommended way to define class Bar?
>
> class Bar:
>     def __init__(self):
>         self.foo = Foo()
Usually this one, with a few exceptions.
> The former will allow me to use:
>
> x = Bar()
> x.foo.foo_method()
Correct.
> But, with this I'm directly accessing methods of a member variable (strong
> coupling).
This is true, but the warning against coupling is a little more subtle
than just "don't do it".
The problem is that often you have to duplicate the interface to avoid
coupling the *interface* of Bar class to Foo class. Imagine if Foo has,
not one method, but fifty:
class Bar:
    def __init__(self):
        self._foo = Foo()  # Keep it private.
    def bar_method(self):
        return self._foo.bar_method()
    def baz_method(self):
        return self._foo.baz_method()
    def bing_method(self):
        return self._foo.bing_method()
    def spam_method(self):
        return self._foo.spam_method()
    def eggs_method(self):
        return self._foo.eggs_method()
    def cheese_method(self):
        return self._foo.cheese_method()
... and so on, for 44 more duplicate methods. At this point, you should
ask yourself: what is Bar class adding? It's just a middle-man, which
adds complexity to your code and run-time inefficiency.
(In Java, middle-man classes still add complexity, but they don't add
run-time inefficiency. The Java compiler can resolve a long chain of dot
accesses like:
instance.foo.bar.baz.foobar.fe.fi.fo.fum.spam.eggs.cheese.ardvaark_method()
into a fast and efficient call direct to ardvaark_method(), but in
Python every one of those dots has to be resolved at runtime. You
really don't want enormously long chains of dot method calls in
Python! A few dots is fine, but don't write Java style.)
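To see that each dot really is evaluated at run time, here is a small
sketch. The Node class and its lookup counter are made up purely for
illustration; the property does nothing except count how often the
attribute is fetched.

```python
class Node(object):
    """Hypothetical class, used only to count attribute lookups."""
    lookups = 0

    def __init__(self, child=None):
        self._child = child

    @property
    def child(self):
        # This runs every single time ".child" is evaluated.
        Node.lookups += 1
        return self._child


chain = Node(Node(Node()))

# Two dots, resolved afresh on every pass through the loop.
for _ in range(3):
    leaf = chain.child.child
first_count = Node.lookups
print(first_count)    # 6

# Binding the intermediate object to a name once saves the repeats.
Node.lookups = 0
inner = chain.child
for _ in range(3):
    leaf = inner.child
second_count = Node.lookups
print(second_count)   # 4
```

For a couple of dots the difference is not worth worrying about. It
only starts to matter for long chains inside hot loops.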
So middle-man classes have a cost, and they have to pay their way. If
they don't pay their way, it is better to dump them, and just work with
Foo directly. The same goes for attribute access: you have to weigh up
the added complexity of hiding the foo attribute against the benefit
gained, and only hide it behind getter and setter methods if you really
need to.
Python also takes the philosophy that public attributes are usually a
good thing. You will often see classes where an attribute is a dict, or
a list. For example, in Python 3, there is a class collections.ChainMap
which stores a list of dicts. Rather than doing something like this:
class ChainMap:
    def list_sort(self):
        self._maps.sort()
    def list_reverse(self):
        self._maps.reverse()
    def list_get(self, i):
        return self._maps[i]
    # etc.
the class just exposes the list directly to the caller. It's a list. You
know how to sort lists and extract items. There's no need to hide it.
The interface will never change: it will always be a list. Just access
the "maps" attribute and work with it directly.
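A short sketch of what that looks like in practice (Python 3). The dict
contents and the variable names are made up for illustration; ChainMap
and its "maps" attribute are the real thing from the standard library.

```python
from collections import ChainMap

defaults = {"colour": "red", "size": 10}
overrides = {"size": 12}
settings = ChainMap(overrides, defaults)

print(settings["size"])     # 12: the first mapping in the list wins

# "maps" is an ordinary list, so the usual list methods just work.
settings.maps.reverse()
print(settings["size"])     # 10: defaults is now searched first

settings.maps.append({"shape": "round"})
print(settings["shape"])    # round
```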
But generally this applies to relatively simple objects with well-known
interfaces, like lists, dicts, and so forth. You might find you have
good reason to hide Foo behind a middle-man. Perhaps Foo has a complex,
ugly, un-Pythonic interface and you want to provide a more pleasant
interface. Or perhaps you want the freedom to change the Foo interface
at will, without changing the Bar interface.
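As a rough sketch of the "more pleasant interface" case: suppose the
wrapped class has clumsy, Java-flavoured method names. LegacyFoo and its
methods below are invented for this example, and this Bar is only one
possible way to wrap it.

```python
class LegacyFoo(object):
    """Hypothetical class with an un-Pythonic interface."""
    def __init__(self):
        self._items = ["spam", "eggs", "cheese"]
    def GetItemCount(self):
        return len(self._items)
    def GetItemAtIndex(self, index):
        return self._items[index]


class Bar(object):
    """Middle-man that earns its keep by offering len() and indexing."""
    def __init__(self):
        self._foo = LegacyFoo()  # Keep it private.
    def __len__(self):
        return self._foo.GetItemCount()
    def __getitem__(self, index):
        return self._foo.GetItemAtIndex(index)


bar = Bar()
print(len(bar))     # 3
print(bar[0])       # spam
print(list(bar))    # ['spam', 'eggs', 'cheese']
```

Here the wrapper pays its way: callers get to use len(), indexing and
iteration instead of learning the LegacyFoo method names.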
So while it is *usual* to just do direct attribute access in Python, it
is not forbidden to do it the other way. Just make sure you have a good
reason, better than "because that's what my Comp Sci 101 professor told
me to always do".
> The benefits I see with this approach are:
>
> - I don't have to wrap every new method that gets added to class Foo.
> - Bar may contain more member variables pointing to other classes. Not
> wrapping them keeps Bar smaller and manageable in size.
> - The auto-completion facility from IDE (PyCharm, etc.) or IPython helps
> inspect bar like a menu (x.foo) followed by a sub-menu (
> x.foo.foo_method(), x.bar.foobar(), etc.) making it easier to develop
> code.
> - Functional programming look-n-feel (not sure if this a pro or con)
Exactly. That's why Python programmers tend to expose attributes as part
of the public interface by default, and only hide them if justified by
other concerns.
> The cons are strong coupling, not encapsulating internal details of foo,
> etc.
Right. And sometimes you really do want to encapsulate the internal
details of Foo.
In Java, programmers learn to hide attributes by default, because if you
expose them and then decide you need to hide the internal details of
Foo, you are screwed, you can't do it. But Python lets you do it.
Suppose we stick to the initial plan, and just say:
class Bar(object):
    def __init__(self):
        self.foo = Foo()  # Make it public.
so that the user can say bar.foo.bar_method(). But then some day you
realise that you need to change the Foo interface. Perhaps bar_method
has to take a mandatory argument where before it took none. How can we
change Foo without changing Bar, now that we exposed it? In Java, you
can't, or at least not easily, hence Java programmers learn to always
hide everything by default. But in Python, properties to the rescue!
Properties only work in "new-style" classes. In Python 3, you don't have
to change anything, but in Python 2 we need to inherit from object. So
let's start by inheriting from object (which you should be doing
anyway), and changing our *public* foo to a *private* foo:
class Bar(object):
    def __init__(self):
        self._foo = Foo()  # Make it private.
Now let's add a "foo" property.
    @property
    def foo(self):
        return FooProxy(self._foo)
Now we need a simple FooProxy class to wrap the real Foo, hiding the
bits we want hidden and exposing the bits we can:
class FooProxy(object):
    def __init__(self, foo):
        # Plain "self.foo = foo" would go through our own __setattr__
        # below and recurse forever, so bypass it here.
        object.__setattr__(self, "foo", foo)
    def bar_method(self):
        # bar_method used to take no arguments.
        return self.foo.bar_method("some", "arguments")
    # Any other methods that we have to change the implementation?
    def baz_method(self):
        # Fix a bug in the class.
        try:
            return self.foo.baz_method()
        except SomeError:
            return "baz baz baz"
    # For everything else, delegate to the real foo.
    def __getattr__(self, name):
        return getattr(self.foo, name)
    def __setattr__(self, name, value):
        setattr(self.foo, name, value)
    def __delattr__(self, name):
        delattr(self.foo, name)
And so with a small amount of extra work, and a little bit of runtime
cost, we have full information hiding of Foo, but as far as the caller
is concerned, it is as if we have exposed the Foo instance directly. The
caller simply writes:
instance = Bar()
instance.foo.bar_method()
exactly the same as before.
(P.S. I haven't tested the above code, it is possible there may be minor
errors or typos.)
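For anyone who wants to try the idea out, here is a self-contained
sketch that should run as-is. The Foo class here is a stand-in invented
for the example (the real one was snipped), ValueError stands in for
whatever exception the real baz_method might raise, and the proxy stores
the wrapped object with object.__setattr__ so that its own __setattr__
is not triggered during __init__.

```python
class Foo(object):
    """Hypothetical Foo whose bar_method now requires two arguments."""
    def __init__(self):
        self.colour = "red"
    def bar_method(self, a, b):
        return "bar(%s, %s)" % (a, b)
    def baz_method(self):
        raise ValueError("pretend this is a bug in Foo")


class FooProxy(object):
    def __init__(self, foo):
        # Bypass our own __setattr__, which would otherwise recurse.
        object.__setattr__(self, "_foo", foo)
    def bar_method(self):
        # Present the old no-argument interface to callers.
        return self._foo.bar_method("some", "arguments")
    def baz_method(self):
        # Work around the bug in the wrapped class.
        try:
            return self._foo.baz_method()
        except ValueError:
            return "baz baz baz"
    # For everything else, delegate to the real foo.
    def __getattr__(self, name):
        return getattr(self._foo, name)
    def __setattr__(self, name, value):
        setattr(self._foo, name, value)
    def __delattr__(self, name):
        delattr(self._foo, name)


class Bar(object):
    def __init__(self):
        self._foo = Foo()  # Make it private.
    @property
    def foo(self):
        return FooProxy(self._foo)


instance = Bar()
print(instance.foo.bar_method())    # bar(some, arguments)
print(instance.foo.baz_method())    # baz baz baz
print(instance.foo.colour)          # red

instance.foo.colour = "blue"        # Passed through to the real Foo.
print(instance._foo.colour)         # blue
```

Note that this version builds a new FooProxy on every access to
instance.foo. That keeps the sketch short; caching the proxy would be
another reasonable choice.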
But you should only do this if you are sure you need to. Otherwise, just
use self.foo = Foo(). You can always add the extra layer of indirect
code later, when you need it.
--
Steve