[Tutor] What's the correct way to define/access methods of a member variable in a class pointing to an object?

Sat Sep 3 07:16:18 EDT 2016

On Sat, Sep 03, 2016 at 11:25:07AM +0530, Sharad Singla wrote:
> Hi Pythonistas
> 
> What's the correct way to define/access methods of a member variable in a
> class pointing to an object?

Python recommends that you start with the simplest thing that will work 
first, which is direct attribute access, and only do something more 
complex if and when you need to. Python makes it easy to change the 
implementation without changing the interface! See more comments below.

> For example, I have a class Foo that has a method foo_method:
> 
> class Foo:
[snip implementation of class Foo]

> Now, in another class Bar, I'd like to store an object to this class (I do
> not want Bar to inherit Foo).
> 
> What is the correct/recommended way to define class Bar?
> 
> class Bar:
>     def __init__(self):
>         self.foo = Foo()

Usually this one, with a few exceptions.

> The former will allow me to use:
> 
> x = Bar()
> x.foo.foo_method()

Correct.
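A minimal sketch of that arrangement, for anyone who wants to run it. The body of Foo here is a hypothetical stand-in, since the real implementation was snipped above; only the names Foo, Bar, foo and foo_method come from the question.

```python
# Hypothetical stand-in: the real Foo was snipped from the original post.
class Foo(object):
    def foo_method(self):
        return "foo_method called"


class Bar(object):
    def __init__(self):
        # Composition: Bar owns a Foo instance rather than inheriting from Foo.
        self.foo = Foo()


x = Bar()
result = x.foo.foo_method()  # direct access through the public attribute
print(result)
```

Bar is not a Foo here; it merely holds one, which is what the question asked for.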

> But, with this I'm directly accessing methods of a member variable (strong
> coupling).

This is true, but the warning against coupling is a little more subtle 
than just "don't do it". 

The problem is that to avoid coupling Bar's *interface* to Foo's, you 
often have to duplicate that interface inside Bar. Imagine if Foo has, 
not one method, but fifty:

class Bar:
    def __init__(self):
        self._foo = Foo()  # Keep it private.

    def bar_method(self):
        return self._foo.bar_method()

    def baz_method(self):
        return self._foo.baz_method()

    def bing_method(self):
        return self._foo.bing_method()

    def spam_method(self):
        return self._foo.spam_method()

    def eggs_method(self):
        return self._foo.eggs_method()

    def cheese_method(self):
        return self._foo.cheese_method()

... and so on, for 44 more duplicate methods. At this point, you should 
ask yourself: what is Bar class adding? It's just a middle-man, which 
adds complexity to your code and run-time inefficiency.

(In Java, middle-man classes still add complexity, but they add little 
run-time inefficiency. Java is statically typed, so a long chain of dot 
accesses like:

instance.foo.bar.baz.foobar.fe.fi.fo.fum.spam.eggs.cheese.aardvark_method()

can largely be resolved ahead of time into a fast and efficient call to 
aardvark_method(), but in Python every one of those dots has to be 
resolved at runtime. You really don't want enormously long chains of 
dot method calls in Python! A few dots is fine, but don't write Java 
style.)

So middle-man classes have a cost, and they have to pay their way. If 
they don't pay their way, it is better to dump them, and just work with 
Foo directly. The same goes for attribute access: you have to weigh up 
the added complexity of hiding the foo attribute against the benefit 
gained, and only hide it behind getter and setter methods if you really 
need to.

Python also takes the philosophy that public attributes are usually a 
good thing. You will often see classes where an attribute is a dict, or 
a list. For example, in Python 3, there is a class collections.ChainMap 
which stores a list of dicts. Rather than doing something like this:

class ChainMap:
    def list_sort(self):
        self._maps.sort()
    def list_reverse(self):
        self._maps.reverse()
    def list_get(self, i):
        return self._maps[i]
    # etc.

the class just exposes the list directly to the caller. It's a list. You 
know how to sort lists, and extract items. There's no need to hide it. 
The interface will never change; it will always be a list. Just access 
the "maps" attribute, and work with it directly.
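A short sketch of working with that attribute directly (Python 3 only, since ChainMap is not in Python 2's collections module). The dictionaries and their keys are invented for illustration.

```python
from collections import ChainMap

defaults = {"colour": "red", "size": "medium"}
overrides = {"colour": "blue"}

# Lookups search the maps from left to right, so overrides wins at first.
cm = ChainMap(overrides, defaults)
first = cm["colour"]

# cm.maps is an ordinary list, so the usual list methods apply.
cm.maps.reverse()
second = cm["colour"]  # defaults is now searched first

cm.maps.append({"shape": "round"})
third = cm["shape"]

print(first, second, third)
```

No list_reverse() or list_get() wrappers are needed; the caller already knows how to use a list.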

But generally this applies to relatively simple objects with well-known 
interfaces, like lists, dicts, and so forth. You might find you have 
good reason to hide Foo behind a middle-man. Perhaps Foo has a complex, 
ugly, un-Pythonic interface and you want to provide a more pleasant 
interface. Or perhaps you want the freedom to change the Foo interface 
at will, without changing the Bar interface.

So while it is *usual* to just do direct attribute access in Python, it 
is not forbidden to do it the other way. Just make sure you have a good 
reason, better than "because that's what my Comp Sci 101 professor told 
me to always do".

> The benefits I see with this approach are:
> 
>    - I don't have to wrap every new method that gets added to class Foo.
>    - Bar may contain more member variables pointing to other classes. Not
>    wrapping them keeps Bar smaller and manageable in size.
>    - The auto-completion facility from IDE (PyCharm, etc.) or IPython helps
>    inspect bar like a menu (x.foo) followed by a sub-menu (
>    x.foo.foo_method(), x.bar.foobar(), etc.) making it easier to develop
>    code.
>    - Functional programming look-n-feel (not sure if this a pro or con)

Exactly. That's why Python programmers tend to expose attributes as part 
of the public interface by default, and only hide them if justified by 
other concerns.

> The cons are strong coupling, not encapsulating internal details of foo,
> etc.

Right. And sometimes you really do want to encapsulate the internal 
details of Foo.

In Java, programmers learn to hide attributes by default, because if you 
expose them and then decide you need to hide the internal details of 
Foo, you are screwed, you can't do it. But Python lets you do it.

Suppose we stick to the initial plan, and just say:

class Bar(object):
    def __init__(self):
        self.foo = Foo()  # Make it public.

so that the user can say bar.foo.bar_method(). But then some day you 
realise that you need to change the Foo interface. Perhaps bar_method 
has to take a mandatory argument where before it took none. How can we 
change Foo without changing Bar, now that we exposed it? In Java, you 
can't, or at least not easily, hence Java programmers learn to always 
hide everything by default. But in Python, properties to the rescue!

Properties only work in "new-style" classes. In Python 3, you don't have 
to change anything, but in Python 2 we need to inherit from object. So 
let's start by inheriting from object (which you should be doing 
anyway), and changing our *public* foo to a *private* foo:

class Bar(object):
    def __init__(self):
        self._foo = Foo()  # Make it private.

Now let's add a "foo" property.

    @property
    def foo(self):
        return FooProxy(self._foo)

Now we need a simple FooProxy class to wrap the real Foo, hiding the 
bits we want hidden and exposing the bits we can:

class FooProxy(object):
    def __init__(self, foo):
        # Store the real Foo via object.__setattr__. A plain
        # "self._foo = foo" would call our own __setattr__ below,
        # which needs self._foo already, and recurse forever.
        object.__setattr__(self, "_foo", foo)

    def bar_method(self):  
        # bar_method used to take no arguments.
        return self._foo.bar_method("some", "arguments")

    # Any other methods that we have to change the implementation?

    def baz_method(self):
        # Fix a bug in the class.
        try:
            return self._foo.baz_method()
        except SomeError:  # placeholder for whatever baz_method raises
            return "baz baz baz"

    # For everything else, delegate to the real foo.

    def __getattr__(self, name):
        # Only called when normal lookup on the proxy fails.
        return getattr(self._foo, name)

    def __setattr__(self, name, value):
        setattr(self._foo, name, value)

    def __delattr__(self, name):
        delattr(self._foo, name)

And so with a small amount of extra work, and a little bit of runtime 
cost, we have full information hiding of Foo, but as far as the caller 
is concerned, it is as if we have exposed the Foo instance directly. The 
caller simply writes:

instance = Bar()
instance.foo.bar_method()

exactly the same as before.

(P.S. I haven't tested the above code, it is possible there may be minor 
errors or typos.)
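For anyone who wants to run the idea end to end, here is a cut-down sketch. The Foo class is a hypothetical stand-in (the real one was snipped), the colour attribute is invented to show delegation, and the wrapped object is stored under the made-up name _wrapped using object.__setattr__, so that the proxy's own __setattr__ is not triggered inside __init__.

```python
class Foo(object):
    """Hypothetical stand-in for Foo after its interface changed."""

    def __init__(self):
        self.colour = "red"

    def bar_method(self, first, second):
        # Now takes two mandatory arguments where it used to take none.
        return "bar: %s %s" % (first, second)


class FooProxy(object):
    def __init__(self, foo):
        # Bypass our own __setattr__ so the reference lives on the proxy.
        object.__setattr__(self, "_wrapped", foo)

    def bar_method(self):
        # Present the old no-argument interface to callers.
        return self._wrapped.bar_method("some", "arguments")

    def __getattr__(self, name):
        # Only called when normal lookup fails: delegate to the real Foo.
        return getattr(self._wrapped, name)

    def __setattr__(self, name, value):
        setattr(self._wrapped, name, value)


class Bar(object):
    def __init__(self):
        self._foo = Foo()  # private

    @property
    def foo(self):
        return FooProxy(self._foo)


instance = Bar()
result = instance.foo.bar_method()  # caller's code is unchanged
instance.foo.colour = "green"       # the write lands on the real Foo
print(result, instance.foo.colour)
```

Note that the property builds a fresh FooProxy on every access. That is fine for a sketch, but it is part of the runtime cost mentioned above.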

But you should only do this if you are sure you need to. Otherwise, just 
use self.foo = Foo(). You can always add the extra layer of indirect 
code later, when you need it.

-- 
Steve