[Tutor] Discussion of super

Skipper Seabold jsseabold at gmail.com
Sun Jun 14 17:25:59 CEST 2009


Hello all,

I am working under the umbrella of the Python Software Foundation for
the Google Summer of Code and am keeping a blog about the work.  Part
of my work is refactoring and extending some existing code.  This code
makes use of Python's super, and so I am trying to understand the ins
and outs of super.  I recently summarized my understanding of super on
my blog, and I was wondering if anyone would care to comment on my
write up.  I am particularly worried about my handling of my multiple
inheritance example and explanation of class F.

I apologize for the length and if this is an inappropriate request for
this list, but I read the discussions here almost daily, and often the
points that come up add to my Python understanding very much.
Everyone seems to have a great grasp of Python and programming and
that rare ability (and willingness) to *explain* topics.

The post is here
<http://scipystats.blogspot.com/2009/06/design-issues-understanding-pythons.html>

But if it's easier for the ML, comments can be inlined below.  The
In/Out prompts are the iPython interpreter if anyone is wondering, and
I tried to preserve indents.

Cheers,

Skipper

--------------------------------------------------------------------------------------------------------------

The current statistical models package is housed in the NiPy,
Neuroimaging in Python, project. Right now, it is designed to rely on
Python's built-in super to handle class inheritance. This post will
dig a little more into the super function and what it means for the
design of the project and future extensions. Note that there are
plenty of good places to learn about super and that this post is to
help me as much as anyone else. You can find the documentation for
super here. If this is a bit confusing, it will, I hope, become
clearer after I demonstrate the usage.

First, let's take a look at how super actually works for the simple
case of single inheritance (right now, we are not planning on using
multiple inheritance in the project) and an __init__ chain (note that
super can call any of its parent class's methods, but using __init__
is my current use case).

The following examples were adapted from some code provided by mentors
(thank you!).

class A(object):
    def __init__(self, a):
        self.a = a
        print 'executing A().__init__'

class B(A):
    def __init__(self, a):
        self.ab = a*2
        print 'executing B().__init__'
        super(B,self).__init__(a)

class C(B):
    def __init__(self, a):
        self.ac = a*3
        print 'executing C().__init__'
        super(C,self).__init__(a)

Now let's have a look at creating an instance of C.

In [2]: cinst = C(10)
executing C().__init__
executing B().__init__
executing A().__init__

In [3]: vars(cinst)
Out[3]: {'a': 10, 'ab': 20, 'ac': 30}

That seems simple enough. Creating an instance of C with a = 10 will
also give C the attributes of B(10) and A(10). This means our one
instance of C has three attributes: cinst.ac, cinst.ab, cinst.a. The
latter two were created by its parent classes (or superclasses)
__init__ method. Note that A is also a new-style class. It subclasses
the 'object' type.

The actual calls to super pass the generic class 'C' and a handle to
that class 'self', which is 'cinst' in our case. Super returns the
literal parent of the class instance C since we passed it 'self'. It
should be noted that A and B were created when we initialized cinst
and are, therefore, 'bound' class objects (bound to cinst in memory
through the actual instance of class C) and not referring to the class
A and class B instructions defined at the interpreter (assuming you
are typing along at the interpreter).

Okay now let's define a few more classes to look briefly at multiple
inheritance.

class D(A):
    def __init__(self, a):
        self.ad = a*4
        print 'executing D().__init__'
        # if super is commented out then __init__ chain ends
        #super(D,self).__init__(a)

class E(D):
    def __init__(self, a):
        self.ae = a*5
        print 'executing E().__init__'
        super(E,self).__init__(a)

Note that the call to super in D is commented out. This breaks the
__init__ chain.

In [4]: einst = E(10)
executing E().__init__
executing D().__init__

In [5]: vars(einst)
Out[5]: {'ad': 40, 'ae': 50}

If we uncomment the super in D, we get as we would expect

In [6]: einst = E(10)
executing E().__init__
executing D().__init__
executing A().__init__

In [7]: vars(einst)
Out[7]: {'a': 10, 'ad': 40, 'ae': 50}

Ok that's pretty straightforward. In this way super is used to pass
off something to its parent class. For instance, say we have a little
more realistic example and the instance of C takes some timeseries
data that exhibits serial correlation. Then we can have C correct for
the covariance structure of the data and "pass it up" to B where B can
then perform OLS on our data now that it meets the assumptions of OLS.
Further B can pass this data to A and return some descriptive
statistics for our data. But remember these are 'bound' class objects,
so they're all attributes to our original instance of C. Neat huh?
Okay, now let's look at a pretty simple example of multiple
inheritance.

class F(C,E):
    def __init__(self, a):
        self.af = a*6
        print 'executing F().__init__'
        super(F,self).__init__(a)

For this example we are using the class of D, that has super commented out.

In [8]: finst = F(10)
executing F().__init__
executing C().__init__
executing B().__init__
executing E().__init__
executing D().__init__

In [8]: vars(finst)
Out[8]: {'ab': 20, 'ac': 30, 'ad': 40, 'ae': 50, 'af': 60}

The first time I saw this gave me pause. Why isn't there an finst.a? I
was expecting the MRO to go F -> C -> B -> A -> E -> D -> A. Let's
take a closer look. The F class has multiple inheritance. It inherits
from both C and E. We can see F's method resolution order by doing

In [9]: F.__mro__
Out[9]:
(<class '__main__.F'>,
<class '__main__.C'>,
<class '__main__.B'>,
<class '__main__.E'>,
<class '__main__.D'>,
<class '__main__.A'>,
<type 'object'>)

Okay, so we can see that for F A is a subclass of D but not B. But why?

In [10]: A.__subclasses__()
Out[10]: [<class '__main__.B'>, <class '__main__.D'>]

The reason is that A does not have a call to super, so the chain
doesn't exist here. When you instantiate F, the hierarchy goes F -> C
-> B -> E -> D -> A. The reason that it goes from B -> E is because A
does not have a call to super, so it can't pass anything to E (It
couldn't pass anything to E because the object.__init__ doesn't take a
parameter "a" and because you cannot have a MRO F -> C -> B -> A -> E
-> D -> A as this is inconsistent and will give an error!), so A does
not cause a problem and the chain ends after D (remember that D's
super is commented out, but if it were not then there would be finst.a
= 10 as expected). Whew.

I'm sure you're thinking "Oh that's (relatively) easy. I'm ready to go
crazy with super." But there are a number of things must keep in mind
when using super, which makes it necessary for the users of super to
proceed carefully.

1. super() only works with new-style classes. You can read more about
classic/old-style vs new-style classes here. From there you can click
through or just go here for more information on new-style classes.
Therefore, you must know that the base classes are new-style. This
isn't a problem for our project right now, because I have access to
all of the base classes.

2. Subclasses must use super if their superclasses do. This is why the
user of super must be well-documented. If we have to classes A and B
that both use super and a class C that inherits from them, but does
not know about super then we will have a problem. Consider the
slightly different case

class A(object):
    def __init__(self):
        print "executing A().__init__"
        super(A, self).__init__()

class B(object):
    def __init__(self):
        print "executing B().__init__"
        super(B, self).__init__()

class C(A,B):
    def __init__(self):
        print "executing C().__init__"
        A.__init__(self)
        B.__init__(self)
        # super(C, self).__init__()

Say class C was defined by someone who couldn't see class A and B,
then they wouldn't know about super. Now if we do

In [11]: C.__mro__
Out[11]:
(<class '__main__.C'>,
<class '__main__.A'>,
<class '__main__.B'>,
<type 'object'>)

In [12]: c = C()
executing C().__init__
executing A().__init__
executing B().__init__
executing B().__init__


B got called twice, but by now this should be expected. A's super
calls __init__ on the next object in the MRO which is B (it works this
time unlike above because there is no parameter passed with __init__),
then C explicitly calls B again.

If we uncomment super and comment out the calls to the parent __init__
methods in C then this works as expected.

3. Superclasses probably should use super if their subclasses do.

We saw this earlier with class D's super call commented out. Note also
that A does not have a call to super. The last class in the MRO does
not need super *if* there is only one such class at the end.

4. Classes must have the exact same call signature.

This should be obvious but is important for people to be able to
subclass. It is possible however for subclasses to add additional
arguments so *args and **kwargs should be probably always be included
in the methods that are accessible to subclasses.

5. Because of these last three points, the use of super must be
explicitly documented, as it has become a part of the interface to our
classes.


More information about the Tutor mailing list