Why do class methods always need 'self' as the first parameter?

Wed Aug 31 11:41:27 EDT 2011

T. Goodchild wrote:

> So why is 'self' necessary on class methods?  

I assume you are talking about the declaration in the method signature:

def method(self, args): ...

rather than why methods have to be called using self.method. If you did mean
the latter, there's already a FAQ for that question:

http://docs.python.org/faq/design.html#why-self

> It seems to me that the 
> most common practice is that class methods *almost always* operate on
> the instance that called them.

By the way, what you're calling "class methods" are actually *instance*
methods, because they receive the instance "self" as the first parameter.

Python does have class methods, which receive the class, not the instance,
as the first parameter. These are usually written something like this:

class K(object):
    @classmethod
    def spam(cls, args):
        print cls  # always prints the class K, never the instance

Just like self, the name cls is a convention only. Class methods are usually
used for alternate constructors.
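
To make the "alternate constructor" idea concrete, here is a minimal sketch,
written for Python 3. The names Point, SubPoint and from_string are made up
for illustration and are not from any library:

```python
class Point(object):
    """Hypothetical example class, for illustration only."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_string(cls, text):
        # cls is whichever class the method was called on, so a subclass
        # gets an instance of itself rather than of Point.
        x, y = (int(part) for part in text.split(","))
        return cls(x, y)


class SubPoint(Point):
    pass


p = Point.from_string("3,4")      # builds a Point
q = SubPoint.from_string("5,6")   # builds a SubPoint via the same method
```

Because the constructor receives cls rather than hard-coding Point, the
subclass needs no extra code to get a working from_string of its own.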

There are also static methods, which don't receive any special first
argument, plus any other sort of method you can invent, by creating
descriptors... but that's getting into fairly advanced territory. They're
generally specialised, and don't see much use.
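
For completeness, a static method might look something like the sketch
below. Temperature and c_to_f are hypothetical names chosen for the example:

```python
class Temperature(object):
    """Hypothetical example class, for illustration only."""

    def __init__(self, celsius):
        self.celsius = celsius

    @staticmethod
    def c_to_f(celsius):
        # No self and no cls: this is an ordinary function that simply
        # lives in the class namespace.
        return celsius * 9.0 / 5.0 + 32.0

    def fahrenheit(self):
        # A static method can be reached through the instance or the class.
        return self.c_to_f(self.celsius)


boiling = Temperature.c_to_f(100)
freezing = Temperature(0).fahrenheit()
```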

As you can see, the terminology is not exactly the same as Java.

> It would make more sense to me if this 
> was assumed by default, ...

Well here's the thing. Python methods are wrappers around function objects.
The method wrapper knows which instance is involved (because of the
descriptor magic which I alluded to above), but the function doesn't and
can't. Or at least not without horrible run-time hacks.
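
You can watch the wrapping happen by calling the descriptor machinery by
hand. This is only a sketch of what attribute lookup does for you; the class
K and its method are throwaway names:

```python
class K(object):
    def method(self, x):
        return (self, x)


instance = K()

# The class namespace holds a plain function object, not a method.
func = K.__dict__["method"]

# Functions are descriptors: __get__ builds the method wrapper that
# remembers which instance is involved.
bound = func.__get__(instance, K)

wrapped_instance = bound.__self__   # the instance the wrapper remembers
wrapped_function = bound.__func__   # the original, unchanged function

# Calling the wrapper just passes the remembered instance as "self".
same_result = bound(1) == func(instance, 1)
```

The function itself never learns about the instance; only the wrapper does.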

By treating "self" as an ordinary parameter which needs to be declared, you
can do cool stuff like bound and unbound methods:

f = "instance".upper  # this is a bound method
g = str.upper  # this is an unbound method

The bound method f already has the instance "self" filled in, so to speak.
So you can now just call it, and it will work:

f()
=> returns "INSTANCE"

The unbound method still needs the instance supplied. This makes it perfect
for code like this:

for instance in ("hello", "world"):
    print g(instance)

especially if you don't know what g will be until run-time. (E.g. will it be
str.upper, str.lower, str.title?)
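
A sketch of that run-time choice, written for Python 3. The function name
transform_all and the style keys are made up for the example:

```python
def transform_all(words, style):
    """Apply a str method, chosen by name at run-time, to every word."""
    # Each value is looked up on the class, so it still needs the
    # instance to be passed in explicitly.
    choices = {"upper": str.upper, "lower": str.lower, "title": str.title}
    g = choices[style]
    return [g(word) for word in words]


shouted = transform_all(["hello", "world"], "upper")
titled = transform_all(["hello", "world"], "title")
```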

Because methods require that first argument to be given explicitly, unbound
methods are practically ordinary functions. They're so like functions that
in Python 3, they're done away with altogether, and the unwrapped function
object will be returned instead.
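
If you want to check this for yourself, something like the following should
do. It is a sketch assuming Python 3; under Python 2 the isinstance test
would be False, because K.method would be an unbound method wrapper instead:

```python
import types


class K(object):
    def method(self):
        return type(self).__name__


# On Python 3, looking the method up on the class returns the plain
# function object, with no wrapper around it.
is_plain_function = isinstance(K.method, types.FunctionType)

# Either way, it is called by supplying the instance explicitly.
name = K.method(K())
```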

You can also do nifty stuff like dynamic method injections:

>>> def func(a, b):
...     print(a, b)
...
>>> class K(object):
...     pass
...
>>> K.func = func  # dynamically inject a method
>>> instance = K()
>>> instance.func(23)
(<__main__.K object at 0xb7f0a4cc>, 23)

and it all just works. You can even inject a method onto the instance,
although it takes a bit more effort to make that work.
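
One way to do the per-instance version is to build the bound method yourself
with types.MethodType. A minimal sketch (K, func and the variable names are
just for illustration):

```python
import types


class K(object):
    pass


def func(self, b):
    return (self, b)


instance = K()
other = K()

# Plain assignment (instance.func = func) would store a bare function,
# and instance.func(23) would then fail for lack of a "self" argument.
# Wrapping it first binds the instance explicitly.
instance.func = types.MethodType(func, instance)

result = instance.func(23)
other_has_func = hasattr(other, "func")   # other instances are unaffected
```

The extra effort is needed because the descriptor magic only kicks in for
functions found on the class, not for ones stored on the instance itself.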

All this is possible without nasty hacks because self is treated as just an
ordinary parameter of functions. Otherwise, the compiler would need to know
whether the function was being called from inside a method wrapper or not,
and change the function signature appropriately, and that just gets too
ugly and messy for words.

So for the cost of having to declare self as an argument, we get:

* instant visual recognition of what's intended as a method ("the 
  first argument is called self") and what isn't
* a nicely consistent treatment of function signatures at all times
* clean semantics for the local variable namespace
* the same mechanism (with minor adjustments) can be used for class 
  and static methods
* bound and unbound method semantics

plus, as a bonus, plenty of ongoing arguments about whether having to
explicitly list "self" as a parameter is a good thing, thus keeping
people busy arguing on mailing lists instead of coding.

<wink>

-- 
Steven