[Tutor] class variables
Steven D'Aprano
steve at pearwood.info
Sat Dec 21 08:14:32 CET 2013
On Fri, Dec 20, 2013 at 02:04:49AM -0500, Keith Winston wrote:
> I am a little confused about class variables: I feel like I've repeatedly
> seen statements like this:
I don't like the terms "class variable" and "instance variable". In the
Python community, these are usually called class and instance attributes
rather than variables or members.
(Sometimes, people will call them "members", especially if they are used
to C#. The meaning here is member as in an arm or leg, as in
"dismember", not member in the sense of belonging to a group.)
Normally, we say that a string variable is a variable holding a string,
a float variable is a variable holding a float, an integer variable is a
variable holding an integer. So a class variable ought to be a variable
holding a class, and an instance variable ought to be a variable holding
an instance. In Python we can have both of those things!
Unlike in Java, Python classes are "first-class citizens" and can be treated
exactly the same as strings, floats, ints and other values. So a "class
variable" would be something like this:
    for C in list_of_classes:
        # Here, C holds a class, and so we might call
        # it a "class variable", not a string variable
        do_something_with(C)
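
To make that concrete, here is a minimal runnable sketch. The classes Spam and Eggs are made up for the illustration and stand in for whatever list_of_classes might hold:

```python
# Hypothetical classes, invented purely for this illustration.
class Spam:
    pass

class Eggs:
    pass

list_of_classes = [Spam, Eggs]

names = []
for C in list_of_classes:
    # C is bound to a class object, not to an instance of one.
    names.append(C.__name__)
    instance = C()  # calling the class held in C creates an instance
    assert isinstance(instance, C)

print(names)  # ['Spam', 'Eggs']
```
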
> There is only one copy of the class variable and when any one object makes a
> change to a class variable, that change will be seen by all the other
> instances.
> Object variables are owned by each individual object/instance of the class.
> In this case, each object has its own copy
Talking about copies is not a good way to understand this. It might make
sense to talk about copies in some other languages, but not in Python.
(Or any of many languages with similar behaviour, like Ruby or Java.)
I'm going to give you a simple example demonstrating why thinking about
copies is completely the wrong thing to do here. If you already
understand why "copies" is wrong, you can skip ahead, but otherwise
you need to understand this even though it doesn't directly answer your
question.
Given a simple class, we can set an attribute on a couple of instances
and see what happens. Copy and paste these lines into a Python
interactive session, and see if you can guess what output the print will
give:
    class Test:
        pass

    spam = Test()
    eggs = Test()
    obj = []
    spam.attribute = obj
    eggs.attribute = obj
    spam.attribute.append("Surprise!")
    print(eggs.attribute)
If you think about *copies*, you might think that spam and eggs have
their own independent copies of the empty list. But that's not what
Python does. You don't have two copies of the list, you have a single
list, and two independent references to it. (Actually, there are three:
obj, spam.attribute, eggs.attribute.) But only one list, with three
different names.
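
If you want to check this for yourself, the "is" operator tests whether two names refer to the very same object. A small sketch, reusing the names from the example above:

```python
class Test:
    pass

spam = Test()
eggs = Test()
obj = []
spam.attribute = obj
eggs.attribute = obj

# "is" tests identity: are these the very same object?
same_object = spam.attribute is eggs.attribute is obj
print(same_object)  # True

# A change made through one name is visible through all of them.
spam.attribute.append("Surprise!")
print(obj)             # ['Surprise!']
print(eggs.attribute)  # ['Surprise!']
```
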
This is similar to people. For instance, the President of the USA is
known as "Mr President" to his staff, "POTUS" to the military, "Barack"
to his wife Michelle, "Mr Obama" to historians and journalists, "Dad" to
his children, and so forth. But they all refer to the same person. In a
few years, Barack Obama will stand down as president, and somebody else
will be known as "Mr President" and "POTUS", but he'll still be
"Barack" to Michelle.
Python treats objects exactly the same. You can have lots of names for
the same object. Some objects, like lists, can be modified in place.
Other objects, like strings and ints, cannot be.
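
A rough sketch of the difference, using throwaway names (a, b, s, t) that are not part of your example. With a list, both names see the in-place change. With a string, the best you can do is build a new string and rebind one of the names:

```python
a = b = [1, 2]       # two names, one list
a.append(3)          # modifies that one list in place
print(b)             # [1, 2, 3]

s = t = "spam"       # two names, one string
s = s + " eggs"      # builds a new string and rebinds s; t is untouched
print(s)             # spam eggs
print(t)             # spam
```
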
In Python, we refer to this system as "name binding". You have things
which are names, like "obj", and we associate an object with that name.
Another term for this is a "reference", in the generic sense that we
"refer" to things.
So we can bind an object to a name:
    obj = []
We can *unbind* the name as well:
    del obj
In Python, assignment with = is name binding, and not copying:
    spam.attribute = obj
does not make a copy of the list, it just makes "spam.attribute" and
"obj" two different names for the same list. And likewise for
"eggs.attribute".
Hopefully now you can understand why it is wrong to talk about "copies"
here. In Python, you only get copies when you explicitly call a function
which makes a copy, and never from = assignment (name binding).
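
For comparison, here is a small sketch of what an explicit copy looks like. The names alias, shallow and also_copy are invented for the illustration; copy.copy and list() are two of the standard ways to copy a list:

```python
import copy

obj = ["Surprise!"]
alias = obj                 # name binding: no copy is made
shallow = copy.copy(obj)    # an explicit (shallow) copy
also_copy = list(obj)       # calling list() also builds a new list

obj.append("More")
print(alias)      # ['Surprise!', 'More']
print(shallow)    # ['Surprise!']
print(also_copy)  # ['Surprise!']
```

Only the alias sees the later append, because only the alias is the same object as obj.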
Now let me get back to your original question:
> But when I test, I see some interesting things: first (and this is
> consistent with above) the class variables are created when the class is
> defined, and can be used even without any instances of the class being
> created.
Correct. Not only that, but class attributes will show up from instances
as well:
py> class Parrot:
...     colour = "green"
...     def description(self):
...         return "You see a %s coloured bird." % self.colour
...
py> polly = Parrot()
py> polly.description()
'You see a green coloured bird.'
> Second, initially confusing but maybe I understand... there are pointers to
> the class variables associated with every instance of the object,
Don't think about pointers. That's too specific. It just so happens that
the version of Python you are using *does* use pointers under the hood,
but that's not always the case. For instance, Jython is written in Java,
and IronPython is written in C# for the .NET CLR. Neither of those languages
has pointers in ordinary code, but they have *something* that will do the same job as a
pointer.
This is why we talk about references. The nature of the reference
remains the same no matter what version of Python you use, regardless of
how it works under the hood.
Putting that aside, you're actually mistaken about there being an
association between the instance and the class attribute. There is no
*direct* association, although of course there is an indirect one. What
actually happens is something rather like this:
Suppose we ask Python for "polly.colour". Python looks at the instance
polly, and checks to see if it has an instance attribute called "colour".
If it does, we're done. But if it doesn't, Python doesn't give up
straight away, it next checks the class of polly, which is Parrot. Does
Parrot have an attribute called "colour"? Yes it does, so that gets
returned.
The actual process is quite complicated, but to drastically
over-simplify, Python will check:
- the instance
- the class
- any super-classes of the class
and only raise an exception if none of these have an attribute of the
right name.
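
Here is a sketch of that simplified search order. The superclass Bird and the attributes name, can_fly and price are made up for the illustration; only Parrot and colour come from the example above:

```python
class Bird:
    can_fly = True          # lives on the superclass

class Parrot(Bird):
    colour = "green"        # lives on the class

polly = Parrot()
polly.name = "Polly"        # lives on the instance

print(polly.name)     # Polly  (found on the instance)
print(polly.colour)   # green  (found on the class)
print(polly.can_fly)  # True   (found on the superclass)

# Only the instance attribute is actually stored on polly.
print(vars(polly))    # {'name': 'Polly'}

# Nothing anywhere is called "price", so the lookup fails.
try:
    polly.price
except AttributeError:
    missing = True
else:
    missing = False
print(missing)        # True
```
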
> but if I
> assign THOSE variables new values, it creates new, "local"/instance
> variables.
When you ask for the polly.colour attribute, Python will search the
instance, the class, and any super-classes for a match. What happens
when you try to assign an attribute?
py> polly.colour = 'red'
py> polly.description()
'You see a red coloured bird.'
py> Parrot.colour
'green'
The assignment has created a new name-binding, creating the instance
attribute "colour" which is specific to that one instance, polly. The
class attribute remains untouched, as would any other instances (if we
had any). No copies are made.
So unlike *getting* an attribute, which searches both the instance
and the class, *setting* or *deleting* an attribute stops at the
instance.
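
A short sketch of both halves, setting and deleting, using the same Parrot class. The flag second_delete_failed is just a name I've made up to record what happens:

```python
class Parrot:
    colour = "green"

polly = Parrot()
polly.colour = "red"        # creates an instance attribute
print(vars(polly))          # {'colour': 'red'}

del polly.colour            # removes only the instance attribute
print(polly.colour)         # green  (the class attribute shows through again)

# Deleting again fails: there is nothing left on the instance,
# and del does not go on to search the class.
try:
    del polly.colour
except AttributeError:
    second_delete_failed = True
else:
    second_delete_failed = False
print(second_delete_failed)  # True
print(Parrot.colour)         # green
```
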
I like to describe this by saying that class attributes are *shared*. Unless
shadowed by an instance attribute of the same name, a class attribute is
seen by all instances and its content is shared by all. Instance
attributes, on the other hand, are distinct.
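
The sharing is easiest to see with a mutable class attribute. In this sketch, the attribute tricks and the second instance pete are invented for the illustration:

```python
class Parrot:
    tricks = []             # class attribute: one list, shared by all

polly = Parrot()
pete = Parrot()

# No assignment here, so no instance attribute is created.
# We look up the shared list and modify it in place.
polly.tricks.append("talk")
print(pete.tricks)          # ['talk']

# Assignment is different: it shadows the class attribute on polly only.
polly.tricks = ["dance"]
print(polly.tricks)         # ['dance']
print(pete.tricks)          # ['talk']
print(Parrot.tricks)        # ['talk']
```

This is the grain of truth in the statement you quoted: a change *to the object* held by a class attribute is seen by every instance, while an assignment through an instance is not.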
> So:
> Class.pi == 3.14 # defined/set in the class def
> instance.pi == 3.14 # initially
> instance.pi = 4 # oops, changed it
> Class.pi == 3.14 # still
> Class.pi = "rhubarb" # oops, there I go again
> instance.pi == 4 # still
>
> Sorry if I'm beating this to a pulp, I think I've got it... I'm just
> confused because the way they are described feels a little confusing, but
> maybe that's because I'm not taking into account how easy it is to create
> local variables...
Local variables are a whole different thing again. That is another reason why I
dislike the habit of calling these things "instance variables", borrowed
from languages like Java.
--
Steven