AW: [Python-Dev] Constructor bug
Nick Coghlan
ncoghlan at iinet.net.au
Mon Nov 22 08:43:31 EST 2004
(cc'ing python-list so this explanation gets snagged by the python.org archives
and the Google-bot. It took me too long to write to leave it languishing in email ;)
(leading underscores in the examples are to persuade mailreaders to leave the
interactive prompts alone)
Oliver Walczak wrote:
> In my opinion, initialization outside of __init__ without self. and inside
> __init__ with self. should work absolutely identical.
They don't, though, because they actually say very different things. There are
two aspects of Python interacting here that explain the various behaviours you
are seeing:
1. Class attributes vs instance attributes
2. Immutable vs mutable objects.
Getting a good grasp on these two aspects can really help with mastering
Python's classes.
Consider your test class:
_>>> class test:
_...     d1 = {'k':'v'}  #1
_...     i1 = 1  #2
_...     def __init__(self):
_...         self.d2 = {'k':'v'}  #3
_...         self.i2 = 1  #4
Here we have:
#1. Mutable class attribute (d1)
#2. Immutable class attribute (i1)
#3. Mutable instance attribute (d2)
#4. Immutable instance attribute (i2)
The class attributes can be accessed without creating an instance first:
_>>> test.d1
_{'k': 'v'}
_>>> test.i1
_1
But the instance attributes trigger exceptions:
_>>> test.d2
_Traceback (most recent call last):
_ File "<stdin>", line 1, in ?
_AttributeError: class test has no attribute 'd2'
_>>> test.i2
_Traceback (most recent call last):
_ File "<stdin>", line 1, in ?
_AttributeError: class test has no attribute 'i2'
If we create an instance, though, we can access all 4 attributes. When the
lookup of the class attributes in the instance dictionary fails, Python tries
the class dictionary and succeeds:
_>>> c1 = test()
_>>> c1.d1
_{'k': 'v'}
_>>> c1.i1
_1
_>>> c1.d2
_{'k': 'v'}
_>>> c1.i2
_1
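The fallback from the instance dictionary to the class dictionary can be made visible by looking at the dictionaries themselves. This is only a minimal sketch, written for Python 3 (an assumption; the sessions in this mail come from an older interpreter), and the name Test is just a stand-in for the test class above:

```python
class Test:
    d1 = {'k': 'v'}   # class attribute, stored in the class dictionary
    i1 = 1            # class attribute, stored in the class dictionary

    def __init__(self):
        self.d2 = {'k': 'v'}  # instance attribute, stored on the instance
        self.i2 = 1           # instance attribute, stored on the instance


c1 = Test()

# The instance dictionary only holds what __init__ assigned through self.
instance_names = sorted(vars(c1))
print(instance_names)                          # ['d2', 'i2']

# The class attributes live in the class dictionary instead.
print('d1' in vars(Test), 'i1' in vars(Test))  # True True

# c1.d1 is not in the instance dictionary, so lookup falls back to the class.
print(c1.d1 is Test.d1)                        # True
```

vars(obj) returns the object's attribute dictionary, so it is a convenient way to check where a given name actually lives.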
The question of mutable vs immutable comes into play when we start changing
things. First, let's create c2:
_>>> c2 = test()
_>>> c2.d1
_{'k': 'v'}
_>>> c2.i1
_1
_>>> c2.d2
_{'k': 'v'}
_>>> c2.i2
_1
Looks identical to c1, right? Not quite:
_>>> c1.d1 is c2.d1 # Mutable class attribute
_True
_>>> c1.i1 is c2.i1 # Immutable class attribute
_True
_>>> c1.d2 is c2.d2 # Mutable instance attribute
_False
_>>> c1.i2 is c2.i2 # Immutable instance attribute
_True
Take note of the use of 'is' instead of '=='. '==' checks if two objects have the
same *value* (e.g. "[] == []" is True). 'is' checks if they are the same
*object* (e.g. "[] is []" is False since they are two different empty lists, but
"a = []; a is a" is True, since a is obviously itself). We'll explore the
meaning of each of the above results as we go along.
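As a small illustration of the difference between the two operators, here is a sketch using plain lists (Python 3 syntax assumed; the names a, b and c are only for illustration):

```python
a = []
b = []      # a second, separate empty list
c = a       # another name for the first list

print(a == b)   # True: same value
print(a is b)   # False: two distinct objects
print(a is c)   # True: one object, two names

# Mutating through one name is visible through every name for that object.
c.append(1)
print(a)        # [1]
print(b)        # []
```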
For the class attributes, d1 and i1 each refer to a single shared object. That
is, "c1.d1" and "c2.d1" (and "test.d1" for that matter) are just different names
for the same dictionary living in the test class. Any changes made through any
of the references are going to be visible using the other references:
_>>> del test.d1['k']
_>>> test.d1
_{}
_>>> c1.d1
_{}
_>>> c2.d1
_{}
_>>> c1.d1.update({'k': 'v2'})
_>>> test.d1
_{'k': 'v2'}
_>>> c1.d1
_{'k': 'v2'}
_>>> c2.d1
_{'k': 'v2'}
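It may also help to see that mutating the shared dictionary through an instance does not add anything to that instance. A rough sketch, again assuming Python 3 and using Test as a stand-in name:

```python
class Test:
    d1 = {'k': 'v'}   # one dictionary, owned by the class


c1 = Test()
c2 = Test()

# Mutating through an instance changes the single shared dictionary...
c1.d1['k'] = 'v2'
print(Test.d1, c2.d1)              # {'k': 'v2'} {'k': 'v2'}

# ...and does not create anything in the instance dictionary.
print(vars(c1))                    # {}
print(c1.d1 is c2.d1 is Test.d1)   # True
```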
But how do we change i1? We *can't*. Integers are immutable - that means we
cannot change them. The only thing we can do is make the reference to them point
to something different. When we alter an immutable class attribute, it matters
how we do it.
First, let's alter it directly in the class dictionary:
_>>> test.i1 = 3
_>>> test.i1
_3
_>>> c1.i1
_3
_>>> c2.i1
_3
That seemed to work the same as with d1. But let's try using an instance:
_>>> c1.i1 = 5
_>>> test.i1
_3
_>>> c1.i1
_5
_>>> c2.i1
_3
_>>> c1.i1 is c2.i1
_False
_>>> c1.i1 is test.i1
_False
_>>> c2.i1 is test.i1
_True
So what's going on here? When you try to set a class attribute using an instance,
the class attribute is not changed. Instead, the particular instance gets its
own instance variable that hides the version in the class. We can get the class
version back by deleting the added instance variable:
_>>> del c1.i1
_>>> c1.i1
_3
_>>> c1.i1 is test.i1
_True
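The hiding and un-hiding can be watched directly in the instance dictionaries. A minimal sketch, assuming Python 3 (Test and the variable shadowed are illustrative names only):

```python
class Test:
    i1 = 3


c1 = Test()
c2 = Test()

c1.i1 = 5                      # creates an instance attribute on c1 only
shadowed = dict(vars(c1))      # snapshot of c1's instance dictionary
print(shadowed)                # {'i1': 5}
print(vars(c2))                # {}
print(Test.i1, c1.i1, c2.i1)   # 3 5 3

del c1.i1                      # removes the instance attribute, not the class one
print(vars(c1))                # {}
print(c1.i1)                   # 3
```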
This actually happens for mutable class attributes too, if we reassign them
instead of mutating them:
_>>> c2.d1 = {}
_>>> c2.d1
_{}
_>>> test.d1
_{'k': 'v2'}
_>>> c2.d1 is test.d1
_False
_>>> del c2.d1
_>>> c2.d1
_{'k': 'v2'}
_>>> c2.d1 is test.d1
_True
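One place where the same rule tends to surprise people is augmented assignment on an immutable class attribute. The sketch below is my own illustration rather than part of the original question; it assumes Python 3, and the Counter class is hypothetical:

```python
class Counter:
    count = 0            # class attribute shared by all instances


a = Counter()
b = Counter()

# "a.count += 1" reads Counter.count (0), then assigns 1 to a.count,
# which creates an instance attribute on a and leaves the class alone.
a.count += 1

print(a.count)           # 1
print(b.count)           # 0
print(Counter.count)     # 0
print(vars(a))           # {'count': 1}
```

To update the shared value, assign through the class instead (Counter.count += 1).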
Hopefully the expected behaviour of the mutable instance attributes is now
clear. These are separate references to separate objects. Changes to either side
have no effect on the other:
_>>> c1.d2 is c2.d2
_False
_>>> c1.d2
_{'k': 'v'}
_>>> c2.d2
_{'k': 'v'}
_>>> del c1.d2['k']
_>>> c1.d2
_{}
_>>> c2.d2
_{'k': 'v'}
_>>> c2.d2 = 0
_>>> c1.d2
_{}
_>>> c2.d2
_0
So why do the immutable instance variables appear to be the same object?
_>>> c1.i2 is c2.i2
_True
_>>> c1.i2
_1
_>>> c2.i2
_1
This is possible because they are *immutable*. Implementations of immutable
objects often go by the rule "Don't create a new object if an old one will do".
In this case, the integer implementation in the Python.org interpreter (CPython)
follows that rule for small integers - it keeps them around, and only has one
copy of each value. Since c1.i2 and c2.i2 both refer to the value 1, this
implementation means they are actually referring to the same object.
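If you want to poke at this yourself, something like the following sketch will do (Python 3 assumed). Identity of equal integers is an implementation detail, so only the '==' results are given as expected output; the 'is' results may vary between interpreters and versions:

```python
# Build the integers at run time so the compiler cannot merge equal literals.
small_a = int("1")
small_b = int("1")
big_a = int("1000000")
big_b = int("1000000")

# Equality is guaranteed by the language.
print(small_a == small_b, big_a == big_b)   # True True

# Identity depends on the implementation, so no expected output is shown here.
print("small ints share an object:", small_a is small_b)
print("large ints share an object:", big_a is big_b)
```

The practical upshot is to compare numbers with '==' and keep 'is' for questions about object identity.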
We can, however, still alter the value each instance refers to without affecting
the other instance:
_>>> c1.i2 = 3
_>>> c1.i2
_3
_>>> c2.i2
_1
_>>> c1.i2 is c2.i2
_False
You made it this far? That's a pretty good effort in itself. You can find the
official word on Python classes in Section 9 of the Python Tutorial
(http://www.python.org/doc/2.3.4/tut/node11.html).
Rest assured that the current behaviour really is the way the developers intend
Python to work. It can be a little confusing initially, but is rather useful
once you get the hang of it.
Cheers,
Nick.
--
Nick Coghlan | Brisbane, Australia
Email: ncoghlan at email.com | Mobile: +61 409 573 268