More than you ever wanted to know about objects [was: Is everything a refrence or isn't it]
Steve Holden
steve at holdenweb.com
Thu Jan 12 22:12:00 EST 2006
rurpy at yahoo.com wrote:
> "Fredrik Lundh" <fredrik at pythonware.com> wrote in message
[...]
>>
>>you really have trouble with abstract concepts, don't you?
>
>
> Usually not when they are explained well.
>
You want to be careful - the bot will get you! Don't be rude to the bot!
>
>>*what* the value is is defined by the operations that the object supports (via its
>>type).
>
>
> Well, that is already better than what is in the Lang Ref.
> But there must be more to it than that. int(1) and int(2)
> have exactly the same operations, yes? Yet their values
> are different. (By "operations" you mean the set of methods
> an object has?)
>
He does. But "the value of an object" is a bit like saying "the value of
a car": a car is a complex object, with properties (number of doors, for
example, which can differ from car to car) and operations (like "start
engine", "accelerate", "brake" and so on).
> This interest is not just idle mental calesthenics.
> I wanted to write a function that would dump the contents
> of any object (value and attributes), and got rather confused
> about values, types, repr's, etc.
>
Well it's an ambitious project, but not an impossible one. Python has
features for "introspection", which means that programs can examine the
structure of the data rather than only dealing with values of known
structure.
> It would help if you or someone would answer these
> five questions (with something more than "yes" or "no" :-)
>
Well I've been staying away from this thread, but I'll try. For
pedagogic purposes I am omitting consideration of what are usually
called "classic classes". You can talk about them once you understand
the situation in more detail.
Note to nitpickers
------------------
Please note that I *am* oversimplifying here, and the nitpickers will
undoubtedly find many threadsworth of valuable material here. The point
is to develop an understanding of the *principles*. Once someone has
that they can pick up the details as they need them, which 98% of Python
users don't feel the need to. I don't even claim to know the whole
story, but I *do* know enough to explain it in principle. So by all
means correct me where I am demonstrably wrong, but *please* don't just
add complexity that doesn't aid understanding.
So sit comfortably, and we'll learn about object-oriented systems.
> 1. Do all objects have values?
All objects have a namespace. Each item in the namespace (attribute) is
identified by (guess what) a name, and refers to a value. The value is
also an object (which is why people sometimes say "in Python everything
is an object"). You might even argue that an object is nothing *but* a
namespace with specific behaviour, though that's a somewhat extreme view.
Each object is an instance of some type, and "knows" which type it's an
instance of. (A reference to) the type is usually stored in the
instance's attribute named (for historical reasons) __class__. Once you
get really smart you'll learn that you can even change the type of some objects
(the ones whose types are defined in Python rather than being built into
the interpreter) by changing the value of their __class__ attribute.
The instance's type is *also* an object, and therefore has its own
namespace. The instance (of the type) is created by calling the type,
possibly with arguments that can be used to initialise the values of
items in the instance's namespace. They can also be ignored, but there
wouldn't be much point requiring them then, would there?
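As a rough illustration of the last three paragraphs, here is a sketch in which the class names Duck and Robot are invented purely for the example. It is written so that it should run on a current Python 3 interpreter. Calling the type creates the instance, the instance records its type in __class__, and because both types are defined in Python that attribute can be rebound:

```python
class Duck(object):
    def __init__(self, name):
        # The argument passed when calling the type initialises an item
        # in the new instance's namespace.
        self.name = name

    def speak(self):
        return "%s says quack" % self.name


class Robot(object):
    def speak(self):
        return "%s says beep" % self.name


d = Duck("Donald")            # calling the type creates an instance
original_type = d.__class__   # the instance "knows" its type
before = d.speak()

d.__class__ = Robot           # possible because both types are defined in Python
after = d.speak()             # same namespace, different behaviour

print(before)
print(after)
```

The instance keeps its own namespace (the name attribute) across the change; only the behaviour it picks up from its type is different.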
> 2. What is the value of object()?
Well, object is a type. So calling it gives you an instance of type
object. But of course you could have answered this question for yourself
in the interactive interpreter:
>>> object()
<object object at 0x0099C438>
>>>
Pay attention here:
>>> object()
<object object at 0x0099C438>
No, I didn't just copy and paste the same text. I'm using the CPython
implementation, and the first time I created an instance of type object
I didn't bind it to a name. CPython keeps a count of references to all
objects, and when the reference count falls to zero it reclaims the
space. This was promptly re-used to create the second instance.
If I bind a name to a third object (which will *also* be created at the
same address) the memory won't be reused:
>>> a = object() # sets the ref count of the instance to 1
>>> object()
<object object at 0x0099C448>
>>>
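The addresses above will differ from machine to machine, so they are hard to reproduce. A small sketch that makes the reference counting itself visible, assuming a CPython interpreter (sys.getrefcount is CPython-specific and the absolute numbers are an implementation detail, so only the difference is examined):

```python
import sys

a = object()
# getrefcount() sees one extra reference: the one held by its own argument.
count_one_name = sys.getrefcount(a)

b = a                         # bind a second name to the same object
count_two_names = sys.getrefcount(a)

# Only the change matters here, not the absolute values.
print(count_two_names - count_one_name)
```

While at least one name remains bound to the object its count cannot fall to zero, which is why its memory was not reused in the session above.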
> 3. If two objects are equal with "==", does that
> mean their values are the same?
Almost universally, yes, although if you know enough about how the
interpreter works "under the hood" you can define the response of
instances of your own classes to the "==" operator (by defining their
__eq__ method), and even define a class whose instances aren't equal to
anything, even to themselves!
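A minimal sketch of both cases, using two made-up classes (Card and Aloof are names invented for this example, not anything from the standard library):

```python
class Card(object):
    """Equality is defined in terms of the instance's rank attribute."""

    def __init__(self, rank):
        self.rank = rank

    def __eq__(self, other):
        return isinstance(other, Card) and self.rank == other.rank

    def __ne__(self, other):
        # Older Pythons do not derive != from ==, so spell it out.
        return not self.__eq__(other)


class Aloof(object):
    """Instances compare unequal to everything, including themselves."""

    def __eq__(self, other):
        return False


same_rank = Card(3) == Card(3)     # two distinct objects, equal "values"
other_rank = Card(3) == Card(4)

x = Aloof()
equal_to_itself = x == x           # __eq__ says no ...
identical_to_itself = x is x       # ... but identity is not negotiable

print(same_rank)
print(other_rank)
print(equal_to_itself)
print(identical_to_itself)
```

So "==" reports whatever the type's __eq__ method decides, while "is" always compares identity.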
> 4. Are object attributes part of an object's type
> or it's value, or something else? (I think the first.)
Well, this is where things become a little more complex.
Each instance of a type has some attributes defined in its own
namespace ("instance attributes"), which are therefore unique to it.
These attributes would normally be considered collectively as the
"value" of an object if you really *must* have something that you can
call the value. The attributes of an instance are accessible only by
referencing the instance and then qualifying that reference - usually
using the dot operator, but sometimes using functions like getattr().
You can, however, also use the instance to access the attributes of its
*type*, and perhaps this is what is confusing you. Because these type
attributes can be referenced via *any* instance of the type they are
considered common to all instances of a given type. Again for historical
reasons they are commonly called "class attributes".
The normal situation is that the instance attributes are used as the
instance's value, and the class attributes are used as the methods of
each instance of the type. This is why we say the type of an instance
defines its behaviour.
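To make the split concrete, here is a sketch borrowing the car analogy from earlier (the Car class and its attribute names are invented for the illustration). The vars() builtin returns the namespace dictionary of whatever you hand it:

```python
class Car(object):
    wheels = 4                       # class attribute, shared by all instances

    def __init__(self, doors):
        self.doors = doors           # instance attribute, unique to each car

    def describe(self):              # methods are class attributes too
        return "%d doors, %d wheels" % (self.doors, self.wheels)


mini = Car(2)
saloon = Car(4)

mini_namespace = vars(mini)          # only what belongs to this instance
class_names = sorted(n for n in vars(Car) if not n.startswith("__"))

print(mini_namespace)
print(class_names)
print(mini.describe())
print(saloon.describe())
```

The instance namespace holds only doors; wheels and describe live in the type, yet both are reachable through either instance.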
When you write
foo = instance.method(arg1, arg2)
you are almost always calling a function defined inside the instance
type's class definition (though this being Python there's absolutely
nothing to stop instances having callable attributes of their own too:
we are discussing "beginners' Python" here).
Under the hood the interpreter looks for an attribute called "method" in
the instance. Failing to find it, it then looks in the instance's type.
Of course it can fail to find it there, too. If the type is defined as a
specialisation of some other type (a "subclass" of the other type -
"type2 is like type1 but with the following differences") then the
interpreter will consult the other type, and so on and so on. I am
deliberately ignoring multiple inheritance ("typeC is like typeB, and
it's also like typeA") here, but the essentials are the same. We say the
subclass "inherits" the attributes of the superclass.
If the interpreter keeps walking up the inheritance tree in this way
without finding the attribute it's looking for, it will eventually
arrive at the type "object", because ultimately every object in Python
is defined as a subclass of (a subclass of (a subclass of ...)) object.
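The search just described can be imitated in a few lines. This is a deliberately simplified sketch: find_attribute, Type1 and Type2 are names made up for the example, and the real interpreter also deals with descriptors, __getattr__ and other refinements that are ignored here. The __mro__ attribute of a type lists the types consulted, in order:

```python
class Type1(object):
    def greet(self):
        return "hello from Type1"


class Type2(Type1):
    """Type2 is like Type1, with no differences (yet)."""


def find_attribute(instance, name):
    """Report where a simplified lookup finds the name.

    Looks in the instance's own namespace first, then in its type,
    then in each superclass in turn, ending at object.
    """
    own = getattr(instance, "__dict__", {})
    if name in own:
        return "instance", own[name]
    for klass in type(instance).__mro__:
        if name in vars(klass):
            return klass.__name__, vars(klass)[name]
    raise AttributeError(name)


t = Type2()
where_greet = find_attribute(t, "greet")[0]       # found in the superclass
where_repr = find_attribute(t, "__repr__")[0]     # found at the very top

t.greet = "an instance attribute"                 # now shadows the method
where_shadowed = find_attribute(t, "greet")[0]

print(where_greet)
print(where_repr)
print(where_shadowed)
```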
So, when you wrote above "... int(1) and int(2) have exactly the same
operations, yes? Yet their values are different" you were correct: the
operations are defined by their type ("int"), and are shared between all
instances. Unfortunately you chose an immutable type - an int object's
value cannot be changed, and so it isn't exposed to CPython's
introspection features. It's used inside the type's methods, which are
written in C and can therefore sneakily access bits of the objects that
don't live in the Python-accessible namespace.
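A quick way to see this for yourself, as a sketch (the attribute name colour is arbitrary): an int instance has no Python-visible namespace dictionary to inspect or extend, while the operations come from the one type that all ints share.

```python
n = 1

# There is no per-instance namespace dictionary to look into ...
has_namespace_dict = hasattr(n, "__dict__")

# ... and therefore nowhere to store a new instance attribute.
try:
    n.colour = "red"
    could_set_attribute = True
except AttributeError:
    could_set_attribute = False

# The operations are shared: both objects are instances of one type.
same_type = type(1) is type(2)

print(has_namespace_dict)
print(could_set_attribute)
print(same_type)
```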
> 5. The (only?) way to get an object's value is to
> evaluate something (a name or a "reference"(*)
> that refers to the object.
>
Well, most times you don't really get "an object's value", since the
value is collectively embedded within all of its attributes, and as
we've just seen, instances of immutable types don't expose their values
directly. But it's not *incorrect* to say that, just a bit fuzzy.
> (*) I did not go back through this whole thread
> but I know "reference" is controversial. I am not
> trying to revive that debate. I mean the word in
> a very generic sense.
>
>
>>*how* the value is represented inside the object is completely irrelevant;
>
> ...snip...
> Yes, I agree and did not intend to imply otherwise.
>
OK, so are you ready for some fun? The dir() builtin gives you access to
the names defined in an object's namespace. Let's go hunting.
>>> from pprint import pprint
This is just a convenience so we don't splurge across the page.
>>> pprint(dir(object))
['__class__',
'__delattr__',
'__doc__',
'__getattribute__',
'__hash__',
'__init__',
'__new__',
'__reduce__',
'__reduce_ex__',
'__repr__',
'__setattr__',
'__str__']
>>>
These are attributes that by definition every object must have (since
when told to look for them the interpreter will keep going if necessary
until it finds them in object). We can find some information out about
them by constructing a string referring to them and evaluating that
string (it's easier than typing them by hand into the interactive
interpreter).
>>> for name in dir(object):
...     print name, ":", eval("object. %s" % name)
...
__class__ : <type 'type'>
__delattr__ : <slot wrapper '__delattr__' of 'object' objects>
__doc__ : The most base type
__getattribute__ : <slot wrapper '__getattribute__' of 'object' objects>
__hash__ : <slot wrapper '__hash__' of 'object' objects>
__init__ : <slot wrapper '__init__' of 'object' objects>
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <method '__reduce__' of 'object' objects>
__reduce_ex__ : <method '__reduce_ex__' of 'object' objects>
__repr__ : <slot wrapper '__repr__' of 'object' objects>
__setattr__ : <slot wrapper '__setattr__' of 'object' objects>
__str__ : <slot wrapper '__str__' of 'object' objects>
>>>
This tells us that object's type is "type" (or as we'd say colloquially
"object is a type", just as we'd say "dict is a type"). Most of the
other attributes are methods or "slot wrappers". You can regard them as
the same for our purposes, as the difference is essentially
implementation detail. Now let's look at an *instance* of type object.
>>> b = object()
>>> for name in dir(b):
...     print name, ":", eval("b. %s" % name)
...
__class__ : <type 'object'>
__delattr__ : <method-wrapper object at 0x00AC3CF0>
__doc__ : The most base type
__getattribute__ : <method-wrapper object at 0x00AC3E90>
__hash__ : <method-wrapper object at 0x00AC3CF0>
__init__ : <method-wrapper object at 0x00AC3D10>
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <built-in method __reduce__ of object object at 0x0099C438>
__reduce_ex__ : <built-in method __reduce_ex__ of object object at 0x0099C438>
__repr__ : <method-wrapper object at 0x00AC3E90>
__setattr__ : <method-wrapper object at 0x00AC3D10>
__str__ : <method-wrapper object at 0x00AC3CF0>
>>>
Here you can see that the instance is actually *wrapping* its type's
methods. Again this is an implementation detail: the point of the
wrapper is to make the C-implemented method look like a function defined
in Python, I believe - I wouldn't claim exact certainty on that.
You can see that the __new__ method is actually directly inherited from
the type (it has the same memory address), but all the other methods and
slot wrappers of the type appear to have been wrapped.
This isn't coincidental: __new__ is the only method that doesn't *need*
to be wrapped, because it's called to *create* the instance, and so it
gets passed a reference to the *type*, not the instance.
Now let's define our own type. Because "classic classes" continue to
exist, it's still necessary at present to explicitly state we want a
type. The simplest way to do this is to subclass object (because
otherwise we'd have to discuss metatypes, and you would end up
swallowing your brain). I'm just going to define one method for this type.
>>> class foo(object):
...     "Pretty simple object subclass"
...     def foo(self):
...         print "foo.bar:", self
...
>>> for name in dir(foo):
...     print name, ":", eval("foo.%s" % name)
...
__class__ : <type 'type'>
__delattr__ : <slot wrapper '__delattr__' of 'object' objects>
__dict__ : {'__dict__': <attribute '__dict__' of 'foo' objects>,
'__module__': '__main__', 'foo': <function foo at 0x00B2FCF0>,
'__weakref__': <attribute '__weakref__' of 'foo' objects>,
'__doc__': 'Pretty simple object subclass'}
__doc__ : Pretty simple object subclass
__getattribute__ : <slot wrapper '__getattribute__' of 'object' objects>
__hash__ : <slot wrapper '__hash__' of 'object' objects>
__init__ : <slot wrapper '__init__' of 'object' objects>
__module__ : __main__
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <method '__reduce__' of 'object' objects>
__reduce_ex__ : <method '__reduce_ex__' of 'object' objects>
__repr__ : <slot wrapper '__repr__' of 'object' objects>
__setattr__ : <slot wrapper '__setattr__' of 'object' objects>
__str__ : <slot wrapper '__str__' of 'object' objects>
__weakref__ : <attribute '__weakref__' of 'foo' objects>
foo : <unbound method foo.foo>
>>>
You can see what foos inherit from the "object" superclass: almost
everything; but the foo type also has a few of its own attributes:
__dict__, __weakref__, __module__, __doc__ and foo (the method we just
defined).
__dict__ is a dictionary with several keys in it: __dict__,
__weakref__, __module__, __doc__ and foo. Oh, those are the class
attributes that aren't inherited from the superclass!
Right, let's create a foo instance and see what that looks like.
>>> f = foo()
>>> for name in dir(f):
...     print name, ":", eval("f.%s" % name)
...
__class__ : <class '__main__.foo'>
__delattr__ : <method-wrapper object at 0x00B36A10>
__dict__ : {}
__doc__ : Pretty simple object subclass
__getattribute__ : <method-wrapper object at 0x00B36A10>
__hash__ : <method-wrapper object at 0x00B369B0>
__init__ : <method-wrapper object at 0x00B36950>
__module__ : __main__
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <built-in method __reduce__ of foo object at 0x00B36A30>
__reduce_ex__ : <built-in method __reduce_ex__ of foo object at 0x00B36A30>
__repr__ : <method-wrapper object at 0x00B36A50>
__setattr__ : <method-wrapper object at 0x00B36A10>
__str__ : <method-wrapper object at 0x00B369B0>
__weakref__ : None
foo : <bound method foo.foo of <__main__.foo object at 0x00B36A30>>
>>>
Again you can see that the instance wraps the methods of its type (even
the ones its type inherits from type "object"). The instance also has
its own (empty) __dict__ dictionary, and a "bound method". The binding
is actually created dynamically every time the method is referenced, and
the point is to attach a reference to the instance to a reference to the
method. This instance reference is passed as the value of the first
argument (usually called "self") when the method is called.
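The dynamic binding can be observed directly. The sketch below uses a made-up class Foo with a method bar, and the attribute names __self__ and __func__ as spelled by current Python versions (older releases call them im_self and im_func):

```python
class Foo(object):
    def bar(self):
        return self


f = Foo()
m = f.bar                                 # referencing the method creates a binding

bound_to_f = m.__self__ is f              # the binding remembers the instance
wraps_class_function = m.__func__ is vars(Foo)["bar"]

# Each reference builds a fresh bound-method object ...
same_object = f.bar is f.bar
# ... although the two bindings compare equal.
equal_bindings = f.bar == f.bar

# Calling the binding passes the instance as the first argument ("self").
returned_self = m() is f

print(bound_to_f)
print(wraps_class_function)
print(same_object)
print(equal_bindings)
print(returned_self)
```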
Now let's set an attribute of f and see what changes.
>>> f.attribute = "three"
>>> for name in dir(f):
...     print name, ":", eval("f.%s" % name)
...
__class__ : <class '__main__.foo'>
__delattr__ : <method-wrapper object at 0x00B36950>
__dict__ : {'attribute': 'three'}
__doc__ : Pretty simple object subclass
__getattribute__ : <method-wrapper object at 0x00B36950>
__hash__ : <method-wrapper object at 0x00B36A10>
__init__ : <method-wrapper object at 0x00B369B0>
__module__ : __main__
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <built-in method __reduce__ of foo object at 0x00B36A30>
__reduce_ex__ : <built-in method __reduce_ex__ of foo object at 0x00B36A30>
__repr__ : <method-wrapper object at 0x00B36930>
__setattr__ : <method-wrapper object at 0x00B36950>
__str__ : <method-wrapper object at 0x00B36A10>
__weakref__ : None
attribute : three
foo : <bound method foo.foo of <__main__.foo object at 0x00B36A30>>
>>>
Well, we can see that the new attribute is included in the result of the
dir() function. But if you look carefully you'll also see that *it's
been added to the instance's __dict__ dictionary*.
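As a possible starting point for the dump function that started all this, here is a sketch only: the name dump and the own/inherited split are choices made for this example, and getattr() stands in for the eval() trick used in the sessions above. It separates what is stored in the instance's own __dict__ from what dir() finds via the type:

```python
def dump(obj):
    """Return two dictionaries: (own attributes, attributes found via the type)."""
    # Objects such as ints have no __dict__, hence the default.
    own = dict(getattr(obj, "__dict__", {}))
    inherited = {}
    for name in dir(obj):
        if name in own:
            continue
        try:
            inherited[name] = getattr(obj, name)
        except AttributeError:
            # dir() can list names that cannot actually be fetched.
            pass
    return own, inherited


class Thing(object):
    "Pretty simple object subclass"

    def method(self):
        return "called"


t = Thing()
t.attribute = "three"

own, inherited = dump(t)
print(own)
print(sorted(name for name in inherited if not name.startswith("__")))
```

Applied to an int, the first dictionary comes back empty, which matches the earlier point about immutable built-in types not exposing their value as attributes.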
Read this over a few times and you should have a pretty good idea of
what goes where. Now get on with writing that function!
regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC www.holdenweb.com
PyCon TX 2006 www.python.org/pycon/