More than you ever wanted to know about objects [was: Is everything a refrence or isn't it]

Thu Jan 12 22:12:00 EST 2006

rurpy at yahoo.com wrote:
> "Fredrik Lundh" <fredrik at pythonware.com> wrote in message
[...]
>>
>>you really have trouble with abstract concepts, don't you?
> 
> 
> Usually not when they are explained well.
> 
You want to be careful - the bot will get you! Don't be rude to the bot!
> 
>>*what* the value is is defined by the operations that the object supports (via its
>>type).
> 
> 
> Well, that is already better than what is in the Lang Ref.
> But there must be more to it than that.  int(1) and int(2)
> have exactly the same operations, yes?  Yet their values
> are different.  (By "operations" you mean the set of methods
> an object has?)
> 
He does. But "the value of an object" is a bit like saying "the value of 
a car": a car is a complex object, with properties (number of doors, for 
example, which can differ from car to car) and operations (like "start 
engine", "accelerate", "brake" and so on).

> This interest is not just idle mental calesthenics.
> I wanted to write a function that would dump the contents
> of any object (value and attributes), and got rather confused
> about values, types, repr's, etc.
> 
Well it's an ambitious project, but not an impossible one. Python has 
features for "introspection", which means that programs can examine the 
structure of the data rather than only dealing with values of known 
structure.

> It would help if you or someone would answer these
> five questions (with something more than "yes" or "no" :-)
> 
Well I've been staying away from this thread, but I'll try. For 
pedagogic purposes I am omitting consideration of what are usually 
called "classic classes". You can talk about them once you understand 
the situation in more detail.

Note to nitpickers
------------------
Please note that I *am* oversimplifying here, and the nitpickers will 
undoubtedly find many threadsworth of valuable material here. The point 
is to develop an understanding of the *principles*. Once someone has 
that they can pick up the details as they need them, which 98% of Python 
users don't feel the need to. I don't even claim to know the whole 
story, but I *do* know enough to explain it in principle. So by all 
means correct me where I am demonstrably wrong, but *please* don't just 
add complexity that doesn't aid understanding.

So sit comfortably, and we'll learn about object-oriented systems.

> 1. Do all objects have values?

All objects have a namespace. Each item in the namespace (attribute) is 
identified by (guess what) a name, and refers to a value. The value is 
also an object (which is why people sometimes say "in Python everything 
is an object"). You might even argue that an object is nothing *but* a 
namespace with specific behaviour, though that's a somewhat extreme view.

Each object is an instance of some type, and "knows" which type it's an 
instance of. (A reference to) the type is usually stored in the 
instance's attribute named (for historical reasons) __class__. Once you 
get really smart you'll learn that you can even change the type of some 
objects (the ones whose types are defined in Python rather than being 
built into the interpreter) by changing the value of their __class__ 
attribute.
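
A minimal sketch of that, for illustration only. The class names Duck 
and Robot are made up, and the code is written in Python 3 syntax rather 
than the Python used for the sessions in this post:

```python
class Duck(object):
    def speak(self):
        return "Quack"

class Robot(object):
    def speak(self):
        return "Beep"

d = Duck()

# The instance "knows" its type: type(d) and d.__class__ agree.
assert type(d) is Duck
assert d.__class__ is Duck

before = d.speak()

# Rebinding __class__ changes the instance's type in place. This is
# only allowed between compatible classes defined in Python.
d.__class__ = Robot
after = d.speak()

print(before, after, type(d).__name__)
```

The same object answers differently after the assignment, because its 
behaviour is looked up through whatever __class__ currently refers to.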

The instance's type is *also* an object, and therefore has its own 
namespace. The instance (of the type) is created by calling the type, 
possibly with arguments that can be used to initialise the values of 
items in the instance's namespace. They can also be ignored, but there 
wouldn't be much point requiring them then, would there?
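
As a small sketch of "calling the type" (Point is a made-up class, and 
the code assumes Python 3 syntax):

```python
class Point(object):
    def __init__(self, x, y):
        # The arguments given in the call are used to initialise
        # items in the new instance's namespace.
        self.x = x
        self.y = y

p = Point(3, 4)        # calling the type creates an instance
print(p.x, p.y)
print(vars(p))         # the instance's own namespace, as a dict

n = int("42")          # built-in types are called in just the same way
print(type(n).__name__, n)
```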

> 2. What is the value of object()?

Well, object is a type. So calling it gives you an instance of type 
object. But of course you could have answered this question for yourself 
in the interactive interpreter:

 >>> object()
<object object at 0x0099C438>
 >>>

Pay attention here:

 >>> object()
<object object at 0x0099C438>

No, I didn't just copy and paste the same text. I'm using the CPython 
implementation, and the first time I created an instance of type object 
I didn't bind it to a name. CPython keeps a count of references to all 
objects, and when the reference count falls to zero it reclaims the 
space. That space was promptly re-used to create the second instance.

If I bind a name to a third object (which will *also* be created at the 
same address) the memory won't be reused:

 >>> a = object() # sets the ref count of the instance to 1
 >>> object()
<object object at 0x0099C448>
 >>>
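
Addresses vary from run to run, so here is a sketch that watches the 
reclamation directly instead. It relies on CPython's reference counting 
(other implementations may reclaim the object later), it uses Python 3 
syntax, and Thing is a made-up class that exists only because plain 
object() instances can't be weakly referenced:

```python
import weakref

class Thing(object):
    """Trivial subclass of object, used so we can take a weak reference."""

t = Thing()                    # one reference: the name t
watcher = weakref.ref(t)       # a weak reference doesn't add to the count
alive_before = watcher() is t

del t                          # the last reference goes away ...
alive_after = watcher() is not None   # ... and CPython reclaims the instance

print(alive_before, alive_after)
```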

> 3. If two objects are equal with "==", does that
>   mean their values are the same?

Almost universally, yes, although if you know enough about how the 
interpreter works "under the hood" you can define the response of 
instances of your own classes to the "==" operator (by defining their 
__eq__ method), and even define a class whose instances aren't equal to 
anything, even to themselves!
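
A sketch of both cases, assuming Python 3 syntax. Money and Aloof are 
made-up classes for illustration:

```python
class Money(object):
    def __init__(self, cents):
        self.cents = cents

    def __eq__(self, other):
        # Two Money instances are equal when they hold the same amount.
        return isinstance(other, Money) and self.cents == other.cents

    def __hash__(self):
        # Keep hashing consistent with equality.
        return hash(self.cents)

class Aloof(object):
    def __eq__(self, other):
        # Never equal to anything, not even to itself.
        return False

    def __ne__(self, other):
        return True

a = Money(100)
b = Money(100)
print(a == b, a is b)      # equal, but two distinct objects

x = Aloof()
print(x == x, x is x)      # one object, yet not "equal" to itself

nan = float("nan")         # a built-in value that behaves the same way
print(nan == nan)
```

So "==" asks the objects whether they consider themselves equal, while 
"is" asks whether two references lead to the very same object.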

> 4. Are object attributes part of an object's type
>   or it's value, or something else?  (I think the first.)

Well, this is where things become a little more complex.

Each instance of a type has some attributes defined in its own 
namespace ("instance attributes"), which are therefore unique to it. 
These attributes would normally be considered collectively as the 
"value" of an object if you really *must* have something that you can 
call the value. The attributes of an instance are accessible only by 
referencing the instance and then qualifying that reference - usually 
using the dot operator, but sometimes using functions like getattr().

You can, however, also use the instance to access the attributes of its 
*type*, and perhaps this is what is confusing you. Because these type 
attributes can be referenced via *any* instance of the type they are 
considered common to all instances of a given type. Again for historical 
reasons they are commonly called "class attributes".

The normal situation is that the instance attributes are used as the 
instance's value, and the class attributes are used as the methods of 
each instance of the type. This is why we say the type of an instance 
defines its behaviour.
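
To make the distinction concrete, here is a sketch in Python 3 syntax 
(Counter is a made-up class):

```python
class Counter(object):
    step = 1                    # class attribute, shared by all instances

    def __init__(self, start):
        self.count = start      # instance attribute, unique to each instance

    def bump(self):             # methods are class attributes too
        self.count += self.step

a = Counter(0)
b = Counter(10)
a.bump()

print(vars(a), vars(b))         # only "count" lives in each instance
print("step" in vars(a))        # not in the instance's own namespace
print("bump" in vars(Counter))  # the method lives in the type's namespace

Counter.step = 5                # a change on the type is seen by every instance
b.bump()
print(b.count)
```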

When you write

   foo = instance.method(arg1, arg2)

you are almost always calling a function defined inside the instance 
type's class definition (though this being Python there's absolutely 
nothing to stop instances having callable attributes of their own too: 
we are discussing "beginners' Python" here).

Under the hood the interpreter looks for an attribute called "method" in 
the instance. Failing to find it, it then looks in the instance's type.

Of course it can fail to find it there, too. If the type is defined as a 
specialisation of some other type (a "subclass" of the other type - 
"type2 is like type1 but with the following differences") then the 
interpreter will consult the other type, and so on and so on. I am 
deliberately ignoring multiple inheritance ("typeC is like typeB, and 
it's also like typeA") here, but the essentials are the same. We say the 
subclass "inherits" the attributes of the superclass.

If the interpreter keeps walking up the inheritance tree in this way 
without finding the attribute it's looking for, it will eventually 
arrive at the type "object", because ultimately every object in Python 
is defined as a subclass of (a subclass of (a subclass of ...)) object.
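
The search order can be sketched in Python itself. This is a 
simplification under stated assumptions: where_defined is a hypothetical 
helper (not anything the interpreter provides), it only handles 
instances that have their own __dict__, and it ignores refinements such 
as descriptors and __getattr__. Python 3 syntax:

```python
class Base(object):
    def greet(self):
        return "hello from Base"

class Derived(Base):
    pass

def where_defined(obj, name):
    """Hypothetical helper: report which namespace supplies an attribute."""
    if name in vars(obj):                 # first the instance itself
        return "instance"
    for klass in type(obj).__mro__:       # then the type and its superclasses
        if name in vars(klass):
            return klass.__name__
    return None                           # not found anywhere

d = Derived()
d.colour = "red"

print(where_defined(d, "colour"))
print(where_defined(d, "greet"))
print(where_defined(d, "__repr__"))
print(Derived.__mro__)                    # the chain always ends at object
```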

So, when you wrote above "... int(1) and int(2) have exactly the same 
operations, yes?  Yet their values  are different" you were correct: the 
operations are defined by their type ("int"), and are shared between all 
instances. Unfortunately you chose an immutable type - an int object's 
value cannot be changed, and so it isn't exposed to CPython's 
introspection features. It's used inside the type's methods, which are 
written in C and can therefore sneakily access bits of the objects that 
don't live in the Python-accessible namespace.
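
You can check this from Python (a sketch in Python 3 syntax, describing 
CPython's behaviour):

```python
n = 1

# An int instance has no namespace of its own ...
has_namespace = hasattr(n, "__dict__")
print(has_namespace)

# ... so you can't hang new attributes on it either.
try:
    n.label = "one"
    could_set = True
except AttributeError as exc:
    could_set = False
    print("can't set attribute:", exc)

# The operations live on the type, shared by every int.
print(type(1) is type(2))
print((1).__add__(2))
```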

> 5. The (only?) way to get an object's value is to
>   evaluate something (a name or a "reference"(*)
>   that refers to the object.
> 
Well, most times you don't really get "an object's value", since the 
value is collectively embedded within all of its attributes, and as 
we've just seen, instances of immutable types don't expose their values 
directly. But it's not *incorrect* to say that, just a bit fuzzy.

> (*) I did not go back through this whole thread
> but I know "reference" is controversial.  I am not
> trying to revive that debate.  I mean the word in
> a very generic sense.
> 
> 
>>*how* the value is represented inside the object is completely irrelevant;
> 
> ...snip...
> Yes, I agree and did not intend to imply otherwise.
> 
OK, so are you ready for some fun? The dir() builtin gives you access to 
the names defined in an object's namespace. Let's go hunting.

 >>> from pprint import pprint

This is just a convenience so we don't splurge across the page.

 >>> pprint(dir(object))
['__class__',
  '__delattr__',
  '__doc__',
  '__getattribute__',
  '__hash__',
  '__init__',
  '__new__',
  '__reduce__',
  '__reduce_ex__',
  '__repr__',
  '__setattr__',
  '__str__']
 >>>

These are attributes that by definition every object must have (since 
when told to look for them the interpreter will keep going if necessary 
until it finds them in object). We can find some information out about 
them by constructing a string referring to them and evaluating that 
string (it's easier than typing them by hand into the interactive 
interpreter).

 >>> for name in dir(object):
...   print name, ":", eval("object. %s" % name)
...
__class__ : <type 'type'>
__delattr__ : <slot wrapper '__delattr__' of 'object' objects>
__doc__ : The most base type
__getattribute__ : <slot wrapper '__getattribute__' of 'object' objects>
__hash__ : <slot wrapper '__hash__' of 'object' objects>
__init__ : <slot wrapper '__init__' of 'object' objects>
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <method '__reduce__' of 'object' objects>
__reduce_ex__ : <method '__reduce_ex__' of 'object' objects>
__repr__ : <slot wrapper '__repr__' of 'object' objects>
__setattr__ : <slot wrapper '__setattr__' of 'object' objects>
__str__ : <slot wrapper '__str__' of 'object' objects>
 >>>

This tells us that object's type is "type" (or as we'd say colloquially 
"object is a type", just as we'd say "dict is a type"). Most of the 
other attributes are methods or "slot wrappers". You can regard them as 
the same for our purposes, as the difference is essentially 
implementation detail. Now let's look at an *instance* of type object.

 >>> b = object()
 >>> for name in dir(b):
...   print name, ":", eval("b. %s" % name)
...
__class__ : <type 'object'>
__delattr__ : <method-wrapper object at 0x00AC3CF0>
__doc__ : The most base type
__getattribute__ : <method-wrapper object at 0x00AC3E90>
__hash__ : <method-wrapper object at 0x00AC3CF0>
__init__ : <method-wrapper object at 0x00AC3D10>
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <built-in method __reduce__ of object object at 0x0099C438>
__reduce_ex__ : <built-in method __reduce_ex__ of object object at 0x0099C438>
__repr__ : <method-wrapper object at 0x00AC3E90>
__setattr__ : <method-wrapper object at 0x00AC3D10>
__str__ : <method-wrapper object at 0x00AC3CF0>
 >>>

Here you can see that the instance is actually *wrapping* its type's 
methods. Again this is an implementation detail: the point of the 
wrapper is to make the C-implemented method look like a function defined 
in Python, I believe - I wouldn't claim exact certainty on that.

You can see that the __new__ method is actually directly inherited from 
the type (it has the same memory address), but all the other methods and 
slot wrappers of the type appear to have been wrapped.

This isn't coincidental: __new__ is the only method that doesn't *need* 
to be wrapped, because it's called to *create* the instance, and so it 
gets passed a reference to the *type*, not the instance.
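
A sketch of that difference between __new__ and __init__, in Python 3 
syntax. Traced and made_by are made-up names for illustration:

```python
class Traced(object):
    def __new__(cls, *args):
        # No instance exists yet: the first argument is the type itself.
        instance = super().__new__(cls)
        instance.made_by = cls
        return instance

    def __init__(self, label):
        # By now the instance exists, and arrives as "self".
        self.label = label

t = Traced("example")
print(t.made_by is Traced, t.label)
```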

Now let's define our own type. Because "classic classes" continue to 
exist, it's still necessary at present to explicitly state we want a 
type. The simplest way to do this is to subclass object (because 
otherwise we'd have to discuss metatypes, and you would end up 
swallowing your brain). I'm just going to define one method for this type.

 >>> class foo(object):
...   "Pretty simple object subclass"
...   def foo(self):
...     print "foo.foo:", self
...
 >>> for name in dir(foo):
...   print name, ":", eval("foo.%s" % name)
...
__class__ : <type 'type'>
__delattr__ : <slot wrapper '__delattr__' of 'object' objects>
__dict__ : {'__dict__': <attribute '__dict__' of 'foo' objects>,
 '__module__': '__main__', 'foo': <function foo at 0x00B2FCF0>,
 '__weakref__': <attribute '__weakref__' of 'foo' objects>,
 '__doc__': 'Pretty simple object subclass'}
__doc__ : Pretty simple object subclass
__getattribute__ : <slot wrapper '__getattribute__' of 'object' objects>
__hash__ : <slot wrapper '__hash__' of 'object' objects>
__init__ : <slot wrapper '__init__' of 'object' objects>
__module__ : __main__
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <method '__reduce__' of 'object' objects>
__reduce_ex__ : <method '__reduce_ex__' of 'object' objects>
__repr__ : <slot wrapper '__repr__' of 'object' objects>
__setattr__ : <slot wrapper '__setattr__' of 'object' objects>
__str__ : <slot wrapper '__str__' of 'object' objects>
__weakref__ : <attribute '__weakref__' of 'foo' objects>
foo : <unbound method foo.foo>
 >>>

You can see what foos inherit from the "object" superclass: almost 
everything; but the foo type also has a few of its own attributes: 
__dict__, __weakref__, __module__, __doc__ and foo (the method we just 
defined).

__dict__ is a dictionary with several keys in it: __dict__, 
__weakref__, __module__, __doc__ and foo. Oh, those are the class 
attributes that aren't inherited from the superclass!

Right, let's create a foo instance and see what that looks like.

 >>> f = foo()
 >>> for name in dir(f):
...   print name, ":", eval("f.%s" % name)
...
__class__ : <class '__main__.foo'>
__delattr__ : <method-wrapper object at 0x00B36A10>
__dict__ : {}
__doc__ : Pretty simple object subclass
__getattribute__ : <method-wrapper object at 0x00B36A10>
__hash__ : <method-wrapper object at 0x00B369B0>
__init__ : <method-wrapper object at 0x00B36950>
__module__ : __main__
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <built-in method __reduce__ of foo object at 0x00B36A30>
__reduce_ex__ : <built-in method __reduce_ex__ of foo object at 0x00B36A30>
__repr__ : <method-wrapper object at 0x00B36A50>
__setattr__ : <method-wrapper object at 0x00B36A10>
__str__ : <method-wrapper object at 0x00B369B0>
__weakref__ : None
foo : <bound method foo.foo of <__main__.foo object at 0x00B36A30>>
 >>>

Again you can see that the instance wraps the methods of its type (even 
the ones its type inherits from type "object"). The instance also has 
its own (empty) __dict__ dictionary, and a "bound method". The binding 
is actually created dynamically every time the method is referenced, and 
the point is to attach a reference to the instance to a reference to the 
method. This instance reference is passed as the value of the first 
argument (usually called "self") when the method is called.
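
Here is a sketch of what the binding holds, in Python 3 syntax. Greeter 
is a made-up class; __self__ and __func__ are the attribute names a 
bound method carries in current Python:

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

    def hello(self):
        return "hello from " + self.name

g = Greeter("g")

m1 = g.hello     # each reference to the method builds a binding ...
m2 = g.hello     # ... so this is a second, separate one

print(m1 is m2)                 # two distinct bound-method objects
print(m1 == m2)                 # but they bind the same function and instance
print(m1.__self__ is g)         # the instance reference
print(m1.__func__ is Greeter.__dict__["hello"])   # the function in the type

# The binding just supplies "self": these two calls are equivalent.
print(m1() == Greeter.hello(g))
```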

Now let's set an attribute of f and see what changes.

 >>> f.attribute = "three"
 >>> for name in dir(f):
...   print name, ":", eval("f.%s" % name)
...
__class__ : <class '__main__.foo'>
__delattr__ : <method-wrapper object at 0x00B36950>
__dict__ : {'attribute': 'three'}
__doc__ : Pretty simple object subclass
__getattribute__ : <method-wrapper object at 0x00B36950>
__hash__ : <method-wrapper object at 0x00B36A10>
__init__ : <method-wrapper object at 0x00B369B0>
__module__ : __main__
__new__ : <built-in method __new__ of type object at 0x1E1AE868>
__reduce__ : <built-in method __reduce__ of foo object at 0x00B36A30>
__reduce_ex__ : <built-in method __reduce_ex__ of foo object at 0x00B36A30>
__repr__ : <method-wrapper object at 0x00B36930>
__setattr__ : <method-wrapper object at 0x00B36950>
__str__ : <method-wrapper object at 0x00B36A10>
__weakref__ : None
attribute : three
foo : <bound method foo.foo of <__main__.foo object at 0x00B36A30>>
 >>>

Well, we can see that the new attribute is included in the result of the 
dir() function. But if you look carefully you'll also see that *it's 
been added to the instance's __dict__ dictionary*.
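
The builtin vars() gives a shorter way to watch that happen. A sketch in 
Python 3 syntax, using a made-up class Foo so as not to disturb the 
session above:

```python
class Foo(object):
    "Pretty simple object subclass"

f = Foo()
empty = dict(vars(f))            # snapshot: nothing in the instance yet

f.attribute = "three"
print(vars(f))                   # the new attribute is in the instance dict
print(vars(f) is f.__dict__)     # vars() hands back that very dictionary

f.__dict__["other"] = 4          # it works the other way round, too
print(f.other)

print("attribute" in vars(Foo))  # the type's namespace is untouched
```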

Read this over a few times and you should have a pretty good idea of 
what goes where. Now get on with writing that function!
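
If it helps, here is one possible starting point, offered as a rough 
sketch rather than a finished tool. The name dump and its output format 
are my own invention, it uses getattr() in place of the eval() trick 
above, it is written in Python 3 syntax, and it skips the double-
underscore names just to keep the output short:

```python
def dump(obj):
    """Hypothetical helper: describe an object's type and attributes."""
    lines = ["type: %s" % type(obj).__name__,
             "repr: %r" % (obj,)]
    # Instances of some built-in types (int, for one) have no __dict__.
    own = getattr(obj, "__dict__", {})
    for name in dir(obj):
        if name.startswith("__") and name.endswith("__"):
            continue                  # skip the special names
        try:
            value = getattr(obj, name)
        except AttributeError:
            continue                  # dir() listed it but it can't be fetched
        origin = "instance" if name in own else "type"
        lines.append("%s (%s): %r" % (name, origin, value))
    return "\n".join(lines)

class Foo(object):
    kind = "demo"

f = Foo()
f.attribute = "three"
report = dump(f)
print(report)
```

Each line says whether the attribute came from the instance's own 
namespace or was found via its type, which is the distinction this 
whole post has been about.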

regards
  Steve
-- 
Steve Holden       +44 150 684 7255  +1 800 494 3119
Holden Web LLC                     www.holdenweb.com
PyCon TX 2006                  www.python.org/pycon/