[Types-sig] Why have two hierarchies? (EUREKA!)

Sun, 06 Dec 1998 21:35:14 +0100

At 23:38 98/12/04 +0100, Just van Rossum wrote:
[...]
>A proposal for a flat object hierarchy (a.k.a. "The Earth is Flat")
[...]

That's very much along the lines that I had in mind.

I have some observations:

1) Attribute retrieval can be defined much more cleanly. You currently use
different rules to search the bases than you use to search the object
itself. If a base object is an object, why not use the standard search
system?

Proposal:

Getting an attribute is defined recursively. There are many ways to do
this, and I don't know if the one I've chosen is the best one. (No doubt
this needs more study.) I have split the functionality of getting an
attribute of the object itself, and getting an attribute for one of your
derived objects. This just seems a clean way to do things. I can imagine
that you might want to change the get_attribute behaviour of one object
without changing it for all derived objects too. (I've used my own name to
avoid clashes with existing ideas.)

def ntf_get_attribute( obj, name ):
    '''This is the standard get-an-attribute function.'''
    # Let a __get_attribute__ hook, if there is one, take over the lookup.
    getfunc = ntf_get_attribute_raw( obj, '__get_attribute__' )
    if getfunc is not None:
        return getfunc( name )
    else:
        return ntf_get_attribute_raw( obj, name )

def ntf_get_attribute_raw( obj, name ):
    '''Search the local namespace, then each base in turn.'''
    if name in obj.__namespace__:
        return bound_attribute( obj, obj.__namespace__[name] )
    for i in range( len( obj.__bases__ ) ):
        base = obj.__bases__[i]
        base_get = ntf_get_attribute( base, '__get_derived_attribute__' )
        if base_get is not None:
            # The base decides for itself, and is told which bases follow it.
            rebind, attr = base_get( name, obj.__bases__[i+1:] )
            if attr is not None:
                if rebind:
                    attr = rebind_attribute( obj, attr )
                return attr
        else:
            attr = ntf_get_attribute( base, name )
            if attr is not None:
                return bound_attribute( obj, attr )
    return None

The first function provides the __get_attribute__ hook, which can be
provided either by the object itself, or by one of its bases. 

The second function does the actual searching. (It should probably be
available to any __get_attribute__ function to let it do its work.) It
searches first in the local namespace. If it finds an entry for the desired
name, it returns a bound attribute. If there is no local entry, the bases
are searched in turn. The exact details are not that important, but note
that each of the bases is only queried using the standard attribute access
mechanism. This ensures that the bases behave consistently, and that
attribute access doesn't change when you do it as a base class.
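
To make the search order concrete, here is a minimal runnable sketch of the
lookup without the two hooks. The names FlatObject, Bound, bind and
get_attribute are made up for this illustration and are not part of the
proposal; they only stand in for __namespace__, __bases__ and
bound_attribute.

```python
class FlatObject:
    """Hypothetical stand-in for an object in the flat hierarchy."""

    def __init__(self, label, bases=(), **namespace):
        self.label = label
        self.bases = list(bases)
        self.namespace = dict(namespace)


class Bound:
    """An attribute value paired with the object it is bound to."""

    def __init__(self, owner, value):
        self.owner = owner
        self.value = value


def bind(obj, attr):
    # Binding an already bound attribute keeps the value, changes the owner.
    value = attr.value if isinstance(attr, Bound) else attr
    return Bound(obj, value)


def get_attribute(obj, name):
    """Search the local namespace first, then each base in order."""
    if name in obj.namespace:
        return bind(obj, obj.namespace[name])
    for base in obj.bases:
        attr = get_attribute(base, name)  # same rule for every base
        if attr is not None:
            return bind(obj, attr)
    return None


y = FlatObject("Y", greeting="hello")
z = FlatObject("Z", greeting="hi", colour="red")
x = FlatObject("X", bases=[y, z], size=3)

found = get_attribute(x, "greeting")  # found in Y, bound to X
```

Because every base is queried through get_attribute itself, Y answers the
same way whether it is asked directly or as a base of X.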

As an extra refinement, the '__get_derived_attribute__' method (if present)
gets the rest of the 'bases' list. This allows a base in the list to
control how the other bases are used to create the object behaviour. We can
now create an object which I shall call 'Class' such that:

	class X( Y, Z ): [...]
is equivalent to
	object X( Class, Y, Z ): [...]

and the Class base can take care of the way in which the attributes of
object X  depend on the attributes of objects Y and Z.

I've played around a bit with the return value of __get_derived_attribute__.
It returns an extra boolean that specifies whether the attribute should be
re-bound to the derived object. This supports the run-time flexible version
of static class methods.
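
A rough sketch of how such a 'Class' base might use the hook and the rebind
flag follows. Everything here is an assumption made for illustration:
FlatObject, Bound, class_hook and the STATIC set are invented names, and the
hook is looked up only in the base's own namespace to keep the example short.

```python
class FlatObject:
    """Hypothetical stand-in for an object in the flat hierarchy."""

    def __init__(self, label, bases=(), **namespace):
        self.label = label
        self.bases = list(bases)
        self.namespace = dict(namespace)


class Bound:
    """An attribute value paired with the object it is bound to."""

    def __init__(self, owner, value):
        self.owner = owner
        self.value = value


def get_attribute(obj, name):
    if name in obj.namespace:
        return Bound(obj, obj.namespace[name])
    for i, base in enumerate(obj.bases):
        hook = base.namespace.get("__get_derived_attribute__")
        if hook is not None:
            # The base is handed the bases that follow it in the list.
            should_rebind, attr = hook(name, obj.bases[i + 1:])
            if attr is not None:
                return Bound(obj, attr.value) if should_rebind else attr
        else:
            attr = get_attribute(base, name)
            if attr is not None:
                return Bound(obj, attr.value)
    return None


# Hypothetical: names that stay bound to the object that defines them.
STATIC = {"version"}


def class_hook(name, rest_of_bases):
    """Search the remaining bases and say whether to re-bind the result."""
    for base in rest_of_bases:
        attr = get_attribute(base, name)
        if attr is not None:
            return (name not in STATIC), attr
    return False, None


Class = FlatObject("Class", __get_derived_attribute__=class_hook)
y = FlatObject("Y", version="1.0", greet="hello")
x = FlatObject("X", bases=[Class, y])

greet = get_attribute(x, "greet")      # re-bound to X
version = get_attribute(x, "version")  # left bound to Y
```

The 'static' attribute comes back still bound to Y, while the ordinary one
is re-bound to the derived object X.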

As specified this access rule is rather inefficient. It will probably
require some form of optimisation, which I will discuss later.

2) Why the named/unnamed distinction?
In your proposal you distinguish named and unnamed objects; one returns
bound methods and one unbound ones. Why not return bound methods all the
time? If you really want to, you can always re-bind it, or delete the
binding. You should certainly not use the named/unnamed distinction for
this. If you really need it, you should have a bound/unbound flag
somewhere. To get the full flexibility you would want to specify the
binding of each attribute separately, not for all the attributes in an
object in one go.

If you always return bound attributes, we might be able to get rid of the
entire idea of global things. After all, global items in a program are
basically bound to the program-invocation-object. (In this view, you don't
start a program, but you instantiate an object derived from the program
object. Thus, there can be several program invocations running
simultaneously.)

3) It is probably a good idea to put any named base objects in the local
namespace by default. The X.__init__ function wants to be able to call
self.Y.__init__, and thus self.Y must resolve to the right base object. We
can no doubt think of a couple of alternative ways of solving this, but
this seems the easiest.
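
As a small illustration, assuming a made-up FlatObject type whose 'label'
plays the role of the object's name, the default could look like this:

```python
class FlatObject:
    """Hypothetical stand-in for an object in the flat hierarchy."""

    def __init__(self, label, bases=(), **namespace):
        self.label = label
        self.bases = list(bases)
        self.namespace = dict(namespace)
        for base in self.bases:
            # A name the object defines itself wins over the base's name.
            self.namespace.setdefault(base.label, base)


def y_init(target):
    target.namespace["y_ready"] = True


def x_init(self):
    # Plays the role of self.Y.__init__ in the text.
    self.namespace["Y"].namespace["__init__"](self)
    self.namespace["x_ready"] = True


y = FlatObject("Y", __init__=y_init)
x = FlatObject("X", bases=[y], __init__=x_init)
x.namespace["__init__"](x)
```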

4) I have not made a distinction between methods and other data elements.
All attributes that are returned are bound to the object from which they
came. Is this how Python works at the moment?

I think the ideas above support all the wishes that I have seen on this
mailing list (at least, those that I could understand). Delegation should
be straightforward. The biggest problem is that this system is more than
enough rope for anyone to hang herself.

Compatibility:
I'm mainly thinking about Python 2, but by choosing different special names
I think it can be made fully compatible with the existing Python 1.5.

Efficiency:
A straightforward implementation would probably be inefficient; some form
of optimisation is needed.

For the overwhelming majority of all programs, base objects don't change
their own behaviour during the lifetime of an object. 

A simple caching scheme is the following: Suppose we have object X, with
bases A, B, and C. If an attribute of X is retrieved from base B, the
resulting attribute is cached in X. The base object B is informed of this
fact, and keeps a link to X. Whenever that (or any) attribute of B is
changed, object X is informed and the cached reference removed. As the same
scheme is applied to all levels of the hierarchy, this will work without
problems.
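
The caching scheme can be sketched roughly as follows. CachingObject,
MISSING, get, set and invalidate are invented names, and binding is left out
to keep it short. One detail goes beyond the description above: the sketch
registers X with every base it consulted, not only with B, so that a later
assignment on an earlier base such as A is noticed too.

```python
MISSING = object()  # sentinel for "no such attribute"


class CachingObject:
    """Hypothetical object that caches attributes found in its bases."""

    def __init__(self, label, bases=(), **namespace):
        self.label = label
        self.bases = list(bases)
        self.namespace = dict(namespace)
        self.cache = {}           # name -> value found in some base
        self.dependents = set()   # objects that cached something from us

    def get(self, name):
        if name in self.namespace:
            return self.namespace[name]
        if name in self.cache:
            return self.cache[name]
        consulted = []
        for base in self.bases:
            consulted.append(base)
            value = base.get(name)
            if value is not MISSING:
                self.cache[name] = value
                for seen in consulted:
                    seen.dependents.add(self)  # ask to be told of changes
                return value
        return MISSING

    def set(self, name, value):
        self.namespace[name] = value
        self.invalidate()

    def invalidate(self):
        # Drop our own cache, then inform everything that depends on us.
        self.cache.clear()
        dependents, self.dependents = self.dependents, set()
        for dependent in dependents:
            dependent.invalidate()


a = CachingObject("A")
b = CachingObject("B", colour="red")
c = CachingObject("C")
x = CachingObject("X", bases=[a, b, c])

first = x.get("colour")   # found in B and cached in X
b.set("colour", "blue")   # B informs X, which drops its cache
second = x.get("colour")
```

The links here are ordinary strong references; a real implementation would
probably want weak references so that a base does not keep its derived
objects alive.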

No doubt we can all think of many other ways to improve the efficiency.
Global program analysis and optimisation is one technique that is extremely
powerful, but I'll leave that to another mail.

Niels

============================================================
Niels Ferguson, vorpal@xs4all.nl, voice/fax: +31 20 463 0977
PGP: 50E0 CBE2 3F19 C17D  FBE3 FFA5 38B7 BBB5
